date:20130723

linux-next: manual merge of the akpm-current tree with the staging tree

2013-07-23 Thread Stephen Rothwell

Hi Andrew,

Today's linux-next merge of the akpm-current tree got a conflict in
drivers/staging/lustre/lustre/ldlm/ldlm_pool.c between commit
91a50030f05e ("staging/lustre/ldlm: split client namespaces into active
and inactive") from the staging tree and commit 48a91248649f
("staging/lustre/ldlm: convert to shrinkers to count/scan API") from the
akpm-current tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).
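
For readers unfamiliar with the count/scan split named in the second commit, here is a minimal userspace sketch of the idea. It is not the lustre code: `toy_cache`, `toy_count_objects`, `toy_scan_objects` and `TOY_SHRINK_STOP` are hypothetical names made up for illustration. The count callback only reports how much could be freed, and the scan callback does the freeing or returns a stop sentinel.

```c
/* Sentinel a scan callback returns when it cannot make progress.
 * Named after the kernel's SHRINK_STOP, but defined locally for this sketch. */
#define TOY_SHRINK_STOP (~0UL)

/* Hypothetical cache with a simple freeable-object counter. */
struct toy_cache {
	unsigned long nr_freeable; /* objects that could be released */
	int may_reclaim;           /* 0 models a context where reclaim is unsafe */
};

/* count: report how many objects could be freed; must not free anything. */
static unsigned long toy_count_objects(const struct toy_cache *c)
{
	return c->may_reclaim ? c->nr_freeable : 0;
}

/* scan: free up to nr_to_scan objects and return how many were freed. */
static unsigned long toy_scan_objects(struct toy_cache *c,
				      unsigned long nr_to_scan)
{
	unsigned long freed;

	if (!c->may_reclaim)
		return TOY_SHRINK_STOP;

	freed = nr_to_scan < c->nr_freeable ? nr_to_scan : c->nr_freeable;
	c->nr_freeable -= freed;
	return freed;
}
```

In the merged diff below, ldlm_pools_count() and ldlm_pools_scan() play these two roles, with the __GFP_FS check standing where `may_reclaim` stands in the sketch.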

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au

diff --cc drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
index 101af4b,4c41e02..000
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
@@@ -597,16 -594,17 +593,17 @@@ int ldlm_pool_recalc(struct ldlm_pool *
count = pl->pl_ops->po_recalc(pl);
lprocfs_counter_add(pl->pl_stats, LDLM_POOL_RECALC_STAT,
count);
 -  return count;
}
 +  recalc_interval_sec = pl->pl_recalc_time - cfs_time_current_sec() +
 +pl->pl_recalc_period;
  
 -  return 0;
 +  return recalc_interval_sec;
  }
 -EXPORT_SYMBOL(ldlm_pool_recalc);
  
- /**
+ /*
   * Pool shrink wrapper. Will call either client or server pool recalc callback
-  * depending what pool \a pl is used.
+  * depending what pool pl is used. When nr == 0, just return the number of
+  * freeable locks. Otherwise, return the number of canceled locks.
   */
  int ldlm_pool_shrink(struct ldlm_pool *pl, int nr,
 unsigned int gfp_mask)
@@@ -1028,26 -1025,20 +1025,21 @@@ static struct ptlrpc_thread *ldlm_pools
  static struct completion ldlm_pools_comp;
  
  /*
-  * Cancel \a nr locks from all namespaces (if possible). Returns number of
-  * cached locks after shrink is finished. All namespaces are asked to
-  * cancel approximately equal amount of locks to keep balancing.
+  * count locks from all namespaces (if possible). Returns number of
+  * cached locks.
   */
- static int ldlm_pools_shrink(ldlm_side_t client, int nr,
-unsigned int gfp_mask)
+ static unsigned long ldlm_pools_count(ldlm_side_t client, unsigned int gfp_mask)
  {
-   int total = 0, cached = 0, nr_ns;
+   unsigned long total = 0, nr_ns;
struct ldlm_namespace *ns;
 +  struct ldlm_namespace *ns_old = NULL; /* loop detection */
void *cookie;
  
-   if (client == LDLM_NAMESPACE_CLIENT && nr != 0 &&
-   !(gfp_mask & __GFP_FS))
-   return -1;
+   if (client == LDLM_NAMESPACE_CLIENT && !(gfp_mask & __GFP_FS))
+   return 0;
  
-   CDEBUG(D_DLMTRACE, "Request to shrink %d %s locks from all pools\n",
-  nr, client == LDLM_NAMESPACE_CLIENT ? "client" : "server");
+   CDEBUG(D_DLMTRACE, "Request to count %s locks from all pools\n",
+  client == LDLM_NAMESPACE_CLIENT ? "client" : "server");
  
cookie = cl_env_reenter();
  
@@@ -1094,8 -1080,8 +1096,8 @@@ static unsigned long ldlm_pools_scan(ld
/*
 * Shrink at least ldlm_namespace_nr(client) namespaces.
 */
-   for (nr_ns = ldlm_namespace_nr_read(client) - nr_ns;
-nr_ns > 0; nr_ns--)
 -  for (tmp = nr_ns = atomic_read(ldlm_namespace_nr(client));
++  for (tmp = nr_ns = ldlm_namespace_nr_read(client) - nr_ns;
+tmp > 0; tmp--)
{
int cancel, nr_locks;
  
@@@ -1125,26 -1108,36 +1124,36 @@@
ldlm_namespace_put(ns);
}
cl_env_reexit(cookie);
-   /* we only decrease the SLV in server pools shrinker, return -1 to
-* kernel to avoid needless loop. LU-1128 */
-   return (client == LDLM_NAMESPACE_SERVER) ? -1 : cached;
+   /*
+* we only decrease the SLV in server pools shrinker, return
+* SHRINK_STOP to kernel to avoid needless loop. LU-1128
+*/
+   return (client == LDLM_NAMESPACE_SERVER) ? SHRINK_STOP : freed;
+ }
+ 
+ static unsigned long ldlm_pools_srv_count(struct shrinker *s, struct shrink_control *sc)
+ {
+   return ldlm_pools_count(LDLM_NAMESPACE_SERVER, sc->gfp_mask);
  }
  
- static int ldlm_pools_srv_shrink(SHRINKER_ARGS(sc, nr_to_scan, gfp_mask))
+ static unsigned long ldlm_pools_srv_scan(struct shrinker *s, struct shrink_control *sc)
  {
-   return ldlm_pools_shrink(LDLM_NAMESPACE_SERVER,
-shrink_param(sc, nr_to_scan),
-shrink_param(sc, gfp_mask));
+   return ldlm_pools_scan(LDLM_NAMESPACE_SERVER, sc->nr_to_scan,
+  sc->gfp_mask);
  }
  
- static int ldlm_pools_cli_shrink(SHRINKER_ARGS(sc, nr_to_scan, gfp_mask))
+ static unsigned long ldlm_pools_cli_count(struct shrinker *s, struct shrink_control *sc)
  {
-   return ldlm_pools_shrink(LDLM_NAMESPACE_CLIENT,
-shrink_param(sc, nr_to_scan),
-

Re: [PATCH net-next] tuntap: hardware vlan tx support

2013-07-23 Thread Jason Wang

On 07/23/2013 11:53 PM, Stephen Hemminger wrote:
> On Tue, 23 Jul 2013 15:15:48 +0800
> Jason Wang  wrote:
>
>> +struct {
>> +__be16 h_vlan_proto;
>> +__be16 h_vlan_TCI;
>> +} veth;
> Don't you want to use struct vlan_hdr here? 

There's no need to care about the encapsulated proto here. In fact, we just
emulate the hardware insertion of the 802.1Q header, so only skb->vlan_tci
and skb->vlan_proto need to be handled.
> Your definition puts the two fields out of order?

Its order is the same as in struct vlan_ethhdr. Did you see any issue?
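
For reference, a minimal userspace sketch of the 4-byte 802.1Q tag layout under discussion: the tag protocol identifier first, then the tag control information, both in network byte order. The names `vlan_tag`, `make_vlan_tag` and `vlan_tag_vid` are illustrative only and are not taken from the patch.

```c
#include <stdint.h>
#include <arpa/inet.h> /* htons, ntohs */

/* Illustrative stand-in for the anonymous struct in the patch:
 * the protocol identifier comes first, then the tag control info. */
struct vlan_tag {
	uint16_t h_vlan_proto; /* TPID, network byte order (0x8100 for 802.1Q) */
	uint16_t h_vlan_TCI;   /* PCP (3 bits) | DEI (1 bit) | VID (12 bits) */
};

/* Build a tag from host-order values; pcp and vid are masked to size. */
static struct vlan_tag make_vlan_tag(uint16_t proto, uint8_t pcp, uint16_t vid)
{
	struct vlan_tag tag;

	tag.h_vlan_proto = htons(proto);
	tag.h_vlan_TCI = htons((uint16_t)(((pcp & 0x7u) << 13) | (vid & 0x0fffu)));
	return tag;
}

/* Extract the VLAN ID from a tag held in wire format. */
static uint16_t vlan_tag_vid(const struct vlan_tag *tag)
{
	return ntohs(tag->h_vlan_TCI) & 0x0fffu;
}
```

With this layout, copying the struct straight into a frame after the two MAC addresses produces the tag in wire order, which is the ordering struct vlan_ethhdr uses for the same two fields.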
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RESEND PATCH] microblaze: Fix clone syscall

2013-07-23 Thread Michal Simek

Microblaze was assigned to the CLONE_BACKWARDS type, where the
parent tid is passed via the 3rd argument.
Microblaze glibc uses the 4th argument for it.

Create a new CLONE_BACKWARDS3 type where stack_size is passed
via the 3rd argument, the parent thread id pointer via the 4th,
the child thread id pointer via the 5th and the tls value as the 6th
argument.

Signed-off-by: Michal Simek 
---
Hi Al,

We found this problem while debugging a timer_create() issue
reported by a customer, which wasn't covered by LTP testing.
What tool do you use for syscall testing?

IIRC there was some discussion about adding syscall tests
directly to the kernel.

I am not sure if there is a more elegant way to fix this
in syscalls.h.

Thanks,
Michal

---
 arch/Kconfig | 6 ++
 arch/microblaze/Kconfig  | 2 +-
 include/linux/syscalls.h | 5 +
 kernel/fork.c| 6 ++
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 8d2ae24..1feb169 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -407,6 +407,12 @@ config CLONE_BACKWARDS2
help
  Architecture has the first two arguments of clone(2) swapped.

+config CLONE_BACKWARDS3
+   bool
+   help
+ Architecture has tls passed as the 3rd argument of clone(2),
+ not the 5th one.
+
 config ODD_RT_SIGACTION
bool
help
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index d22a4ec..4fab522 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -28,7 +28,7 @@ config MICROBLAZE
select GENERIC_CLOCKEVENTS
select GENERIC_IDLE_POLL_SETUP
select MODULES_USE_ELF_RELA
-   select CLONE_BACKWARDS
+   select CLONE_BACKWARDS3

 config SWAP
def_bool n
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 4147d70..71d8931 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -802,9 +802,14 @@ asmlinkage long sys_vfork(void);
 asmlinkage long sys_clone(unsigned long, unsigned long, int __user *, int,
   int __user *);
 #else
+#if CONFIG_CLONE_BACKWARDS3
+asmlinkage long sys_clone(unsigned long, unsigned long, int, int __user *,
+ int __user *, int);
+#else
 asmlinkage long sys_clone(unsigned long, unsigned long, int __user *,
   int __user *, int);
 #endif
+#endif

 asmlinkage long sys_execve(const char __user *filename,
const char __user *const __user *argv,
diff --git a/kernel/fork.c b/kernel/fork.c
index 66635c8..da6b699 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1679,6 +1679,12 @@ SYSCALL_DEFINE5(clone, unsigned long, newsp, unsigned long, clone_flags,
 int __user *, parent_tidptr,
 int __user *, child_tidptr,
 int, tls_val)
+#elif defined(CONFIG_CLONE_BACKWARDS3)
+SYSCALL_DEFINE6(clone, unsigned long, clone_flags, unsigned long, newsp,
+   int, stack_size,
+   int __user *, parent_tidptr,
+   int __user *, child_tidptr,
+   int, tls_val)
 #else
 SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
 int __user *, parent_tidptr,
--
1.8.2.3
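
The argument orders that the patch distinguishes can be summarized with a small userspace sketch. This is only an illustration: the kernel picks one layout at build time through Kconfig, while the hypothetical `decode_clone()` below switches at run time, and `clone_layout` and `clone_named_args` are names invented for the example. The orders themselves follow the SYSCALL_DEFINE and prototype lines in the diff above.

```c
#include <string.h>

/* Hypothetical run-time tag for the four build-time layouts. */
enum clone_layout {
	LAYOUT_DEFAULT,
	LAYOUT_BACKWARDS,
	LAYOUT_BACKWARDS2,
	LAYOUT_BACKWARDS3,
};

struct clone_named_args {
	unsigned long clone_flags;
	unsigned long newsp;
	unsigned long stack_size;
	unsigned long parent_tidptr;
	unsigned long child_tidptr;
	unsigned long tls_val;
};

/* Map raw positional syscall arguments a[0..5] to named fields. */
static struct clone_named_args decode_clone(enum clone_layout layout,
					    const unsigned long a[6])
{
	struct clone_named_args n;

	memset(&n, 0, sizeof(n));
	switch (layout) {
	case LAYOUT_BACKWARDS:  /* flags, sp, parent_tid, tls, child_tid */
		n.clone_flags = a[0];
		n.newsp = a[1];
		n.parent_tidptr = a[2];
		n.tls_val = a[3];
		n.child_tidptr = a[4];
		break;
	case LAYOUT_BACKWARDS2: /* sp, flags, parent_tid, child_tid, tls */
		n.newsp = a[0];
		n.clone_flags = a[1];
		n.parent_tidptr = a[2];
		n.child_tidptr = a[3];
		n.tls_val = a[4];
		break;
	case LAYOUT_BACKWARDS3: /* flags, sp, stack_size, parent_tid, child_tid, tls */
		n.clone_flags = a[0];
		n.newsp = a[1];
		n.stack_size = a[2];
		n.parent_tidptr = a[3];
		n.child_tidptr = a[4];
		n.tls_val = a[5];
		break;
	default:                /* flags, sp, parent_tid, child_tid, tls */
		n.clone_flags = a[0];
		n.newsp = a[1];
		n.parent_tidptr = a[2];
		n.child_tidptr = a[3];
		n.tls_val = a[4];
		break;
	}
	return n;
}
```

Decoding the same six values under LAYOUT_BACKWARDS and LAYOUT_BACKWARDS3 shows the mismatch described in the commit message: the parent tid pointer is read from the 3rd slot in one case and from the 4th in the other.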




Re: [PATCH] drivers: mfd: mfd-core: disable irq_domain related code when 'HAVE_GENERIC_HARDIRQS' disabled.

2013-07-23 Thread Chen Gang

On 07/24/2013 01:02 PM, Heiko Carstens wrote:
> On Wed, Jul 24, 2013 at 11:33:04AM +0800, Chen Gang wrote:
>> > 'irq_domain' depends on hard irqs, so for architectures which have
>> > no hard irqs but still need mfd (e.g. s390), the related code needs
>> > to be disabled, or compilation fails.
>> > 
>> > The related commit:
>> > 
>> >   "c94bb23 mfd: Make MFD core code Device Tree and IRQ domain aware"
>> > 
>> > The related error: (with allmodconfig under s390)
>> > 
>> >   ERROR: "irq_create_mapping" [drivers/mfd/mfd-core.ko] undefined!
>> > 
>> > 
>> > Signed-off-by: Chen Gang 
> s390 will have GENERIC_HARDIRQS soon (very likely next merge window),
> so let's not add more GENERIC_HARDIRQS ifdefs in the code.
> 

OK, thanks.

-- 
Chen Gang

[PATCH] video: xilinxfb: Fix compilation warning

2013-07-23 Thread Michal Simek

regs_phys is phys_addr_t (u32 or u64).
Cast it to unsigned long long and print it with %llx.

Fixes compilation warning introduced by:
video: xilinxfb: Use drvdata->regs_phys instead of physaddr
(sha1: c88fafef0135e1e1c3e23c3e32ccbeeabc587f81)

Signed-off-by: Michal Simek 
---
ppc44x_defconfig
Fixes regressions in v3.11-rc2

---
 drivers/video/xilinxfb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/xilinxfb.c b/drivers/video/xilinxfb.c
index f3d4a69..79175a6 100644
--- a/drivers/video/xilinxfb.c
+++ b/drivers/video/xilinxfb.c
@@ -341,8 +341,8 @@ static int xilinxfb_assign(struct platform_device *pdev,

if (drvdata->flags & BUS_ACCESS_FLAG) {
/* Put a banner in the log (for DEBUG) */
-   dev_dbg(dev, "regs: phys=%x, virt=%p\n", drvdata->regs_phys,
-   drvdata->regs);
+   dev_dbg(dev, "regs: phys=%llx, virt=%p\n",
+   (unsigned long long)drvdata->regs_phys, drvdata->regs);
}
/* Put a banner in the log (for DEBUG) */
dev_dbg(dev, "fb: phys=%llx, virt=%p, size=%x\n",
--
1.8.2.3
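
The pattern used by the fix, sketched in userspace C: when a typedef can be 32 or 64 bits wide depending on configuration, cast to the widest type and use the matching format specifier. `demo_phys_addr_t`, `DEMO_PHYS_ADDR_64BIT` and `format_regs_phys()` are hypothetical names for this example, and snprintf() stands in for dev_dbg().

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Stand-in for a configuration-dependent address type. */
#ifdef DEMO_PHYS_ADDR_64BIT
typedef uint64_t demo_phys_addr_t;
#else
typedef uint32_t demo_phys_addr_t;
#endif

/* Format the address the same way for either width: the cast makes the
 * argument match %llx no matter how demo_phys_addr_t is defined. */
static int format_regs_phys(char *buf, size_t len, demo_phys_addr_t phys)
{
	return snprintf(buf, len, "regs: phys=%llx", (unsigned long long)phys);
}
```

Building the sketch with and without -DDEMO_PHYS_ADDR_64BIT should give the same output and no format warning, which is the property the patch restores for the driver.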




Re: [ 07/11] MAINTAINERS: add stable_kernel_rules.txt to stable maintainer information

2013-07-23 Thread Greg Kroah-Hartman

On Wed, Jul 24, 2013 at 06:09:29AM +0100, Ben Hutchings wrote:
> On Thu, 2013-07-11 at 15:11 -0700, Greg Kroah-Hartman wrote:
> > 3.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Greg Kroah-Hartman 
> > 
> > commit 7b175c46720f8e6b92801bb634c93d1016f80c62 upstream.
> > 
> > This hopefully will help point developers to the proper way that patches
> > should be submitted for inclusion in the stable kernel releases.
> [...]
> 
> I've queued this up for 3.2.  And, since there was fuzz, I also picked
> your email address update (commit 879a5a001b62).

Thanks for that.

greg k-h

Re: [PATCH] ARM: dts: Add USBPHY nodes to Exynos4x12

2013-07-23 Thread Sachin Kamat

Hi Dongjin,

On 23 July 2013 23:01, Dongjin Kim  wrote:
> This patch adds device nodes for USBPHY to Exynos4x12.
>
> Signed-off-by: Dongjin Kim 
> ---
>  arch/arm/boot/dts/exynos4x12.dtsi |   18 ++
>  1 file changed, 18 insertions(+)
>
> diff --git a/arch/arm/boot/dts/exynos4x12.dtsi b/arch/arm/boot/dts/exynos4x12.dtsi
> index 01da194..9c3335b 100644
> --- a/arch/arm/boot/dts/exynos4x12.dtsi
> +++ b/arch/arm/boot/dts/exynos4x12.dtsi
> @@ -73,4 +73,22 @@
> clock-names = "sclk_fimg2d", "fimg2d";
> status = "disabled";
> };
> +
> +   usbphy@125B0 {

Extra 0 above.

> +   #address-cells = <1>;
> +   #size-cells = <1>;
> +   compatible = "samsung,exynos4x12-usb2phy";
> +   reg = <0x125B 0x100>;
> +   ranges;
> +
> +   clocks = < 2>, < 305>;
> +   clock-names = "xusbxti", "otg";
> +   status = "disabled";
> +
> +   usbphy-sys {
> +   /* USB device and host PHY_CONTROL registers */
> +   reg = <0x10020704 0xc>,
> + <0x1001021c 0x4>;
> +   };
> +   };
>  };

Please add this node after the tmu node (this satisfies alphabetical order
as well as increasing address value).


-- 
With warm regards,
Sachin

Re: Linux 3.11-rc2

2013-07-23 Thread Kalle Valo

"Rafael J. Wysocki"  writes:

> Well, please try to revert commits efaa14c and 8c5bd7a (in this order) and see
> if that helps.

On 3.11-rc2-wl (from wireless-testing.git) my Thinkpad X230 display went
really dark every time during resume and I had to blindly write to sysfs
to be able to see anything. With 3.10-wl I didn't see this. Reverting
the commits above fixed the issue for me.

I'm using Ubuntu 12.04 64 bit. More info about my laptop:

  thinkpad_acpi: ThinkPad ACPI Extras v0.24
  thinkpad_acpi: http://ibm-acpi.sf.net/
  thinkpad_acpi: ThinkPad BIOS G2ET82WW (2.02 ), EC unknown
  thinkpad_acpi: Lenovo ThinkPad X230, model 2324JB2
  thinkpad_acpi: detected a 8-level brightness capable ThinkPad
  thinkpad_acpi: radio switch found; radios are enabled
  thinkpad_acpi: This ThinkPad has standard ACPI backlight brightness
  control, supported by the ACPI video driver
  thinkpad_acpi: Disabling thinkpad-acpi brightness events by default...
  thinkpad_acpi: rfkill switch tpacpi_bluetooth_sw: radio is unblocked
  thinkpad_acpi: rfkill switch tpacpi_wwan_sw: radio is unblocked
  thinkpad_acpi: Standard ACPI backlight interface available, not
  loading native one
  thinkpad_acpi: Console audio control enabled, mode: monitor (read
  only)
  input: ThinkPad Extra Buttons as
  /devices/platform/thinkpad_acpi/input/input5

-- 
Kalle Valo

Re: [ 07/11] MAINTAINERS: add stable_kernel_rules.txt to stable maintainer information

2013-07-23 Thread Ben Hutchings

On Thu, 2013-07-11 at 15:11 -0700, Greg Kroah-Hartman wrote:
> 3.4-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Greg Kroah-Hartman 
> 
> commit 7b175c46720f8e6b92801bb634c93d1016f80c62 upstream.
> 
> This hopefully will help point developers to the proper way that patches
> should be submitted for inclusion in the stable kernel releases.
[...]

I've queued this up for 3.2.  And, since there was fuzz, I also picked
your email address update (commit 879a5a001b62).

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.



Re: [PATCH 1/3 v6] cpufreq: Add debugfs directory for cpufreq

2013-07-23 Thread Viresh Kumar

On 24 July 2013 06:55, Chanwoo Choi  wrote:
> On 07/22/2013 07:11 PM, Viresh Kumar wrote:
>> On 18 July 2013 16:47, Chanwoo Choi  wrote:

>>> +static void cpufreq_remove_debugfs_dir(struct cpufreq_policy *policy,
>>> +  unsigned int cpu)
>>> +{
>>> +   unsigned int idx = cpumask_weight(policy->cpus) > 1 ? cpu : 0;
>>> +
>>> +   if (!policy->cpu_debugfs[idx])
>>> +   return;
>>> +
>>> +   debugfs_remove_recursive(policy->cpu_debugfs[idx]);
>>
>> Why do we need recursive here? And what exactly will recursive
>> do?
>>
>
> If cpu is the last user of the policy, __cpufreq_remove_dev() has to remove the
> debugfs directory and the child files/directories of the root debugfs directory.
> So, I used the debugfs_remove_recursive() function.

You are calling this routine even when we aren't at the last cpu of a policy.
And so, eventually, you are calling this routine for a link you have created.

Have you actually tested your code? On what kind of platform? What is the cpu
topology? And what exactly did you test?

We are already on v6 and this patch still looks like a v1. It still has lots
of basic mistakes, which I don't expect this late in the series.

It's very difficult for me to review the same patchset again and again. Normally
people might not review it so thoroughly after v3-v4 and just trust the sender,
but I am nowhere close to that point, and so I am discouraged from reviewing it.

Please review and test it well on multiple kinds of systems if possible. Test on
your Intel laptop and see if it has multiple policy structures with multiple
cpus in them. cpuX/cpufreq/related_cpus lists all cpus that share a policy
structure.

>>> +}
>>> +
>>
>> same problem here too.
>>> +static void cpufreq_move_debugfs_dir(struct cpufreq_policy *policy,
>>> +unsigned int new_cpu)
>>> +{
>>> +   struct dentry *old_entry, *new_entry;
>>> +   char new_dir_name[CPUFREQ_NAME_LEN];
>>> +   unsigned int j, old_cpu = policy->cpu;
>>> +
>>> +   if (!policy->cpu_debugfs[new_cpu])
>>> +   return;
>>> +
>>> +   /*
>>> +* Remove symbolic link of debugfs directory except for debugfs
>>> +* directory of old_cpu.
>>> +*/
>>> +   for_each_present_cpu(j) {
>>> +   if (old_cpu == j)
>>> +   continue;
>>> +
>>> +   debugfs_remove(policy->cpu_debugfs[j]);
>>
>> Why you need this? We aren't removing the earlier dentry at all here.

You haven't answered this.

>>> +   if (!new_entry) {
>>> +   pr_err("changing debugfs directory name failed\n");
>>> +   goto err_rename;
>>> +   }
>>> +
>>> +   policy->cpu_debugfs[new_cpu] = new_entry;
>>> +   policy->cpu_debugfs[old_cpu] = NULL;
>>> +
>>> +   /* Create again symbolic link of debugfs directory */
>>> +   for_each_present_cpu(j) {
>>
>> present_cpu?? We discussed this before.. You will break multi cluster
>> systems.
>
> My mistake. I'll use for_each_cpu() macro instead of for_each_present_cpu().

Go through the earlier comments about this; this is still wrong. You need to
iterate over the cpus that are in this policy, i.e. policy->cpus.
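
A toy userspace sketch of the difference, under stated assumptions: a plain bitmask stands in for struct cpumask, and `toy_cpumask_t`, `TOY_NR_CPUS` and `visit_policy_cpus()` are made-up names. The loop visits only the CPUs whose bit is set in the policy's own mask, so CPUs that are present but belong to another cluster's policy are never touched.

```c
/* Toy stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef unsigned int toy_cpumask_t;

#define TOY_NR_CPUS 8

/*
 * Collect the CPUs of one policy, skipping the CPU that owns the real
 * directory. Returns the number of CPUs written to out (at most max).
 */
static int visit_policy_cpus(toy_cpumask_t policy_cpus, unsigned int skip_cpu,
			     unsigned int *out, int max)
{
	unsigned int cpu;
	int n = 0;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!(policy_cpus & (1u << cpu)))
			continue; /* not part of this policy */
		if (cpu == skip_cpu)
			continue; /* owner keeps the directory itself */
		if (n < max)
			out[n++] = cpu;
	}
	return n;
}
```

In the kernel the same shape would come from iterating with for_each_cpu() over policy->cpus rather than from a hand-written bit test.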

>>> +   if (new_cpu == j)
>>> +   continue;
>>> +

>>> @@ -1894,6 +2065,8 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
>>> cpufreq_driver = driver_data;
>>> write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>>>
>>> +   cpufreq_create_debugfs();
>>
>> Why you moved this to register_driver? It was fine at cpufreq_core_init()
>
> If we moved this to cpufreq_core_init(), I would have to create cpufreq_core_exit().
> Do you agree with creating cpufreq_core_exit()?

No you don't need that routine. Or in other words there isn't any exit
for cpufreq core and so this directory must not be removed.

Re: [PATCH] drivers: mfd: mfd-core: disable irq_domain related code when 'HAVE_GENERIC_HARDIRQS' disabled.

2013-07-23 Thread Heiko Carstens

On Wed, Jul 24, 2013 at 11:33:04AM +0800, Chen Gang wrote:
> 'irq_domain' depends on hard irqs, so for architectures which have
> no hard irqs but still need mfd (e.g. s390), the related code needs
> to be disabled, or compilation fails.
> 
> The related commit:
> 
>   "c94bb23 mfd: Make MFD core code Device Tree and IRQ domain aware"
> 
> The related error: (with allmodconfig under s390)
> 
>   ERROR: "irq_create_mapping" [drivers/mfd/mfd-core.ko] undefined!
> 
> 
> Signed-off-by: Chen Gang 

s390 will have GENERIC_HARDIRQS soon (very likely next merge window),
so let's not add more GENERIC_HARDIRQS ifdefs in the code.


Re: Linux 3.11-rc2 (acpi backlight)

2013-07-23 Thread Steven Newbury

On Wed, 2013-07-24 at 02:05 +0200, Rafael J. Wysocki wrote:
> On Tuesday, July 23, 2013 11:46:29 AM Kamal Mostafa wrote:
> > On Mon, 2013-07-22 at 21:54 +0200, Rafael J. Wysocki wrote:
> > > On Monday, July 22, 2013 11:11:54 AM Linus Torvalds wrote:
> > > > On Mon, Jul 22, 2013 at 6:02 AM, Rafael J. Wysocki  wrote:
> > > > >
> > > > > Linus, do you want me to send a pull request reverting 8c5bd7a and efaa14c?
> > > > 
> > > > Yes, but [...] I'd suggest doing the revert just in time for
> > > > rc3, but waiting until then to gather info about people who see
> > > > breakage.
> > > > 
> > > > Sound like a plan?
> > > 
> > > Yes, it does.
> > > 
> > > Rafael
> > 
> > 
> > Hi Rafael-
> > 
> > For your reference...
> > 
> > As James Hogan reported, those ACPI changes break backlight control on
> > the "Dell XPS13" Ivy Bridge models (the Sandy Bridge XPS13 model is not
> > affected).
> > 
> > I confirm that reverting 8c5bd7a and efaa14c fixes it again.
> 
> Thanks!
> 
> I'd like to collect some information on the systems having problems with those
> two commits (to see if they are similar somehow).
> 
> It seems that one common symptom is that brightness cannot be controlled
> through function keys.  Is that correct for all of you?  If so, did you try
> any other way to control brightness, like a GUI-based?
> 
> Also, can you all please send me (a) the output of dmidecode and (b) the
> contents of /proc/cpuinfo from your systems?
> 
> Rafael
> 
> 

Attached.

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 58
model name  : Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz
stepping: 9
microcode   : 0x15
cpu MHz : 2212.000
cache size  : 8192 KB
physical id : 0
siblings: 8
core id : 0
cpu cores   : 4
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm 
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc 
aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 
cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave 
avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi 
flexpriority ept vpid fsgsbase smep erms
bogomips: 5587.12
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model   : 58
model name  : Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz
stepping: 9
microcode   : 0x15
cpu MHz : 1960.000
cache size  : 8192 KB
physical id : 0
siblings: 8
core id : 1
cpu cores   : 4
apicid  : 2
initial apicid  : 2
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm 
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc 
aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 
cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave 
avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi 
flexpriority ept vpid fsgsbase smep erms
bogomips: 5587.12
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 2
vendor_id   : GenuineIntel
cpu family  : 6
model   : 58
model name  : Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz
stepping: 9
microcode   : 0x15
cpu MHz : 2716.000
cache size  : 8192 KB
physical id : 0
siblings: 8
core id : 2
cpu cores   : 4
apicid  : 4
initial apicid  : 4
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm 
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc 
aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 
cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave 
avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi 
flexpriority ept vpid fsgsbase smep erms
bogomips: 5587.12
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 3
vendor_id   : GenuineIntel
cpu family  : 6
model   : 58
model name  : Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz
stepping: 9
microcode   : 0x15
cpu

Re: [PATCH 1/4] x86: introduce hypervisor_cpuid_base()

2013-07-23 Thread H. Peter Anvin

On 07/23/2013 09:44 PM, Jason Wang wrote:
> 
> Since it's just a minor optimization, how about we just keep using
> strcmp()?
> 

It's more that it enables the rest of the cleanup, making the code
easier to read.

-hpa


Re: [PATCH 4/4] x86: properly handle kvm emulation of hyperv

2013-07-23 Thread H. Peter Anvin

On 07/23/2013 09:37 PM, Jason Wang wrote:
> On 07/23/2013 10:48 PM, H. Peter Anvin wrote:
>> On 07/23/2013 06:55 AM, KY Srinivasan wrote:
>>> This strategy of hypervisor detection based on some detection order is, IMHO,
>>> not a robust detection strategy. The current scheme works since the only
>>> hypervisor emulated (by other hypervisors) happens to be Hyper-V. What if
>>> this were to change?
>>>
>> One strategy would be to pick the *last* one in the CPUID list, since
>> the ones before it are logically the one(s) being emulated...
>>
>>  -hpa
>>
> 
> How about simply doing a reverse loop from 0x4001 to 0x4001?
> 

Not all systems like being poked too far into hyperspace.  Just remember
the last match and walk the list.

-hpa
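
A minimal sketch of the "remember the last match" approach, written against a hypothetical table instead of real CPUID reads so that it can run anywhere. `fake_leaf`, `known_sigs` and `last_known_leaf()` are names invented for this example. The walk goes forward through the leaves in order and overwrites the result on every recognized signature, so the final value is the last match and no leaf beyond the listed range is ever probed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HV_SIG_LEN 12

/* Hypothetical record standing in for one hypervisor CPUID leaf. */
struct fake_leaf {
	uint32_t base;   /* leaf number the signature was read from */
	const char *sig; /* at least HV_SIG_LEN bytes */
};

/* Signatures this sketch recognizes, each HV_SIG_LEN bytes long. */
static const char *const known_sigs[] = {
	"KVMKVMKVM\0\0",
	"Microsoft Hv",
	"XenVMMXenVMM",
};

/* Return the index of the last leaf with a known signature, or -1. */
static int last_known_leaf(const struct fake_leaf *leaves, size_t n)
{
	size_t i, k;
	int last = -1;

	for (i = 0; i < n; i++) {
		for (k = 0; k < sizeof(known_sigs) / sizeof(known_sigs[0]); k++) {
			if (memcmp(leaves[i].sig, known_sigs[k], HV_SIG_LEN) == 0)
				last = (int)i; /* later matches replace earlier ones */
		}
	}
	return last;
}
```

For a guest where one hypervisor emulates another, the emulated interface appears first and the real one after it, so the last match identifies the real hypervisor.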


Re: Re: [PATCH] ceph: Don't use ceph-sync-mode for synchronous-fs.

2013-07-23 Thread majianpeng

>Hi,
>
>Sorry for the slow review.  The patch looks good, but I have a hard time 
>understanding your changelog... is it okay if I change it to something 
>like:
>
>
>Sending reads and writes through the sync read/write paths bypasses the 
>page cache, which is not expected or generally a good idea.  Removing 
>the write check is safe as there is a conditional vfs_fsync_range() later 
>in ceph_aio_write that already checks for the same flag (via 
>IS_SYNC(inode)).
>
Very good.
That was my fault. I will pay more attention to the changelog in the future.

Thanks !
Jianpeng Ma
>?
>
>Thanks!
>sage
>
>
>On Wed, 24 Jul 2013, majianpeng wrote:
>
>> Ping
>> >Hi sage,
>> >How about this patch?Can you give some advisement?
>> >Thanks!
>> >Jianpeng Ma
>> >>At now for synchronous-fs, all write-operations use ceph_sync_mode.
>> >>But for the file which opened with O_SYNC, it don't use sync_mode.
>> >>The behaviour of them should be the same.
>> >>For fs which mounted using '-o sync', it want all I/O to the filesystem
>> >>should be done synchronously.But the ceph-sync-mode don't be suitful
>> >>for.For example,using ceph-sync-mode the content of file don't have in
>> >>memory.This will cause the following read only from osd rather than
>> >>memory.
>> >>
>> >>Signed-off-by: Jianpeng Ma 
>> >>---
>> >> fs/ceph/file.c | 2 --
>> >> 1 file changed, 2 deletions(-)
>> >>
>> >>diff --git a/fs/ceph/file.c b/fs/ceph/file.c
>> >>index 656e169..44670ad 100644
>> >>--- a/fs/ceph/file.c
>> >>+++ b/fs/ceph/file.c
>> >>@@ -659,7 +659,6 @@ again:
>> >> 
>> >>   if ((got & (CEPH_CAP_FILE_CACHE|CEPH_CAP_FILE_LAZYIO)) == 0 ||
>> >>   (iocb->ki_filp->f_flags & O_DIRECT) ||
>> >>-  (inode->i_sb->s_flags & MS_SYNCHRONOUS) ||
>> >>   (fi->flags & CEPH_F_SYNC))
>> >>   /* hmm, this isn't really async... */
>> >>   ret = ceph_sync_read(filp, base, len, ppos, &checkeof);
>> >>@@ -764,7 +763,6 @@ retry_snap:
>> >> 
>> >>   if ((got & (CEPH_CAP_FILE_BUFFER|CEPH_CAP_FILE_LAZYIO)) == 0 ||
>> >>   (iocb->ki_filp->f_flags & O_DIRECT) ||
>> >>-  (inode->i_sb->s_flags & MS_SYNCHRONOUS) ||
>> >>   (fi->flags & CEPH_F_SYNC)) {
>> >>   mutex_unlock(&inode->i_mutex);
>> >>   written = ceph_sync_write(file, iov->iov_base, count,
>> >>-- 
>> >>1.8.1.2

Re: [PATCH 1/4] x86: introduce hypervisor_cpuid_base()

2013-07-23 Thread Jason Wang

On 07/24/2013 12:03 AM, H. Peter Anvin wrote:
> On 07/23/2013 04:16 AM, Paolo Bonzini wrote:
>> That's nicer, though strcmp is what the replaced code used to do in
>> patches 2 and 3.
>>
>> Note that memcmp requires the caller to use "KVMKVMKVM\0\0" as the
>> signature (or alternatively hypervisor_cpuid_base can copy the argument
>> into another 12-byte local variable).
>>
> Which is the actual signature, though...
>
>   -hpa
>
>

Since it's just a minor optimization, how about we just keep using
strcmp()?
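
The two comparison styles discussed in this thread, sketched in userspace C. The helper names (`pack_signature`, `sig_matches_memcmp`, `sig_matches_strcmp`) are hypothetical, and the three register values are passed in as plain integers instead of being read with CPUID. The memcmp() variant compares exactly 12 bytes, so a 9-character name has to be padded with NULs by the caller; the strcmp() variant copies the bytes into a 13-byte buffer and terminates it first.

```c
#include <stdint.h>
#include <string.h>

#define HV_SIG_LEN 12

/* Lay out three 32-bit register values as one 12-byte signature. */
static void pack_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
			   char out[HV_SIG_LEN])
{
	memcpy(out, &ebx, 4);
	memcpy(out + 4, &ecx, 4);
	memcpy(out + 8, &edx, 4);
}

/* memcmp style: sig12 must supply all 12 bytes, e.g. "KVMKVMKVM\0\0". */
static int sig_matches_memcmp(const char raw[HV_SIG_LEN], const char *sig12)
{
	return memcmp(raw, sig12, HV_SIG_LEN) == 0;
}

/* strcmp style: terminate a local copy, then compare as C strings. */
static int sig_matches_strcmp(const char raw[HV_SIG_LEN], const char *sig)
{
	char buf[HV_SIG_LEN + 1];

	memcpy(buf, raw, HV_SIG_LEN);
	buf[HV_SIG_LEN] = '\0';
	return strcmp(buf, sig) == 0;
}
```

Note that the literal "KVMKVMKVM\0\0" occupies exactly 12 bytes once the compiler's implicit terminator is counted, which is why it works as the memcmp() argument.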

linux-next: build failure after merge of the tty tree

2013-07-23 Thread Stephen Rothwell

Hi Greg,

After merging the tty tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

drivers/tty/n_tty.c: In function 'n_tty_close':
drivers/tty/n_tty.c:1757:2: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
  vfree(ldata);
  ^
drivers/tty/n_tty.c: In function 'n_tty_open':
drivers/tty/n_tty.c:1776:2: error: implicit declaration of function 'vmalloc' [-Werror=implicit-function-declaration]
  ldata = vmalloc(sizeof(*ldata));
  ^
drivers/tty/n_tty.c:1776:8: warning: assignment makes pointer from integer without a cast [enabled by default]
  ldata = vmalloc(sizeof(*ldata));
        ^

Caused by commit 20bafb3d23d1 ("n_tty: Move buffers into n_tty_data").
Forgot to include linux/vmalloc.h?

I have used the tty tree from next-20130723 for today.

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/



Re: [PATCH net-next] tuntap: hardware vlan tx support

2013-07-23 Thread Jason Wang

On 07/23/2013 11:17 PM, Stephen Hemminger wrote:
> On Tue, 23 Jul 2013 15:15:48 +0800
> Jason Wang  wrote:
>
>> Inspired by commit f09e2249c4f5c7c13261ec73f5a7807076af0c8e (macvtap: restore
>> vlan header on user read). This patch adds hardware vlan tx support for
>> tuntap. This is done by copying vlan header directly into userspace in
>> tun_put_user() instead of doing it through __vlan_put_tag() in
>> dev_hard_start_xmit(). This eliminates one unnecessary memmove in
>> vlan_insert_tag() for 802.1ad and 802.1q traffic.
>>
>> pktgen test shows about 20% improvement for 802.1q traffic:
>>
>> Before:
>>   662149pps 317Mb/sec (317831520bps) errors: 0
>> After:
>>   801033pps 384Mb/sec (384495840bps) errors: 0
>>
>> Cc: Basil Gor 
>> Cc: Michael S. Tsirkin 
>> Signed-off-by: Jason Wang 
>
> You need to make this configurable by some mechanism, since otherwise
> it will break applications that expect current VLAN behavior
> --
>

I didn't see any breakage. The VLAN ID will always be present in the userspace buffer.

Re: [PATCH 4/4] x86: properly handle kvm emulation of hyperv

2013-07-23 Thread Jason Wang

On 07/23/2013 10:48 PM, H. Peter Anvin wrote:
> On 07/23/2013 06:55 AM, KY Srinivasan wrote:
>> This strategy of hypervisor detection based on some detection order is,
>> IMHO, not a robust detection strategy. The current scheme works since the
>> only hypervisor emulated (by other hypervisors) happens to be Hyper-V.
>> What if this were to change?
>>
> One strategy would be to pick the *last* one in the CPUID list, since
> the ones before it are logically the one(s) being emulated...
>
>   -hpa
>

How about simply doing a reverse loop from 0x4001 to 0x4001?
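
A minimal userspace sketch of such a reverse scan, not the kernel code. It assumes the hypervisor CPUID window runs from 0x40000000 up to (but not including) 0x40010000 in steps of 0x100; `hv_sig_reader`, `hv_last_cpuid_base()` and `demo_reader()` are hypothetical names, with the reader standing in for a real CPUID instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed hypervisor CPUID window; these values are an assumption. */
#define HV_CPUID_FIRST 0x40000000u
#define HV_CPUID_END   0x40010000u /* exclusive */
#define HV_CPUID_STEP  0x100u

/*
 * Hypothetical stand-in for CPUID: fills sig with the 12-byte signature
 * found at a leaf and returns nonzero, or returns 0 if nothing answers.
 */
typedef int (*hv_sig_reader)(uint32_t leaf, char sig[13]);

/*
 * Walk the window from the highest base downwards and return the first
 * base that carries a signature, or 0 if none does.
 */
static uint32_t hv_last_cpuid_base(hv_sig_reader read_sig)
{
    char sig[13];
    uint32_t base;

    assert(read_sig != NULL);
    for (base = HV_CPUID_END - HV_CPUID_STEP; ; base -= HV_CPUID_STEP) {
        memset(sig, 0, sizeof(sig));
        if (read_sig(base, sig) && sig[0] != '\0')
            return base;
        if (base == HV_CPUID_FIRST)
            break; /* stop before the unsigned counter wraps */
    }
    return 0;
}

/* Example: an emulated Hyper-V signature first, the real host one after it. */
static int demo_reader(uint32_t leaf, char sig[13])
{
    if (leaf == 0x40000000u) {
        memcpy(sig, "Microsoft Hv", 12);
        return 1;
    }
    if (leaf == 0x40000100u) {
        memcpy(sig, "KVMKVMKVM\0\0\0", 12);
        return 1;
    }
    return 0;
}
```

With the example reader, the scan settles on the later (non-emulated) base, which is the behaviour hpa's "pick the last one" suggestion is after.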

Re: [PATCH 1/4] x86: introduce hypervisor_cpuid_base()

2013-07-23 Thread Jason Wang

On 07/23/2013 09:48 PM, Gleb Natapov wrote:
> On Tue, Jul 23, 2013 at 05:41:02PM +0800, Jason Wang wrote:
>> This patch introduces hypervisor_cpuid_base(), which loops over the
>> hypervisor existence function until the signature matches and, if
>> required, checks the number of leaves. This could be used by Xen/KVM
>> guests to detect the existence of a hypervisor.
>>
> Looks good to me.
>
> Since this touches common code, kvm and xen I expect this to be taken
> via the tip tree, correct?
>
Yes, I think so.

Re: request for stable inclusion

2013-07-23 Thread Ben Hutchings

On Fri, 2013-06-28 at 17:04 +0800, Yijing Wang wrote:
> Hi Greg, Jiri or Liang Li
> 
> 384e301e3519599b000c1a2ecd938b533fc15d85
> pch_uart: fix a deadlock when pch_uart as console
> 
> This looks applicable to stable-3.4, and it
> built successfully for me. What do you think?

I've queued this up for 3.2 as well, thanks.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.



Re: [RFC PATCH v2] sched: Limit idle_balance()

2013-07-23 Thread Jason Low

On Tue, 2013-07-23 at 16:36 +0530, Srikar Dronamraju wrote:
> > 
> > A potential issue I have found with avg_idle is that it may sometimes
> > not be accurate enough for the purposes of this patch, because it is
> > always capped at a max value (default is 1000000 ns). For example, a CPU
> > could have remained idle for 1 second and avg_idle would be set to 1
> > millisecond. Another question I have is whether we can update avg_idle
> > at all times without putting a maximum value on avg_idle, or increase
> > the maximum value of avg_idle by a lot.
> 
> Maybe the current max value is a limiting factor, but I think there
> should be a limit to the maximum value. Peter and Ingo may help us
> understand why they limited it to 1ms. But I don't think we should
> introduce a new variable just for this.

You're right. As Peter recently mentioned, avg_idle is only used for
idle_balance() anyway, so we should just use the existing avg_idle
estimator.

> > 
> > > Should we take the consideration of whether a idle_balance was
> > > successful or not?
> > 
> > I recently ran fserver on the 8 socket machine with HT-enabled and found
> > that load balance was succeeding at a higher than average rate, but idle
> > balance was still lowering performance of that workload by a lot.
> > However, it makes sense to allow idle balance to run longer/more often
> > when it has a higher success rate.
> > 
> 
> If idle balance did succeed, then it means that the system was indeed
> imbalanced. So idle balance was the right thing to do. Maybe we chose
> the wrong task to pull. Maybe after the NUMA balancing enhancements go in,
> we will pick a better task to pull, at least across nodes. And there could be
> other opportunities/strategies to select the right task to pull.
> 
> Again, schedstats during the application run should give us hints here.
> 
> > > I am not sure what a reasonable value for n would be, but maybe we could
> > > try with n=3.
> > 
> > Based on some of the data I collected, n = 10 to 20 provides much better
> > performance increases.
> > 
> 
> I was saying it the other way. 
> your suggestion is to run idle balance once in n runs .. where n is 10
> to 20. 
> My thinking was to not run idle balance once in n unsuccessful runs. 

When I suggested N is 20, that means that if the average idle time of a
CPU is 1,000,000 ns, then we stop idle balancing if the average cost to
idle balance within a domain would come up to be greater than (1,000,000
ns / 20) = 50,000 ns. In the v2 patch, N helps determine the maximum
duration in which we allow each idle_balance() to run.
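
The arithmetic above can be sketched in plain C. This is only an illustration of the avg_idle / N cap described in this message, not the code from the patch; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Longest idle_balance() cost tolerated for a CPU: avg_idle / N. */
static uint64_t max_idle_balance_cost_ns(uint64_t avg_idle_ns, unsigned int n)
{
    assert(n > 0);
    return avg_idle_ns / n;
}

/*
 * True while the average cost of balancing within a domain stays at or
 * below the cap; balancing stops once the cost is greater than the cap.
 */
static bool should_idle_balance(uint64_t avg_idle_ns, uint64_t avg_cost_ns,
                                unsigned int n)
{
    return avg_cost_ns <= max_idle_balance_cost_ns(avg_idle_ns, n);
}
```

With avg_idle = 1,000,000 ns and N = 20 the cap is 50,000 ns, so an average cost of 40,000 ns still allows balancing and 60,000 ns does not.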

Thanks,
Jason


Re: [PATCH] x86, apic: Enable x2APIC physical when cpu < 256 native

2013-07-23 Thread Yinghai Lu

On Thu, Jul 11, 2013 at 6:22 PM, Youquan Song  wrote:
> x2APIC extends APICID from 8 bits to 32 bits, but the device interrupt routed
> from IOAPIC or delivered in MSI mode will keep 8 bits destination APICID.
> In order to support x2APIC, the VT-d interrupt remapping is introduced to
> translate the destination APICID to 32 bits in x2APIC mode and keep the device
> compatible in this way.
>
> x2APIC supports both logical and physical destination modes.
> In logical destination mode, the 32-bit logical APIC ID has two sub-fields:
> a 16-bit cluster ID and a 16-bit logical ID within the cluster, and x2APIC
> cluster mode requires VT-d interrupt remapping.
> In physical destination mode, the 8-bit physical ID is compatible with the
> 32-bit physical ID when the number of CPUs is < 256.
> When interrupt remapping initialization fails on a platform with fewer than
> 256 CPUs, the current kernel only enables x2APIC physical mode in a
> virtualization environment, while we can also enable x2APIC physical mode
> in a native kernel in this situation; the device interrupt will use an
> 8-bit destination APIC ID in physical mode and be compatible with x2APIC
> physical mode when there are < 256 CPUs.
>
> So we can benefit from x2APIC vs xAPIC MMIO:
>  - x2APIC MSR read/write is faster than xAPIC MMIO.
>  - x2APIC needs only an ICR write to deliver an interrupt, without polling
>    the ICR delivery status bit, whereas xAPIC must poll the ICR delivery
>    status bit.
>  - x2APIC uses one 64-bit ICR access instead of xAPIC's two 32-bit accesses.
>
> Signed-off-by: Youquan Song 
> ---
>  arch/x86/kernel/apic/apic.c |7 ++-
>  1 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 904611b..51a065a 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1603,11 +1603,8 @@ void __init enable_IR_x2apic(void)
> goto skip_x2apic;
>
> if (ret < 0) {
> -   /* IR is required if there is APIC ID > 255 even when running
> -* under KVM
> -*/
> -   if (max_physical_apicid > 255 ||
> -   !hypervisor_x2apic_available()) {
> +   /* IR is required if there is APIC ID > 255 */
> +   if (max_physical_apicid > 255) {
> if (x2apic_preenabled)
> disable_x2apic();
> goto skip_x2apic;

Those are kvm and xen related.

Add more Cc.

Yinghai

Re: [PATCH v2] mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default

2013-07-23 Thread Ingo Molnar


* Toshi Kani  wrote:

> On Tue, 2013-07-23 at 10:01 +0200, Ingo Molnar wrote:
> > * Toshi Kani  wrote:
> > 
> > > > Could we please also fix it to never crash the kernel, even if stupid 
> > > > ranges are provided?
> > > 
> > > > Yes, this probe interface can be enhanced to verify the firmware
> > > > information before adding a given memory address.  However, such a change
> > > > would interfere with its test use of "fake" hotplug, which is the only known
> > > > use-case of this interface on x86.
> > 
> > Not crashing the kernel is not a novel concept even for test interfaces...
> 
> Agreed.
> 
> > Where does the possible crash come from - from using invalid RAM ranges, 
> > right? I.e. on x86 to fix the crash we need to check the RAM is present in 
> > the e820 maps, is marked RAM there, and is not already registered with the 
> > kernel, or so?
> 
> Yes, the crash comes from using invalid RAM ranges.  How to check if the
> RAM is present is different if the system supports hotplug or not.
> 
> > > In order to verify if a given memory address is enabled at run-time (as 
> > > opposed to boot-time), we need to check with ACPI memory device objects 
> > > on x86.  However, system vendors tend to not implement memory device 
> > > objects unless their systems support memory hotplug.  Dave Hansen is 
> > > using this interface for his testing as a way to fake a hotplug event on 
> > > a system that does not support memory hotplug.
> > 
> > All vendors implement e820 maps for the memory present at boot time.
> 
> Yes for boot time.  At run-time, e820 is not guaranteed to represent a
> new memory added. [...]

Yes I know that, the e820 map is boot only.

You claimed that the only purpose of this on x86 was that testing was done 
on non-hotplug systems, using this interface. Non-hotplug systems have 
e820 maps.

> > How does the hotplug event based approach solve double adds? Relies on 
> > the hardware not sending a hot-add event twice for the same memory 
> > area or for an invalid memory area, or does it include fail-safes and 
> > double checks as well to avoid double adds and adding invalid memory? 
> > If yes then that could be utilized here as well.
> 
> At a high level, here is how ACPI memory hotplug works:
> 
> 1. ACPI sends a hotplug event to a new ACPI memory device object that is
> hot-added.
> 2. The kernel is notified, and verifies if the new memory device object
> has not been attached by any handler yet.
> 3. The memory handler is called, and obtains a new memory range from the
> ACPI memory device object. 
> 4. The memory handler calls add_memory() with the new address range.
> 
> Steps 1-4 above proceed automatically within the kernel.  No user
> input (nor sysfs interface) is necessary.  Step 2 prevents double adds
> [...]

If this 'new memory device object' is some ACPI detail then I don't see 
how it protects the kernel from a buggy ACPI implementation double adding 
the same physical memory range.

> and step 3 gets a valid address range from the firmware directly.  Step 
> 4 is basically the same as the "probe" interface, but with all the 
> verification up front, this step is safe.

So what verification does the kernel do to ensure that a buggy ACPI
implementation does not pass us a crappy memory range, such as a double
physical range (represented via separate 'memory device objects'), or a
range overlapping with an existing physical memory range already known to
the kernel, or a totally nonsensical range the CPU cannot even access
physically, etc.?

Also, is there any verification done to make sure that the new memory 
range is actually RAM - i.e. we could write the first and last word of it 
and see whether it gets modified correctly [to keep the sanity check 
fast]?
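
A rough userspace sketch of the kind of range checks being asked for here: the request must be non-empty, lie inside a range the firmware reports as RAM, and not overlap anything already registered. The `mem_range` type, the function names and the demo tables are all hypothetical; this is not how the kernel's probe interface or the ACPI handler is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mem_range {
    uint64_t start; /* inclusive */
    uint64_t end;   /* exclusive */
};

static bool range_overlaps(const struct mem_range *a, const struct mem_range *b)
{
    return a->start < b->end && b->start < a->end;
}

static bool range_contains(const struct mem_range *outer,
                           const struct mem_range *inner)
{
    return outer->start <= inner->start && inner->end <= outer->end;
}

/*
 * Accept a request only if it is non-empty, sits wholly inside one
 * firmware-reported RAM range, and overlaps no already-known range.
 */
static bool probe_range_ok(const struct mem_range *req,
                           const struct mem_range *ram, size_t nr_ram,
                           const struct mem_range *known, size_t nr_known)
{
    bool in_ram = false;
    size_t i;

    assert(req != NULL);
    if (req->start >= req->end)
        return false;
    for (i = 0; i < nr_ram; i++)
        if (range_contains(&ram[i], req))
            in_ram = true;
    if (!in_ram)
        return false;
    for (i = 0; i < nr_known; i++)
        if (range_overlaps(&known[i], req))
            return false; /* would be a double add */
    return true;
}

/* Demo tables: 2 GiB reported as RAM, the first 1 GiB already registered. */
static bool demo_probe(uint64_t start, uint64_t end)
{
    static const struct mem_range ram[] = { { 0x0, 0x80000000ull } };
    static const struct mem_range known[] = { { 0x0, 0x40000000ull } };
    const struct mem_range req = { start, end };

    return probe_range_ok(&req, ram, 1, known, 1);
}
```

A write/read-back test of the first and last word, as suggested above, would be a separate step after these address checks pass.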

Thanks,

Ingo

RE: [PATCH 04/18] MAINTAINERS: ARM: S3C2410: Update patterns

2013-07-23 Thread Kukjin Kim

Joe Perches wrote:
> 
> commit 85fd6d63bf2 ("ARM: S3C2410: move mach-s3c2410/* into mach-s3c24xx/")
> moved the files, update the patterns.
> 
> Signed-off-by: Joe Perches 
> cc: Kukjin Kim 

Acked-by: Kukjin Kim 

Thanks,
Kukjin

> ---
>  MAINTAINERS | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 157e4ee..fbd5a67 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7446,9 +7446,9 @@ P:  Vincent Sanders 
>  M:   Simtec Linux Team 
>  W:   http://www.simtec.co.uk/products/EB2410ITX/
>  S:   Supported
> -F:   arch/arm/mach-s3c2410/mach-bast.c
> -F:   arch/arm/mach-s3c2410/bast-ide.c
> -F:   arch/arm/mach-s3c2410/bast-irq.c
> +F:   arch/arm/mach-s3c24xx/mach-bast.c
> +F:   arch/arm/mach-s3c24xx/bast-ide.c
> +F:   arch/arm/mach-s3c24xx/bast-irq.c
> 
>  TI DAVINCI MACHINE SUPPORT
>  M:   Sekhar Nori 
> --
> 1.8.1.2.459.gbcd45b4.dirty


[PATCH 3/9] spi: tegra114: move to generic dma DT binding

2013-07-23 Thread Richard Zhao

- driver: remove use of nvidia,dma-request-selector
  use dma_request_slave_channel to request channel
- if dmas/dma-names are missing, it still supports cpu based transfer
- update binding doc and specify dmas/dma-names properties as optional

Signed-off-by: Richard Zhao 
---
 .../devicetree/bindings/spi/nvidia,tegra114-spi.txt  | 10 +++---
 drivers/spi/spi-tegra114.c   | 16 +---
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt b/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt
index 91ff771..92e1a9a 100644
--- a/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt
+++ b/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt
@@ -4,11 +4,14 @@ Required properties:
 - compatible : should be "nvidia,tegra114-spi".
 - reg: Should contain SPI registers location and length.
 - interrupts: Should contain SPI interrupts.
-- nvidia,dma-request-selector : The Tegra DMA controller's phandle and
-  request selector for this SPI controller.
 - This is also require clock named "spi" as per binding document
   Documentation/devicetree/bindings/clock/clock-bindings.txt
 
+Optional properties:
+- dmas : The Tegra DMA controller's phandle and request selector for
+  this SPI controller.
+- dma-names : Should be "rx-tx".
+
 Recommended properties:
 - spi-max-frequency: Definition as per
  Documentation/devicetree/bindings/spi/spi-bus.txt
@@ -18,7 +21,8 @@ spi@7000d600 {
compatible = "nvidia,tegra114-spi";
reg = <0x7000d600 0x200>;
interrupts = <0 82 0x04>;
-   nvidia,dma-request-selector = <&apbdma 16>;
+   dmas = <&apbdma 16>;
+   dma-names = "rx-tx";
spi-max-frequency = <2500>;
#address-cells = <1>;
#size-cells = <0>;
diff --git a/drivers/spi/spi-tegra114.c b/drivers/spi/spi-tegra114.c
index e8f542a..baff559 100644
--- a/drivers/spi/spi-tegra114.c
+++ b/drivers/spi/spi-tegra114.c
@@ -177,7 +177,7 @@ struct tegra_spi_data {
void __iomem*base;
phys_addr_t phys;
unsignedirq;
-   int dma_req_sel;
+   booluse_dma;
u32 spi_max_frequency;
u32 cur_speed;
 
@@ -599,11 +599,8 @@ static int tegra_spi_init_dma_param(struct tegra_spi_data *tspi,
dma_addr_t dma_phys;
int ret;
struct dma_slave_config dma_sconfig;
-   dma_cap_mask_t mask;
 
-   dma_cap_zero(mask);
-   dma_cap_set(DMA_SLAVE, mask);
-   dma_chan = dma_request_channel(mask, NULL, NULL);
+   dma_chan = dma_request_slave_channel(tspi->dev, "rx-tx");
if (!dma_chan) {
dev_err(tspi->dev,
"Dma channel is not available, will try later\n");
@@ -618,7 +615,6 @@ static int tegra_spi_init_dma_param(struct tegra_spi_data *tspi,
return -ENOMEM;
}
 
-   dma_sconfig.slave_id = tspi->dma_req_sel;
if (dma_to_memory) {
dma_sconfig.src_addr = tspi->phys + SPI_RX_FIFO;
dma_sconfig.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
@@ -1012,11 +1008,9 @@ static void tegra_spi_parse_dt(struct platform_device *pdev,
struct tegra_spi_data *tspi)
 {
struct device_node *np = pdev->dev.of_node;
-   u32 of_dma[2];
 
-   if (of_property_read_u32_array(np, "nvidia,dma-request-selector",
-   of_dma, 2) >= 0)
-   tspi->dma_req_sel = of_dma[1];
+   if (of_find_property(np, "dmas", NULL))
+   tspi->use_dma = true;
 
if (of_property_read_u32(np, "spi-max-frequency",
&tspi->spi_max_frequency))
@@ -1093,7 +1087,7 @@ static int tegra_spi_probe(struct platform_device *pdev)
tspi->max_buf_size = SPI_FIFO_DEPTH << 2;
tspi->dma_buf_size = DEFAULT_SPI_DMA_BUF_LEN;
 
-   if (tspi->dma_req_sel) {
+   if (tspi->use_dma) {
ret = tegra_spi_init_dma_param(tspi, true);
if (ret < 0) {
dev_err(&pdev->dev, "RxDma Init failed, err %d\n", ret);
-- 
1.8.1.5


[PATCH 1/9] ARM: dts: add generic DMA DT binding for tegra apbdma

2013-07-23 Thread Richard Zhao

All Tegra device drivers will soon move to generic DMA device tree bindings.
Add the required properties to the Tegra DT files to support that. The legacy
property nvidia,dma-request-selector will be removed after all drivers have
been converted, in order to maintain bisectability.

Changes:
 - Add '#dma-cells' for apbdma nodes
 - Add properties 'dmas' and 'dma-names' for apbdma client nodes
 - update apbdma DT binding doc

Signed-off-by: Richard Zhao 
---
 .../devicetree/bindings/dma/tegra20-apbdma.txt |  1 +
 arch/arm/boot/dts/tegra114.dtsi| 27 ++
 arch/arm/boot/dts/tegra20.dtsi | 27 ++
 arch/arm/boot/dts/tegra30.dtsi | 25 
 4 files changed, 80 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/tegra20-apbdma.txt b/Documentation/devicetree/bindings/dma/tegra20-apbdma.txt
index 90fa7da..e4fc695 100644
--- a/Documentation/devicetree/bindings/dma/tegra20-apbdma.txt
+++ b/Documentation/devicetree/bindings/dma/tegra20-apbdma.txt
@@ -5,6 +5,7 @@ Required properties:
 - reg: Should contain DMA registers location and length. This shuld include
   all of the per-channel registers.
 - interrupts: Should contain all of the per-channel DMA interrupts.
+- #dma-cells: Must be <1>, which specifies the dma request.
 
 Examples:
 
diff --git a/arch/arm/boot/dts/tegra114.dtsi b/arch/arm/boot/dts/tegra114.dtsi
index abf6c40..b133c62 100644
--- a/arch/arm/boot/dts/tegra114.dtsi
+++ b/arch/arm/boot/dts/tegra114.dtsi
@@ -81,6 +81,7 @@
 ,
 ;
clocks = <&tegra_car TEGRA114_CLK_APBDMA>;
+   #dma-cells = <1>;
};
 
ahb: ahb {
@@ -125,6 +126,8 @@
reg-shift = <2>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 8>;
+   dmas = <&apbdma 8>;
+   dma-names = "rx-tx";
status = "disabled";
clocks = <&tegra_car TEGRA114_CLK_UARTA>;
};
@@ -135,6 +138,8 @@
reg-shift = <2>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 9>;
+   dmas = <&apbdma 9>;
+   dma-names = "rx-tx";
status = "disabled";
clocks = <&tegra_car TEGRA114_CLK_UARTB>;
};
@@ -145,6 +150,8 @@
reg-shift = <2>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 10>;
+   dmas = <&apbdma 10>;
+   dma-names = "rx-tx";
status = "disabled";
clocks = <&tegra_car TEGRA114_CLK_UARTC>;
};
@@ -155,6 +162,8 @@
reg-shift = <2>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 19>;
+   dmas = <&apbdma 19>;
+   dma-names = "rx-tx";
status = "disabled";
clocks = <&tegra_car TEGRA114_CLK_UARTD>;
};
@@ -227,6 +236,8 @@
reg = <0x7000d400 0x200>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 15>;
+   dmas = <&apbdma 15>;
+   dma-names = "rx-tx";
#address-cells = <1>;
#size-cells = <0>;
clocks = <&tegra_car TEGRA114_CLK_SBC1>;
@@ -239,6 +250,8 @@
reg = <0x7000d600 0x200>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 16>;
+   dmas = <&apbdma 16>;
+   dma-names = "rx-tx";
#address-cells = <1>;
#size-cells = <0>;
clocks = <&tegra_car TEGRA114_CLK_SBC2>;
@@ -251,6 +264,8 @@
reg = <0x7000d800 0x200>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 17>;
+   dmas = <&apbdma 17>;
+   dma-names = "rx-tx";
#address-cells = <1>;
#size-cells = <0>;
clocks = <&tegra_car TEGRA114_CLK_SBC3>;
@@ -263,6 +278,8 @@
reg = <0x7000da00 0x200>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 18>;
+   dmas = <&apbdma 18>;
+   dma-names = "rx-tx";
#address-cells = <1>;
#size-cells = <0>;
clocks = <&tegra_car TEGRA114_CLK_SBC4>;
@@ -275,6 +292,8 @@
reg = <0x7000dc00 0x200>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 27>;
+   dmas = <&apbdma 27>;
+   dma-names = "rx-tx";
#address-cells = <1>;
#size-cells = <0>;
clocks = <&tegra_car TEGRA114_CLK_SBC5>;
@@ -287,6 +306,8 @@
reg = <0x7000de00 0x200>;
interrupts = ;
nvidia,dma-request-selector = <&apbdma 28>;
+   dmas = <&apbdma 28>;
+   dma-names = "rx-tx";
#address-cells = <1>;
#size-cells = <0>;
clocks = <&tegra_car TEGRA114_CLK_SBC6>;
@@

[PATCH 5/9] spi: tegra20-sflash: move to generic dma DT binding

2013-07-23 Thread Richard Zhao

update binding doc.

Signed-off-by: Richard Zhao 
---
 .../devicetree/bindings/spi/nvidia,tegra20-sflash.txt  | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/spi/nvidia,tegra20-sflash.txt b/Documentation/devicetree/bindings/spi/nvidia,tegra20-sflash.txt
index 7b53da5..fa22d1b 100644
--- a/Documentation/devicetree/bindings/spi/nvidia,tegra20-sflash.txt
+++ b/Documentation/devicetree/bindings/spi/nvidia,tegra20-sflash.txt
@@ -4,8 +4,11 @@ Required properties:
 - compatible : should be "nvidia,tegra20-sflash".
 - reg: Should contain SFLASH registers location and length.
 - interrupts: Should contain SFLASH interrupts.
-- nvidia,dma-request-selector : The Tegra DMA controller's phandle and
-  request selector for this SFLASH controller.
+
+Optional properties:
+- dmas : The Tegra DMA controller's phandle and request selector for
+  this SFLASH controller.
+- dma-names : Should be "rx-tx".
 
 Recommended properties:
 - spi-max-frequency: Definition as per
@@ -17,7 +20,8 @@ spi@7000c380 {
compatible = "nvidia,tegra20-sflash";
reg = <0x7000c380 0x80>;
interrupts = <0 39 0x04>;
-   nvidia,dma-request-selector = <&apbdma 16>;
+   dmas = <&apbdma 16>;
+   dma-names = "rx-tx";
spi-max-frequency = <2500>;
#address-cells = <1>;
#size-cells = <0>;
-- 
1.8.1.5


[PATCH 4/9] spi: tegra20-slink: move to generic dma DT binding

2013-07-23 Thread Richard Zhao

- driver: remove use of nvidia,dma-request-selector
  use dma_request_slave_channel to request channel
- if dmas/dma-names are missing, it still supports cpu based transfer
- update binding doc and specify dmas/dma-names properties as optional

Signed-off-by: Richard Zhao 
---
 .../devicetree/bindings/spi/nvidia,tegra20-slink.txt | 10 +++---
 drivers/spi/spi-tegra20-slink.c  | 16 +---
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/spi/nvidia,tegra20-slink.txt b/Documentation/devicetree/bindings/spi/nvidia,tegra20-slink.txt
index eefe15e..ae43bd1 100644
--- a/Documentation/devicetree/bindings/spi/nvidia,tegra20-slink.txt
+++ b/Documentation/devicetree/bindings/spi/nvidia,tegra20-slink.txt
@@ -4,8 +4,11 @@ Required properties:
 - compatible : should be "nvidia,tegra20-slink", "nvidia,tegra30-slink".
 - reg: Should contain SLINK registers location and length.
 - interrupts: Should contain SLINK interrupts.
-- nvidia,dma-request-selector : The Tegra DMA controller's phandle and
-  request selector for this SLINK controller.
+
+Optional properties:
+- dmas : The Tegra DMA controller's phandle and request selector for
+  this SLINK controller.
+- dma-names : Should be "rx-tx".
 
 Recommended properties:
 - spi-max-frequency: Definition as per
@@ -17,7 +20,8 @@ spi@7000d600 {
compatible = "nvidia,tegra20-slink";
reg = <0x7000d600 0x200>;
interrupts = <0 82 0x04>;
-   nvidia,dma-request-selector = <&apbdma 16>;
+   dmas = <&apbdma 16>;
+   dma-names = "rx-tx";
spi-max-frequency = <2500>;
#address-cells = <1>;
#size-cells = <0>;
diff --git a/drivers/spi/spi-tegra20-slink.c b/drivers/spi/spi-tegra20-slink.c
index 80490cc..278fb04 100644
--- a/drivers/spi/spi-tegra20-slink.c
+++ b/drivers/spi/spi-tegra20-slink.c
@@ -170,7 +170,7 @@ struct tegra_slink_data {
void __iomem*base;
phys_addr_t phys;
unsignedirq;
-   int dma_req_sel;
+   booluse_dma;
u32 spi_max_frequency;
u32 cur_speed;
 
@@ -629,11 +629,8 @@ static int tegra_slink_init_dma_param(struct tegra_slink_data *tspi,
dma_addr_t dma_phys;
int ret;
struct dma_slave_config dma_sconfig;
-   dma_cap_mask_t mask;
 
-   dma_cap_zero(mask);
-   dma_cap_set(DMA_SLAVE, mask);
-   dma_chan = dma_request_channel(mask, NULL, NULL);
+   dma_chan = dma_request_slave_channel(tspi->dev, "rx-tx");
if (!dma_chan) {
dev_err(tspi->dev,
"Dma channel is not available, will try later\n");
@@ -648,7 +645,6 @@ static int tegra_slink_init_dma_param(struct tegra_slink_data *tspi,
return -ENOMEM;
}
 
-   dma_sconfig.slave_id = tspi->dma_req_sel;
if (dma_to_memory) {
dma_sconfig.src_addr = tspi->phys + SLINK_RX_FIFO;
dma_sconfig.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
@@ -1034,11 +1030,9 @@ static irqreturn_t tegra_slink_isr(int irq, void *context_data)
 static void tegra_slink_parse_dt(struct tegra_slink_data *tspi)
 {
struct device_node *np = tspi->dev->of_node;
-   u32 of_dma[2];
 
-   if (of_property_read_u32_array(np, "nvidia,dma-request-selector",
-   of_dma, 2) >= 0)
-   tspi->dma_req_sel = of_dma[1];
+   if (of_find_property(np, "dmas", NULL))
+   tspi->use_dma = true;
 
if (of_property_read_u32(np, "spi-max-frequency",
&tspi->spi_max_frequency))
@@ -1132,7 +1126,7 @@ static int tegra_slink_probe(struct platform_device *pdev)
tspi->max_buf_size = SLINK_FIFO_DEPTH << 2;
tspi->dma_buf_size = DEFAULT_SPI_DMA_BUF_LEN;
 
-   if (tspi->dma_req_sel) {
+   if (tspi->use_dma) {
ret = tegra_slink_init_dma_param(tspi, true);
if (ret < 0) {
dev_err(&pdev->dev, "RxDma Init failed, err %d\n", ret);
-- 
1.8.1.5


[PATCH 2/9] dma: tegra20-apbdma: move to generic device tree bindings

2013-07-23 Thread Richard Zhao

Update tegra20-apbdma driver to adopt generic DMA device tree bindings.
It calls of_dma_controller_register() with of_dma_simple_xlate to get
the generic DMA device tree helper support. The #dma-cells for apbdma
must be 1, which is slave ID.

The existing nvidia,dma-request-selector still works there, and the
support will be removed after all clients get converted to generic DMA
device tree helper.

Signed-off-by: Richard Zhao 
---
 drivers/dma/tegra20-apb-dma.c | 46 +--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
index f137914..0e12f78 100644
--- a/drivers/dma/tegra20-apb-dma.c
+++ b/drivers/dma/tegra20-apb-dma.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -199,6 +200,7 @@ struct tegra_dma_channel {
void*callback_param;
 
/* Channel-slave specific configuration */
+   int slave_id;
struct dma_slave_config dma_sconfig;
struct tegra_dma_channel_regs   channel_reg;
 };
@@ -219,6 +221,8 @@ struct tegra_dma {
struct tegra_dma_channel channels[0];
 };
 
+static struct platform_driver tegra_dmac_driver;
+
 static inline void tdma_write(struct tegra_dma *tdma, u32 reg, u32 val)
 {
writel(val, tdma->base_addr + reg);
@@ -339,6 +343,14 @@ static int tegra_dma_slave_config(struct dma_chan *dc,
}
 
memcpy(&tdc->dma_sconfig, sconfig, sizeof(*sconfig));
+
+   /* If we didn't get slave_id from DT when request channel, use the one
+* passed here.
+* It makes compatible with legacy nvidia,dma-request-selector.
+*/
+   if (tdc->slave_id == -EINVAL)
+   tdc->slave_id = sconfig->slave_id;
+
tdc->config_init = true;
return 0;
 }
@@ -943,7 +955,7 @@ static struct dma_async_tx_descriptor *tegra_dma_prep_slave_sg(
ahb_seq |= TEGRA_APBDMA_AHBSEQ_BUS_WIDTH_32;
 
csr |= TEGRA_APBDMA_CSR_ONCE | TEGRA_APBDMA_CSR_FLOW;
-   csr |= tdc->dma_sconfig.slave_id << TEGRA_APBDMA_CSR_REQ_SEL_SHIFT;
+   csr |= tdc->slave_id << TEGRA_APBDMA_CSR_REQ_SEL_SHIFT;
if (flags & DMA_PREP_INTERRUPT)
csr |= TEGRA_APBDMA_CSR_IE_EOC;
 
@@ -1087,7 +1099,7 @@ struct dma_async_tx_descriptor *tegra_dma_prep_dma_cyclic(
csr |= TEGRA_APBDMA_CSR_FLOW;
if (flags & DMA_PREP_INTERRUPT)
csr |= TEGRA_APBDMA_CSR_IE_EOC;
-   csr |= tdc->dma_sconfig.slave_id << TEGRA_APBDMA_CSR_REQ_SEL_SHIFT;
+   csr |= tdc->slave_id << TEGRA_APBDMA_CSR_REQ_SEL_SHIFT;
 
apb_seq |= TEGRA_APBDMA_APBSEQ_WRAP_WORD_1;
 
@@ -1209,6 +1221,23 @@ static void tegra_dma_free_chan_resources(struct dma_chan *dc)
clk_disable_unprepare(tdma->dma_clk);
 }
 
+static bool tegra_dma_filter_fn(struct dma_chan *dc, void *param)
+{
+   if (dc->device->dev->driver == &tegra_dmac_driver.driver) {
+   struct tegra_dma_channel *tdc = to_tegra_dma_chan(dc);
+   unsigned req = *(unsigned *)param;
+
+   tdc->slave_id = req;
+
+   return true;
+   }
+   return false;
+}
+
+static struct of_dma_filter_info tegra_dma_info = {
+   .filter_fn = tegra_dma_filter_fn,
+};
+
 /* Tegra20 specific DMA controller information */
 static const struct tegra_dma_chip_data tegra20_dma_chip_data = {
.nr_channels= 16,
@@ -1345,6 +1374,7 @@ static int tegra_dma_probe(struct platform_device *pdev)
&tdma->dma_dev.channels);
tdc->tdma = tdma;
tdc->id = i;
+   tdc->slave_id = -EINVAL;
 
tasklet_init(&tdc->tasklet, tegra_dma_tasklet,
(unsigned long)tdc);
@@ -1378,10 +1408,21 @@ static int tegra_dma_probe(struct platform_device *pdev)
goto err_irq;
}
 
+   ret = of_dma_controller_register(pdev->dev.of_node,
+   of_dma_simple_xlate, &tegra_dma_info);
+   if (ret) {
+   dev_err(&pdev->dev,
+   "Tegra20 APB DMA controller registration failed %d\n",
+   ret);
+   goto err_of_dma;
+   }
+
dev_info(&pdev->dev, "Tegra20 APB DMA driver register %d channels\n",
cdata->nr_channels);
return 0;
 
+err_of_dma:
+   dma_async_device_unregister(&tdma->dma_dev);
 err_irq:
while (--i >= 0) {
struct tegra_dma_channel *tdc = &tdma->channels[i];
@@ -1401,6 +1442,7 @@ static int tegra_dma_remove(struct platform_device *pdev)
int i;
struct tegra_dma_channel *tdc;
 
+   of_dma_controller_free(pdev->dev.of_node);
dma_async_device_unregister(&tdma->dma_dev);
 
for (i = 0; i < tdma->chip_data->nr_channels; ++i) {
-- 
1.8.1.5
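
The slave ID handling in this patch can be illustrated outside the kernel. The sketch below only models the precedence described in the commit message: a request selector taken from the device tree wins, and the value passed through the slave config is used only while the channel still holds -EINVAL. The function names are hypothetical, and the shift value is an assumption standing in for TEGRA_APBDMA_CSR_REQ_SEL_SHIFT.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of the request-select field; illustrative only. */
#define CSR_REQ_SEL_SHIFT 16

/*
 * Model of the precedence: the channel starts at -EINVAL, the DT cell
 * (if any) is recorded when the channel is requested, and the legacy
 * slave_config() value only fills the gap when no DT cell was seen.
 */
static int resolve_slave_id(bool have_dt_cell, unsigned int dt_req,
                            int legacy_id)
{
    int slave_id = -EINVAL; /* probe-time default, as in the patch */

    if (have_dt_cell)
        slave_id = (int)dt_req; /* generic binding path */
    if (slave_id == -EINVAL)
        slave_id = legacy_id;   /* legacy nvidia,dma-request-selector path */
    return slave_id;
}

/* Place the resolved selector into the CSR request-select field. */
static uint32_t csr_req_sel(int slave_id)
{
    assert(slave_id >= 0);
    return (uint32_t)slave_id << CSR_REQ_SEL_SHIFT;
}
```

Using -EINVAL as the "not yet set" marker keeps selector 0 usable as a valid request ID.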


[PATCH 6/9] serial: tegra: move to generic dma DT binding

2013-07-23 Thread Richard Zhao

- driver: remove use of nvidia,dma-request-selector
  use dma_request_slave_channel to request channel
- update binding doc

Signed-off-by: Richard Zhao 
---
 .../devicetree/bindings/serial/nvidia,tegra20-hsuart.txt |  8 +---
 drivers/tty/serial/serial-tegra.c| 16 +---
 2 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/Documentation/devicetree/bindings/serial/nvidia,tegra20-hsuart.txt b/Documentation/devicetree/bindings/serial/nvidia,tegra20-hsuart.txt
index 392a449..1ed2f48 100644
--- a/Documentation/devicetree/bindings/serial/nvidia,tegra20-hsuart.txt
+++ b/Documentation/devicetree/bindings/serial/nvidia,tegra20-hsuart.txt
@@ -4,8 +4,9 @@ Required properties:
 - compatible : should be "nvidia,tegra30-hsuart", "nvidia,tegra20-hsuart".
 - reg: Should contain UART controller registers location and length.
 - interrupts: Should contain UART controller interrupts.
-- nvidia,dma-request-selector : The Tegra DMA controller's phandle and
-  request selector for this UART controller.
+- dmas : The Tegra DMA controller's phandle and request selector for
+  this UART controller.
+- dma-names : Should be "rx-tx";
 
 Optional properties:
 - nvidia,enable-modem-interrupt: Enable modem interrupts. Should be enable
@@ -18,7 +19,8 @@ serial@70006000 {
reg = <0x70006000 0x40>;
reg-shift = <2>;
interrupts = <0 36 0x04>;
-   nvidia,dma-request-selector = < 8>;
+   dmas = < 8>;
+   dma-names = "rx-tx";
nvidia,enable-modem-interrupt;
status = "disabled";
 };
diff --git a/drivers/tty/serial/serial-tegra.c b/drivers/tty/serial/serial-tegra.c
index ee7c812..c8a7828 100644
--- a/drivers/tty/serial/serial-tegra.c
+++ b/drivers/tty/serial/serial-tegra.c
@@ -120,7 +120,6 @@ struct tegra_uart_port {
bool rx_timeout;
int rx_in_progress;
int symb_bit;
-   int dma_req_sel;
 
struct dma_chan *rx_dma_chan;
struct dma_chan *tx_dma_chan;
@@ -902,11 +901,8 @@ static int tegra_uart_dma_channel_allocate(struct tegra_uart_port *tup,
dma_addr_t dma_phys;
int ret;
struct dma_slave_config dma_sconfig;
-   dma_cap_mask_t mask;
 
-   dma_cap_zero(mask);
-   dma_cap_set(DMA_SLAVE, mask);
-   dma_chan = dma_request_channel(mask, NULL, NULL);
+   dma_chan = dma_request_slave_channel(tup->uport.dev, "rx-tx");
if (!dma_chan) {
dev_err(tup->uport.dev,
"Dma channel is not available, will try later\n");
@@ -930,7 +926,6 @@ static int tegra_uart_dma_channel_allocate(struct tegra_uart_port *tup,
dma_buf = tup->uport.state->xmit.buf;
}
 
-   dma_sconfig.slave_id = tup->dma_req_sel;
if (dma_to_memory) {
dma_sconfig.src_addr = tup->uport.mapbase;
dma_sconfig.src_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
@@ -1214,17 +1209,8 @@ static int tegra_uart_parse_dt(struct platform_device *pdev,
struct tegra_uart_port *tup)
 {
struct device_node *np = pdev->dev.of_node;
-   u32 of_dma[2];
int port;
 
-   if (of_property_read_u32_array(np, "nvidia,dma-request-selector",
-   of_dma, 2) >= 0) {
-   tup->dma_req_sel = of_dma[1];
-   } else {
-   dev_err(>dev, "missing dma requestor in device tree\n");
-   return -EINVAL;
-   }
-
port = of_alias_get_id(np, "serial");
if (port < 0) {
dev_err(>dev, "failed to get alias id, errno %d\n", port);
-- 
1.8.1.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 7/9] ASoC: tegra: move to generic DMA DT binding

2013-07-23 Thread Richard Zhao

- add tegra_dma_filter_data to specify DMA info
  The DMA DT binding needs the device that raises the DMA request and the
  DMA name in order to request a DMA channel. tegra30_i2s is a special case:
  the requesting device should be the ahub device, and its DMA name cannot
  be handled by the ASoC dmaengine code. So we pass the info using filter
  data in snd_dmaengine_dai_dma_data.
- change i2s/ac97 drivers to use the generic DT binding
- tegra30_i2s: allocate ahub tx/rx FIFO at driver probe time

Signed-off-by: Richard Zhao 
---
 .../bindings/sound/nvidia,tegra20-ac97.txt |  8 ++--
 .../bindings/sound/nvidia,tegra20-i2s.txt  |  8 ++--
 .../bindings/sound/nvidia,tegra30-ahub.txt | 12 +++---
 sound/soc/tegra/tegra20_ac97.c | 17 +++--
 sound/soc/tegra/tegra20_ac97.h |  2 +
 sound/soc/tegra/tegra20_i2s.c  | 26 -
 sound/soc/tegra/tegra20_i2s.h  |  2 +
 sound/soc/tegra/tegra30_ahub.c | 25 +---
 sound/soc/tegra/tegra30_ahub.h | 11 +++---
 sound/soc/tegra/tegra30_i2s.c  | 44 +++---
 sound/soc/tegra/tegra30_i2s.h  |  2 +
 sound/soc/tegra/tegra_pcm.c| 14 +++
 sound/soc/tegra/tegra_pcm.h| 13 +++
 13 files changed, 107 insertions(+), 77 deletions(-)

diff --git a/Documentation/devicetree/bindings/sound/nvidia,tegra20-ac97.txt b/Documentation/devicetree/bindings/sound/nvidia,tegra20-ac97.txt
index c145497..972f444 100644
--- a/Documentation/devicetree/bindings/sound/nvidia,tegra20-ac97.txt
+++ b/Documentation/devicetree/bindings/sound/nvidia,tegra20-ac97.txt
@@ -4,8 +4,9 @@ Required properties:
 - compatible : "nvidia,tegra20-ac97"
 - reg : Should contain AC97 controller registers location and length
 - interrupts : Should contain AC97 interrupt
-- nvidia,dma-request-selector : The Tegra DMA controller's phandle and
-  request selector for the AC97 controller
+- dmas : The Tegra DMA controller's phandle and request selector for
+  the AC97 controller
+- dma-names : Should be "rx-tx"
 - nvidia,codec-reset-gpio : The Tegra GPIO controller's phandle and the number
   of the GPIO used to reset the external AC97 codec
 - nvidia,codec-sync-gpio : The Tegra GPIO controller's phandle and the number
@@ -16,7 +17,8 @@ ac97@70002000 {
compatible = "nvidia,tegra20-ac97";
reg = <0x70002000 0x200>;
interrupts = <0 81 0x04>;
-   nvidia,dma-request-selector = < 12>;
+   dmas = < 12>;
+   dma-names = "rx-tx";
nvidia,codec-reset-gpio = < 170 0>;
nvidia,codec-sync-gpio = < 120 0>;
 };
diff --git a/Documentation/devicetree/bindings/sound/nvidia,tegra20-i2s.txt b/Documentation/devicetree/bindings/sound/nvidia,tegra20-i2s.txt
index 0df2b5c..61a6c8d 100644
--- a/Documentation/devicetree/bindings/sound/nvidia,tegra20-i2s.txt
+++ b/Documentation/devicetree/bindings/sound/nvidia,tegra20-i2s.txt
@@ -4,8 +4,9 @@ Required properties:
 - compatible : "nvidia,tegra20-i2s"
 - reg : Should contain I2S registers location and length
 - interrupts : Should contain I2S interrupt
-- nvidia,dma-request-selector : The Tegra DMA controller's phandle and
-  request selector for this I2S controller
+- dmas : The Tegra DMA controller's phandle and request selector
+  for this I2S controller
+- dma-names : Should be "rx-tx"
 
 Example:
 
@@ -13,5 +14,6 @@ i2s@70002800 {
compatible = "nvidia,tegra20-i2s";
reg = <0x70002800 0x200>;
interrupts = < 45 >;
-   nvidia,dma-request-selector = <  2 >;
+   dmas = <  2 >;
+   dma-names = "rx-tx";
 };
diff --git a/Documentation/devicetree/bindings/sound/nvidia,tegra30-ahub.txt b/Documentation/devicetree/bindings/sound/nvidia,tegra30-ahub.txt
index 0e5c12c..0ac563b 100644
--- a/Documentation/devicetree/bindings/sound/nvidia,tegra30-ahub.txt
+++ b/Documentation/devicetree/bindings/sound/nvidia,tegra30-ahub.txt
@@ -7,11 +7,10 @@ Required properties:
   - Tegra30 requires 2 entries, for the APBIF and AHUB/AUDIO register blocks.
   - Tegra114 requires an additional entry, for the APBIF2 register block.
 - interrupts : Should contain AHUB interrupt
-- nvidia,dma-request-selector : A list of the DMA channel specifiers. Each
-  entry contains the Tegra DMA controller's phandle and request selector.
-  If a single entry is present, the request selectors for the channels are
-  assumed to be contiguous, and increment from this value.
-  If multiple values are given, one value must be given per channel.
+- dmas : A list of the DMA channel specifiers. Each entry contains the Tegra
+  DMA controller's phandle and request selector.
+- dma-names : Should be a list of "channelx", in which x is 0, 1, 2, ...
+  One entry is required for each RX/TX FIFO pair that exists in hardware.
 - clocks : Must contain an entry for each required entry in clock-names.
 - clock-names : Must include the following entries:
   - Tegra30:

[PATCH 8/9] ARM: dts: tegra: remove legacy nvidia,dma-request-selector properties

2013-07-23 Thread Richard Zhao

All tegra dma client drivers have moved to generic dma binding.

Signed-off-by: Richard Zhao 
---
 arch/arm/boot/dts/tegra114.dtsi | 14 --
 arch/arm/boot/dts/tegra20.dtsi  | 13 -
 arch/arm/boot/dts/tegra30.dtsi  | 12 
 3 files changed, 39 deletions(-)

diff --git a/arch/arm/boot/dts/tegra114.dtsi b/arch/arm/boot/dts/tegra114.dtsi
index b133c62..e696cbce 100644
--- a/arch/arm/boot/dts/tegra114.dtsi
+++ b/arch/arm/boot/dts/tegra114.dtsi
@@ -125,7 +125,6 @@
reg = <0x70006000 0x40>;
reg-shift = <2>;
interrupts = ;
-   nvidia,dma-request-selector = < 8>;
dmas = < 8>;
dma-names = "rx-tx";
status = "disabled";
@@ -137,7 +136,6 @@
reg = <0x70006040 0x40>;
reg-shift = <2>;
interrupts = ;
-   nvidia,dma-request-selector = < 9>;
dmas = < 9>;
dma-names = "rx-tx";
status = "disabled";
@@ -149,7 +147,6 @@
reg = <0x70006200 0x100>;
reg-shift = <2>;
interrupts = ;
-   nvidia,dma-request-selector = < 10>;
dmas = < 10>;
dma-names = "rx-tx";
status = "disabled";
@@ -161,7 +158,6 @@
reg = <0x70006300 0x100>;
reg-shift = <2>;
interrupts = ;
-   nvidia,dma-request-selector = < 19>;
dmas = < 19>;
dma-names = "rx-tx";
status = "disabled";
@@ -235,7 +231,6 @@
compatible = "nvidia,tegra114-spi";
reg = <0x7000d400 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 15>;
dmas = < 15>;
dma-names = "rx-tx";
#address-cells = <1>;
@@ -249,7 +244,6 @@
compatible = "nvidia,tegra114-spi";
reg = <0x7000d600 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 16>;
dmas = < 16>;
dma-names = "rx-tx";
#address-cells = <1>;
@@ -263,7 +257,6 @@
compatible = "nvidia,tegra114-spi";
reg = <0x7000d800 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 17>;
dmas = < 17>;
dma-names = "rx-tx";
#address-cells = <1>;
@@ -277,7 +270,6 @@
compatible = "nvidia,tegra114-spi";
reg = <0x7000da00 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 18>;
dmas = < 18>;
dma-names = "rx-tx";
#address-cells = <1>;
@@ -291,7 +283,6 @@
compatible = "nvidia,tegra114-spi";
reg = <0x7000dc00 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 27>;
dmas = < 27>;
dma-names = "rx-tx";
#address-cells = <1>;
@@ -305,7 +296,6 @@
compatible = "nvidia,tegra114-spi";
reg = <0x7000de00 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 28>;
dmas = < 28>;
dma-names = "rx-tx";
#address-cells = <1>;
@@ -354,10 +344,6 @@
  <0x70080200 0x100>,
  <0x70081000 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 1>, < 2>,
-   < 3>, < 4>, < 6>, < 7>,
-   < 12>, < 13>, < 14>,
-   < 29>;
dmas = < 1>, < 2>, < 3>, < 4>,
< 6>, < 7>, < 12>, < 13>,
< 14>, < 29>;
diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 0fe7f37..7ca0ccb 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -223,7 +223,6 @@
compatible = "nvidia,tegra20-ac97";
reg = <0x70002000 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 12>;
dmas = < 12>;
dma-names = "rx-tx";
clocks = <_car TEGRA20_CLK_AC97>;
@@ -234,7 +233,6 @@
compatible = "nvidia,tegra20-i2s";
reg = <0x70002800 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 2>;
dmas = < 2>;
dma-names = "rx-tx";
clocks = <_car TEGRA20_CLK_I2S1>;
@@ -245,7 +243,6 @@
compatible = "nvidia,tegra20-i2s";
reg = <0x70002a00 0x200>;
interrupts = ;
-   nvidia,dma-request-selector = < 1>;
dmas = < 1>;
dma-names = "rx-tx";
clocks = <_car TEGRA20_CLK_I2S2>;
@@

[PATCH 9/9] dma: tegra20-apbdma: remove legacy nvidia,dma-request-selector support

2013-07-23 Thread Richard Zhao

All tegra dma client drivers have moved to generic dma binding.

Signed-off-by: Richard Zhao 
---
 drivers/dma/tegra20-apb-dma.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
index 0e12f78..a67d159 100644
--- a/drivers/dma/tegra20-apb-dma.c
+++ b/drivers/dma/tegra20-apb-dma.c
@@ -344,13 +344,6 @@ static int tegra_dma_slave_config(struct dma_chan *dc,
 
memcpy(>dma_sconfig, sconfig, sizeof(*sconfig));
 
-   /* If we didn't get slave_id from DT when request channel, use the one
-* passed here.
-* It makes compatible with legacy nvidia,dma-request-selector.
-*/
-   if (tdc->slave_id == -EINVAL)
-   tdc->slave_id = sconfig->slave_id;
-
tdc->config_init = true;
return 0;
 }
@@ -1374,7 +1367,6 @@ static int tegra_dma_probe(struct platform_device *pdev)
>dma_dev.channels);
tdc->tdma = tdma;
tdc->id = i;
-   tdc->slave_id = -EINVAL;
 
tasklet_init(>tasklet, tegra_dma_tasklet,
(unsigned long)tdc);
-- 
1.8.1.5


[PATCH 0/9] ARM: tegra: move to generic DMA DT binding

2013-07-23 Thread Richard Zhao

This patch series aims to move apbdma to the generic DMA DT binding.

Changes:
- Add dmas/dma-names properties while leaving
  nvidia,dma-request-selector in place for compatibility.
- Update the apbdma driver and the DMA client drivers.
- ASoC tegra adds a new struct to pass the device and dma-name.
- Finally, remove the legacy nvidia,dma-request-selector.

Richard Zhao (9):
  ARM: dts: add generic DMA DT binding for tegra apbdma
  dma: tegra20-apbdma: move to generic device tree bindings
  spi: tegra114: move to generic dma DT binding
  spi: tegra20-slink: move to generic dma DT binding
  spi: tegra20-sflash: move to generic dma DT binding
  serial: tegra: move to generic dma DT binding
  ASoC: tegra: move to generic DMA DT binding
  ARM: dts: tegra: remove legacy nvidia,dma-request-selector properties
  dma: tegra20-apbdma: remove legacy nvidia,dma-request-selector support

 .../devicetree/bindings/dma/tegra20-apbdma.txt |  1 +
 .../bindings/serial/nvidia,tegra20-hsuart.txt  |  8 ++--
 .../bindings/sound/nvidia,tegra20-ac97.txt |  8 ++--
 .../bindings/sound/nvidia,tegra20-i2s.txt  |  8 ++--
 .../bindings/sound/nvidia,tegra30-ahub.txt | 12 +++---
 .../bindings/spi/nvidia,tegra114-spi.txt   | 10 +++--
 .../bindings/spi/nvidia,tegra20-sflash.txt | 10 +++--
 .../bindings/spi/nvidia,tegra20-slink.txt  | 10 +++--
 arch/arm/boot/dts/tegra114.dtsi| 41 +---
 arch/arm/boot/dts/tegra20.dtsi | 40 +---
 arch/arm/boot/dts/tegra30.dtsi | 37 --
 drivers/dma/tegra20-apb-dma.c  | 38 ++-
 drivers/spi/spi-tegra114.c | 16 +++-
 drivers/spi/spi-tegra20-slink.c| 16 +++-
 drivers/tty/serial/serial-tegra.c  | 16 +---
 sound/soc/tegra/tegra20_ac97.c | 17 +++--
 sound/soc/tegra/tegra20_ac97.h |  2 +
 sound/soc/tegra/tegra20_i2s.c  | 26 -
 sound/soc/tegra/tegra20_i2s.h  |  2 +
 sound/soc/tegra/tegra30_ahub.c | 25 +---
 sound/soc/tegra/tegra30_ahub.h | 11 +++---
 sound/soc/tegra/tegra30_i2s.c  | 44 +++---
 sound/soc/tegra/tegra30_i2s.h  |  2 +
 sound/soc/tegra/tegra_pcm.c| 14 +++
 sound/soc/tegra/tegra_pcm.h| 13 +++
 25 files changed, 260 insertions(+), 167 deletions(-)

-- 
1.8.1.5


[tip:perf/core] sched: Implement smarter wake-affine logic

2013-07-23 Thread tip-bot for Michael Wang

Commit-ID:  62470419e993f8d9d93db0effd3af4296ecb79a5
Gitweb: http://git.kernel.org/tip/62470419e993f8d9d93db0effd3af4296ecb79a5
Author: Michael Wang 
AuthorDate: Thu, 4 Jul 2013 12:55:51 +0800
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 12:18:41 +0200

sched: Implement smarter wake-affine logic

The wake-affine scheduler feature is currently always trying to pull
the wakee close to the waker. In theory this should be beneficial if
the waker's CPU caches hot data for the wakee, and it's also beneficial
in the extreme ping-pong high context switch rate case.

Testing shows it can benefit hackbench up to 15%.

However, the feature is somewhat blind, from which some workloads
such as pgbench suffer. It's also time-consuming algorithmically.

Testing shows it can damage pgbench up to 50% - far more than the
benefit it brings in the best case.

So wake-affine should be smarter and it should realize when to
stop its thankless effort at trying to find a suitable CPU to wake on.

This patch introduces 'wakee_flips', which will be increased each
time the task flips (switches) its wakee target.

So a high 'wakee_flips' value means the task has more than one
wakee, and the bigger the number, the higher the wakeup frequency.
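
As a rough illustration of the counting described above, here is a minimal
user-space sketch. The struct and function names are hypothetical stand-ins,
not the kernel's, and the periodic decay of the counter is deliberately left
out:

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for task_struct: only the two fields
 * the flip counter needs. */
struct task_sketch {
    struct task_sketch *last_wakee;  /* task this one woke most recently */
    unsigned long wakee_flips;       /* how often the wakee target changed */
};

/* Call once for every wakeup that 'waker' issues for 'wakee'.
 * The counter only grows when the target differs from the previous one. */
static void record_wakee_sketch(struct task_sketch *waker,
                                struct task_sketch *wakee)
{
    if (waker->last_wakee != wakee) {
        waker->last_wakee = wakee;
        waker->wakee_flips++;
    }
}
```

A task that keeps waking the same partner stays at a low count, while a task
that alternates between several wakees accumulates flips quickly.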

Now, when deciding whether or not to pull, pay attention to a wakee
with a high 'wakee_flips': pulling such a task may benefit the wakee,
but it also implies that the waker will face cruel competition later.
How cruel, and how soon, depends on the story behind 'wakee_flips';
either way, the waker suffers.

Furthermore, if waker also has a high 'wakee_flips', that implies that
multiple tasks rely on it, then waker's higher latency will damage all
of them, so pulling wakee seems to be a bad deal.

Thus, when 'waker->wakee_flips / wakee->wakee_flips' becomes
higher and higher, the cost of pulling seems to be worse and worse.

The patch therefore helps the wake-affine feature to stop its pulling
work when:

wakee->wakee_flips > factor &&
waker->wakee_flips > (factor * wakee->wakee_flips)

The 'factor' here is the number of CPUs in the current CPU's NUMA node,
so a bigger node leads to more pulling, since the condition for stopping
becomes harder to meet.
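
The stop condition above can be sketched as a small predicate. This is only
an illustration under stated assumptions: the struct and function names are
hypothetical, and 'factor' is passed in by the caller rather than derived
from the NUMA topology:

```c
#include <stdbool.h>

/* Hypothetical stand-in for task_struct; only the field used here. */
struct flip_task_sketch {
    unsigned long wakee_flips;
};

/* Returns true when the wake-affine pull should be skipped, that is when
 * the wakee itself flips a lot and the waker flips 'factor' times more. */
static bool wake_wide_sketch(const struct flip_task_sketch *waker,
                             const struct flip_task_sketch *wakee,
                             unsigned long factor)
{
    return wakee->wakee_flips > factor &&
           waker->wakee_flips > factor * wakee->wakee_flips;
}
```

With factor = 12, a wakee at 13 flips and a waker at 200 flips would skip the
pull (200 > 156), while the same wakee with a waker at 100 flips would not.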

After applying the patch, pgbench shows up to 40% improvement and no
regressions.

Tested on a 12-CPU x86 server with tip 3.10.0-rc7.

The percentages in the final column highlight the areas with the biggest wins;
all other areas improved as well:

pgbench               base        smart

| db_size | clients |  tps  |   |  tps  |
+---------+---------+-------+   +-------+
| 22 MB   |       1 | 10598 |   | 10796 |
| 22 MB   |       2 | 21257 |   | 21336 |
| 22 MB   |       4 | 41386 |   | 41622 |
| 22 MB   |       8 | 51253 |   | 57932 |
| 22 MB   |      12 | 48570 |   | 54000 |
| 22 MB   |      16 | 46748 |   | 55982 | +19.75%
| 22 MB   |      24 | 44346 |   | 55847 | +25.93%
| 22 MB   |      32 | 43460 |   | 54614 | +25.66%
| 7484 MB |       1 |  8951 |   |  9193 |
| 7484 MB |       2 | 19233 |   | 19240 |
| 7484 MB |       4 | 37239 |   | 37302 |
| 7484 MB |       8 | 46087 |   | 50018 |
| 7484 MB |      12 | 42054 |   | 48763 |
| 7484 MB |      16 | 40765 |   | 51633 | +26.66%
| 7484 MB |      24 | 37651 |   | 52377 | +39.11%
| 7484 MB |      32 | 37056 |   | 51108 | +37.92%
| 15 GB   |       1 |  8845 |   |  9104 |
| 15 GB   |       2 | 19094 |   | 19162 |
| 15 GB   |       4 | 36979 |   | 36983 |
| 15 GB   |       8 | 46087 |   | 49977 |
| 15 GB   |      12 | 41901 |   | 48591 |
| 15 GB   |      16 | 40147 |   | 50651 | +26.16%
| 15 GB   |      24 | 37250 |   | 52365 | +40.58%
| 15 GB   |      32 | 36470 |   | 50015 | +37.14%

Signed-off-by: Michael Wang 
Cc: Mike Galbraith 
Signed-off-by: Peter Zijlstra 
Link: http://lkml.kernel.org/r/51d50057.9000...@linux.vnet.ibm.com
[ Improved the changelog. ]
Signed-off-by: Ingo Molnar 
---
 include/linux/sched.h |  3 +++
 kernel/sched/fair.c   | 47 +++
 2 files changed, 50 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 50d04b9..4f163a8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1034,6 +1034,9 @@ struct task_struct {
 #ifdef CONFIG_SMP
struct llist_node wake_entry;
int on_cpu;
+   struct task_struct *last_wakee;
+   unsigned long wakee_flips;
+   unsigned long wakee_flip_decay_ts;
 #endif
int on_rq;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 765d87a..860063a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3017,6 +3017,23 @@ static unsigned long cpu_avg_load_per_task(int cpu)
return 0;
 }
 
+static void record_wakee(struct task_struct *p)
+{
+   /*
+* Rough decay (wiping) for cost

[PATCH] ARM: shmobile: r8a7740: add PMU information to r8a7740.dtsi

2013-07-23 Thread Magnus Damm

From: Magnus Damm 

Add PMU information to r8a7740.dtsi. With this included,
the Armadillo800eva DT reference may use the PMU.

Signed-off-by: Magnus Damm 
---

 Dry coded based on data sheet, not runtime tested.

 arch/arm/boot/dts/r8a7740.dtsi |5 +
 1 file changed, 5 insertions(+)

--- 0001/arch/arm/boot/dts/r8a7740.dtsi
+++ work/arch/arm/boot/dts/r8a7740.dtsi 2013-07-24 04:20:42.0 +0900
@@ -32,6 +32,11 @@
  <0xc200 0x1000>;
};
 
+   pmu {
+   compatible = "arm,cortex-a9-pmu";
+   interrupts = <0 83 4>;
+   };
+
/* irqpin0: IRQ0 - IRQ7 */
irqpin0: irqpin@e690 {
compatible = "renesas,intc-irqpin";

[tip:perf/core] perf: Fix broken union in 'struct perf_event_mmap_page'

2013-07-23 Thread tip-bot for Adrian Hunter

Commit-ID:  860f085b74e9f0075de8140ed3a1e5b5e3e39aa8
Gitweb: http://git.kernel.org/tip/860f085b74e9f0075de8140ed3a1e5b5e3e39aa8
Author: Adrian Hunter 
AuthorDate: Fri, 28 Jun 2013 16:22:17 +0300
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 12:17:10 +0200

perf: Fix broken union in 'struct perf_event_mmap_page'

The capabilities bits must not be "union'ed" together.
Put them in a separate struct.
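
To illustrate why the original layout was broken, here is a small user-space
sketch. The union and function names are hypothetical, it uses uint64_t in
place of __u64, and it assumes a compiler such as GCC or Clang that accepts
64-bit bit-fields and C11 anonymous structs:

```c
#include <stdint.h>

/* Layout before the fix: each bit-field is its own union member, so all
 * three start at the same storage position and overlap one another. */
union caps_broken {
    uint64_t capabilities;
    uint64_t cap_usr_time  : 1,
             cap_usr_rdpmc : 1,
             cap_res       : 62;
};

/* Layout after the fix: the bit-fields sit side by side inside one
 * anonymous struct, and the struct as a whole overlays 'capabilities'. */
union caps_fixed {
    uint64_t capabilities;
    struct {
        uint64_t cap_usr_time  : 1,
                 cap_usr_rdpmc : 1,
                 cap_res       : 62;
    };
};

/* Set only cap_usr_rdpmc, then report what cap_usr_time reads back as. */
static int usr_time_after_rdpmc_broken(void)
{
    union caps_broken c = { .capabilities = 0 };
    c.cap_usr_rdpmc = 1;
    return (int)c.cap_usr_time;   /* shares its bit with cap_usr_rdpmc */
}

static int usr_time_after_rdpmc_fixed(void)
{
    union caps_fixed c = { .capabilities = 0 };
    c.cap_usr_rdpmc = 1;
    return (int)c.cap_usr_time;   /* separate bit, untouched by the store */
}
```

In the broken layout, setting one capability bit makes the other read as set
too; in the fixed layout the two bits are independent.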

Signed-off-by: Adrian Hunter 
Signed-off-by: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1372425741-1676-2-git-send-email-adrian.hun...@intel.com
Signed-off-by: Ingo Molnar 
---
 include/uapi/linux/perf_event.h | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 00d8274..0041aed 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -375,9 +375,11 @@ struct perf_event_mmap_page {
__u64   time_running;   /* time event on cpu */
union {
__u64   capabilities;
-   __u64   cap_usr_time  : 1,
-   cap_usr_rdpmc : 1,
-   cap_res   : 62;
+   struct {
+   __u64   cap_usr_time: 1,
+   cap_usr_rdpmc   : 1,
+   cap_res : 62;
+   };
};
 
/*

[tip:x86/apic] x86/acpi: Fix incorrect sanity check in acpi_register_lapic()

2013-07-23 Thread tip-bot for Tang Chen

Commit-ID:  82982d729319e975115d88cae4927dffb02bfea7
Gitweb: http://git.kernel.org/tip/82982d729319e975115d88cae4927dffb02bfea7
Author: Tang Chen 
AuthorDate: Tue, 23 Jul 2013 16:00:19 +0800
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 10:08:16 +0200

x86/acpi: Fix incorrect sanity check in acpi_register_lapic()

We wanted to check if the APIC ID is out of range. It should be:

if (id >= MAX_LOCAL_APIC)

There's no known bad effect of this bug.
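
The off-by-one can be seen in a tiny sketch. The limit below is a
hypothetical value chosen for illustration (the kernel's MAX_LOCAL_APIC
depends on the configuration), and the helper names are made up:

```c
#include <stdbool.h>

/* Hypothetical limit; valid APIC IDs are 0 .. MAX_LOCAL_APIC_SKETCH - 1. */
#define MAX_LOCAL_APIC_SKETCH 256

/* Old check: also skips the last valid ID. */
static bool old_check_skips(int id)
{
    return id >= (MAX_LOCAL_APIC_SKETCH - 1);
}

/* New check: skips only IDs that are really out of range. */
static bool new_check_skips(int id)
{
    return id >= MAX_LOCAL_APIC_SKETCH;
}
```

The two checks differ only for id == MAX_LOCAL_APIC_SKETCH - 1, which the old
check rejected and the new one accepts.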

Signed-off-by: Tang Chen 
Reviewed-by: Len Brown 
Cc: pa...@ucw.cz
Cc: r...@sisk.pl
Link: http://lkml.kernel.org/r/1374566419-21120-1-git-send-email-tangc...@cn.fujitsu.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/acpi/boot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 2627a81..872a2d2 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -199,7 +199,7 @@ static void acpi_register_lapic(int id, u8 enabled)
 {
unsigned int ver = 0;
 
-   if (id >= (MAX_LOCAL_APIC-1)) {
+   if (id >= MAX_LOCAL_APIC) {
printk(KERN_INFO PREFIX "skipped apicid that is too big\n");
return;
}

[tip:perf/core] perf: Update perf_event_type documentation

2013-07-23 Thread tip-bot for Peter Zijlstra

Commit-ID:  a5cdd40c9877e9aba704c020fd65d26b5cfecf18
Gitweb: http://git.kernel.org/tip/a5cdd40c9877e9aba704c020fd65d26b5cfecf18
Author: Peter Zijlstra 
AuthorDate: Tue, 16 Jul 2013 17:09:07 +0200
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 12:17:08 +0200

perf: Update perf_event_type documentation

Due to a discussion with Adrian I had a good look at the perf_event_type record
layout and found the documentation to be somewhat unclear.

Cc: Adrian Hunter 
Signed-off-by: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20130716150907.gl23...@dyad.programming.kicks-ass.net
Signed-off-by: Ingo Molnar 
---
 include/uapi/linux/perf_event.h | 18 +-
 kernel/events/core.c| 31 ---
 2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0b1df41..00d8274 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -478,6 +478,16 @@ enum perf_event_type {
 * file will be supported by older perf tools, with these new optional
 * fields being ignored.
 *
+* struct sample_id {
+*  { u32   pid, tid; } && PERF_SAMPLE_TID
+*  { u64   time; } && PERF_SAMPLE_TIME
+*  { u64   id;   } && PERF_SAMPLE_ID
+*  { u64   stream_id;} && PERF_SAMPLE_STREAM_ID
+*  { u32   cpu, res; } && PERF_SAMPLE_CPU
+* } && perf_event_attr::sample_id_all
+*/
+
+   /*
 * The MMAP events record the PROT_EXEC mappings so that we can
 * correlate userspace IPs to code. They have the following structure:
 *
@@ -498,6 +508,7 @@ enum perf_event_type {
 *  struct perf_event_header header;
 *  u64 id;
 *  u64 lost;
+*  struct sample_id sample_id;
 * };
 */
PERF_RECORD_LOST= 2,
@@ -508,6 +519,7 @@ enum perf_event_type {
 *
 *  u32 pid, tid;
 *  char comm[];
+*  struct sample_id sample_id;
 * };
 */
PERF_RECORD_COMM= 3,
@@ -518,6 +530,7 @@ enum perf_event_type {
 *  u32 pid, ppid;
 *  u32 tid, ptid;
 *  u64 time;
+*  struct sample_id sample_id;
 * };
 */
PERF_RECORD_EXIT= 4,
@@ -528,6 +541,7 @@ enum perf_event_type {
 *  u64 time;
 *  u64 id;
 *  u64 stream_id;
+*  struct sample_id sample_id;
 * };
 */
PERF_RECORD_THROTTLE= 5,
@@ -539,6 +553,7 @@ enum perf_event_type {
 *  u32 pid, ppid;
 *  u32 tid, ptid;
 *  u64 time;
+*  struct sample_id sample_id;
 * };
 */
PERF_RECORD_FORK= 7,
@@ -549,6 +564,7 @@ enum perf_event_type {
 *  u32 pid, tid;
 *
 *  struct read_format  values;
+*  struct sample_id sample_id;
 * };
 */
PERF_RECORD_READ= 8,
@@ -596,7 +612,7 @@ enum perf_event_type {
 *u64   dyn_size; } && PERF_SAMPLE_STACK_USER
 *
 *  { u64   weight;   } && PERF_SAMPLE_WEIGHT
-*  { u64   data_src; } && PERF_SAMPLE_DATA_SRC
+*  { u64   data_src; } && PERF_SAMPLE_DATA_SRC
 * };
 */
PERF_RECORD_SAMPLE  = 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5e2bce9..1274114 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4462,20 +4462,6 @@ void perf_output_sample(struct perf_output_handle *handle,
}
}
 
-   if (!event->attr.watermark) {
-   int wakeup_events = event->attr.wakeup_events;
-
-   if (wakeup_events) {
-   struct ring_buffer *rb = handle->rb;
-   int events = local_inc_return(>events);
-
-   if (events >= wakeup_events) {
-   local_sub(wakeup_events, >events);
-   local_inc(>wakeup);
-   }
-   }
-   }
-
if

[tip:x86/urgent] x86/iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset

2013-07-23 Thread tip-bot for Neil Horman

Commit-ID:  803075dba31c17af110e1d9a915fe7262165b213
Gitweb: http://git.kernel.org/tip/803075dba31c17af110e1d9a915fe7262165b213
Author: Neil Horman 
AuthorDate: Wed, 17 Jul 2013 07:13:59 -0400
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 11:29:30 +0200

x86/iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset

Recently we added an early quirk to detect 5500/5520 chipsets
with early revisions that had problems with irq draining with
interrupt remapping enabled:

  commit 03bbcb2e7e292838bb0244f5a7816d194c911d62
  Author: Neil Horman 
  Date:   Tue Apr 16 16:38:32 2013 -0400

  iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets

It turns out this same problem is present in the Intel X58
chipset as well. See erratum 69 here:

  http://www.intel.com/content/www/us/en/chipsets/x58-express-specification-update.html

This patch extends the PCI early quirk so that the chip
devices/revisions specified in the above update are also covered
in the same way.

Signed-off-by: Neil Horman 
Reviewed-by: Jan Beulich 
Acked-by: Donald Dutile 
Cc: Joerg Roedel 
Cc: Andrew Cooper 
Cc: Malcolm Crossley 
Cc: Prarit Bhargava 
Cc: Don Zickus 
Cc: sta...@vger.kernel.org
Link: http://lkml.kernel.org/r/1374059639-8631-1-git-send-email-nhor...@tuxdriver.com
[ Small edits. ]
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/early-quirks.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 94ab6b9..63bdb29 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -196,15 +196,23 @@ static void __init ati_bugs_contd(int num, int slot, int func)
 static void __init intel_remapping_check(int num, int slot, int func)
 {
u8 revision;
+   u16 device;
 
+   device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
revision = read_pci_config_byte(num, slot, func, PCI_REVISION_ID);
 
/*
-* Revision 0x13 of this chipset supports irq remapping
-* but has an erratum that breaks its behavior, flag it as such
+* Revision 13 of all triggering devices id in this quirk have
+* a problem draining interrupts when irq remapping is enabled,
+* and should be flagged as broken.  Additionally revisions 0x12
+* and 0x22 of device id 0x3405 has this problem.
 */
if (revision == 0x13)
set_irq_remapping_broken();
+   else if ((device == 0x3405) &&
+   ((revision == 0x12) ||
+(revision == 0x22)))
+   set_irq_remapping_broken();
 
 }
 
@@ -239,6 +247,8 @@ static struct chipset early_qrk[] __initdata = {
  PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, 0, ati_bugs_contd },
{ PCI_VENDOR_ID_INTEL, 0x3403, PCI_CLASS_BRIDGE_HOST,
  PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
+   { PCI_VENDOR_ID_INTEL, 0x3405, PCI_CLASS_BRIDGE_HOST,
+ PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
{ PCI_VENDOR_ID_INTEL, 0x3406, PCI_CLASS_BRIDGE_HOST,
  PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
{}

[tip:perf/core] sched: Micro-optimize the smart wake-affine logic

2013-07-23 Thread tip-bot for Peter Zijlstra

Commit-ID:  7d9ffa8961482232d964173cccba6e14d2d543b2
Gitweb: http://git.kernel.org/tip/7d9ffa8961482232d964173cccba6e14d2d543b2
Author: Peter Zijlstra 
AuthorDate: Thu, 4 Jul 2013 12:56:46 +0800
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 12:22:06 +0200

sched: Micro-optimize the smart wake-affine logic

Smart wake-affine currently uses the node size as the factor, but the overhead
of the mask operation is high.

Thus, this patch introduces the 'sd_llc_size' percpu variable, which records
the size of the highest cache-sharing domain, and makes it the new factor, in
order to reduce the overhead and make the factor more reasonable.

Tested-by: Davidlohr Bueso 
Tested-by: Michael Wang 
Signed-off-by: Peter Zijlstra 
Acked-by: Michael Wang 
Cc: Mike Galbraith 
Link: http://lkml.kernel.org/r/51d5008e.6030...@linux.vnet.ibm.com
[ Tidied up the changelog. ]
Signed-off-by: Ingo Molnar 
---
 kernel/sched/core.c  | 7 ++-
 kernel/sched/fair.c  | 2 +-
 kernel/sched/sched.h | 1 +
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b7c32cb..6df0fbe 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5083,18 +5083,23 @@ static void destroy_sched_domains(struct sched_domain *sd, int cpu)
  * two cpus are in the same cache domain, see cpus_share_cache().
  */
 DEFINE_PER_CPU(struct sched_domain *, sd_llc);
+DEFINE_PER_CPU(int, sd_llc_size);
 DEFINE_PER_CPU(int, sd_llc_id);
 
 static void update_top_cache_domain(int cpu)
 {
struct sched_domain *sd;
int id = cpu;
+   int size = 1;
 
sd = highest_flag_domain(cpu, SD_SHARE_PKG_RESOURCES);
-   if (sd)
+   if (sd) {
id = cpumask_first(sched_domain_span(sd));
+   size = cpumask_weight(sched_domain_span(sd));
+   }
 
rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
+   per_cpu(sd_llc_size, cpu) = size;
per_cpu(sd_llc_id, cpu) = id;
 }
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 860063a..f237437 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3175,7 +3175,7 @@ static inline unsigned long effective_load(struct task_group *tg, int cpu,
 
 static int wake_wide(struct task_struct *p)
 {
-   int factor = nr_cpus_node(cpu_to_node(smp_processor_id()));
+   int factor = this_cpu_read(sd_llc_size);
 
/*
 * Yeah, it's the switching-frequency, could means many wakee or
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 5e129ef..4c1cb80 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -594,6 +594,7 @@ static inline struct sched_domain *highest_flag_domain(int cpu, int flag)
 }
 
 DECLARE_PER_CPU(struct sched_domain *, sd_llc);
+DECLARE_PER_CPU(int, sd_llc_size);
 DECLARE_PER_CPU(int, sd_llc_id);
 
 struct sched_group_power {
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
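
The idea behind sd_llc_size can be sketched in userspace: compute the size of the cache-sharing span once, when the topology changes, and let the hot path read a cached integer instead of doing a mask operation. This is only a minimal sketch, assuming a 64-bit bitmap stands in for a cpumask; llc_size[], mask_weight(), update_llc_size() and wake_factor() are hypothetical names, not kernel APIs.

```c
#include <stdint.h>

#define MAX_CPUS 64

/* Hypothetical stand-in for the per-CPU sd_llc_size variable. */
static int llc_size[MAX_CPUS];

/* Count set bits; plays the role of cpumask_weight() in this sketch. */
static int mask_weight(uint64_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Slow path, run when the topology changes: cache the size of the
 * cache-sharing span for this CPU.  An empty span (no such domain)
 * falls back to 1, like "int size = 1" in the patch.
 */
static void update_llc_size(int cpu, uint64_t llc_span)
{
	llc_size[cpu] = llc_span ? mask_weight(llc_span) : 1;
}

/* Hot path: a single array read, no mask operation. */
static int wake_factor(int cpu)
{
	return llc_size[cpu];
}
```

With a span of 0x0f (four CPUs sharing a cache), update_llc_size(0, 0x0f) makes wake_factor(0) return 4.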

[tip:perf/core] perf tools: Add test for converting perf time to/from TSC

2013-07-23 Thread tip-bot for Adrian Hunter

Commit-ID:  3bd5a5fc8c6b9fe769777abf74b0ab5fbd7930b4
Gitweb: http://git.kernel.org/tip/3bd5a5fc8c6b9fe769777abf74b0ab5fbd7930b4
Author: Adrian Hunter 
AuthorDate: Fri, 28 Jun 2013 16:22:19 +0300
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 12:17:59 +0200

perf tools: Add test for converting perf time to/from TSC

The test uses the newly added cap_usr_time_zero and time_zero of
perf_event_mmap_page.  TSC from rdtsc is compared with the time
from 2 perf events.  The test passes if the calculated times are
all in the correct order.

Signed-off-by: Adrian Hunter 
Signed-off-by: Peter Zijlstra 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Link: http://lkml.kernel.org/r/1372425741-1676-4-git-send-email-adrian.hun...@intel.com
Signed-off-by: Ingo Molnar 
---
 tools/perf/Makefile |   3 +
 tools/perf/arch/x86/Makefile|   2 +
 tools/perf/arch/x86/util/tsc.c  |  59 
 tools/perf/arch/x86/util/tsc.h  |  20 
 tools/perf/tests/builtin-test.c |   6 ++
 tools/perf/tests/perf-time-to-tsc.c | 177 
 tools/perf/tests/tests.h|   1 +
 7 files changed, 268 insertions(+)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 024680b..bfd12d0 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -389,6 +389,9 @@ LIB_OBJS += $(OUTPUT)tests/bp_signal.o
 LIB_OBJS += $(OUTPUT)tests/bp_signal_overflow.o
 LIB_OBJS += $(OUTPUT)tests/task-exit.o
 LIB_OBJS += $(OUTPUT)tests/sw-clock.o
+ifeq ($(ARCH),x86)
+LIB_OBJS += $(OUTPUT)tests/perf-time-to-tsc.o
+endif
 
 BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
 BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
index 815841c..8801fe0 100644
--- a/tools/perf/arch/x86/Makefile
+++ b/tools/perf/arch/x86/Makefile
@@ -6,3 +6,5 @@ ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind.o
 endif
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/tsc.o
+LIB_H += arch/$(ARCH)/util/tsc.h
diff --git a/tools/perf/arch/x86/util/tsc.c b/tools/perf/arch/x86/util/tsc.c
new file mode 100644
index 000..f111744
--- /dev/null
+++ b/tools/perf/arch/x86/util/tsc.c
@@ -0,0 +1,59 @@
+#include <stdbool.h>
+#include <errno.h>
+
+#include <linux/perf_event.h>
+
+#include "../../perf.h"
+#include "../../util/types.h"
+#include "../../util/debug.h"
+#include "tsc.h"
+
+u64 perf_time_to_tsc(u64 ns, struct perf_tsc_conversion *tc)
+{
+   u64 time, quot, rem;
+
+   time = ns - tc->time_zero;
+   quot = time / tc->time_mult;
+   rem  = time % tc->time_mult;
+   return (quot << tc->time_shift) +
+  (rem << tc->time_shift) / tc->time_mult;
+}
+
+u64 tsc_to_perf_time(u64 cyc, struct perf_tsc_conversion *tc)
+{
+   u64 quot, rem;
+
+   quot = cyc >> tc->time_shift;
+   rem  = cyc & ((1 << tc->time_shift) - 1);
+   return tc->time_zero + quot * tc->time_mult +
+  ((rem * tc->time_mult) >> tc->time_shift);
+}
+
+int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
+struct perf_tsc_conversion *tc)
+{
+   bool cap_usr_time_zero;
+   u32 seq;
+   int i = 0;
+
+   while (1) {
+   seq = pc->lock;
+   rmb();
+   tc->time_mult = pc->time_mult;
+   tc->time_shift = pc->time_shift;
+   tc->time_zero = pc->time_zero;
+   cap_usr_time_zero = pc->cap_usr_time_zero;
+   rmb();
+   if (pc->lock == seq && !(seq & 1))
+   break;
+   if (++i > 1) {
+   pr_debug("failed to get perf_event_mmap_page lock\n");
+   return -EINVAL;
+   }
+   }
+
+   if (!cap_usr_time_zero)
+   return -EOPNOTSUPP;
+
+   return 0;
+}
diff --git a/tools/perf/arch/x86/util/tsc.h b/tools/perf/arch/x86/util/tsc.h
new file mode 100644
index 000..a24dec8
--- /dev/null
+++ b/tools/perf/arch/x86/util/tsc.h
@@ -0,0 +1,20 @@
+#ifndef TOOLS_PERF_ARCH_X86_UTIL_TSC_H__
+#define TOOLS_PERF_ARCH_X86_UTIL_TSC_H__
+
+#include "../../util/types.h"
+
+struct perf_tsc_conversion {
+   u16 time_shift;
+   u32 time_mult;
+   u64 time_zero;
+};
+
+struct perf_event_mmap_page;
+
+int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
+struct perf_tsc_conversion *tc);
+
+u64 perf_time_to_tsc(u64 ns, struct perf_tsc_conversion *tc);
+u64 tsc_to_perf_time(u64 cyc, struct perf_tsc_conversion *tc);
+
+#endif /* TOOLS_PERF_ARCH_X86_UTIL_TSC_H__ */
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 35b45f1466..b7b4049 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -93,6 +93,12 @@ static struct test {
.desc = "Test software clock events have valid period values",
.func = test__sw_clock_freq,
},
+#if
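
The retry loop in perf_read_tsc_conversion() above follows the usual sequence-counter read pattern: sample the counter, copy the fields, then accept the copy only if the counter is even and unchanged. The following is a minimal userspace sketch of that pattern using C11 atomics, not the perf code itself; struct conv_page, struct conv and read_conv() are hypothetical names, and the plain field reads assume (as the original does) that a torn read is acceptable because it is detected and discarded.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared page: 'lock' is odd while a writer is updating. */
struct conv_page {
	atomic_uint lock;
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/* Private snapshot handed back to the caller. */
struct conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/* Returns 0 with a consistent snapshot in *out, or -1 if retries ran out. */
static int read_conv(struct conv_page *pg, struct conv *out, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		unsigned int seq = atomic_load_explicit(&pg->lock,
							memory_order_acquire);

		out->time_shift = pg->time_shift;
		out->time_mult  = pg->time_mult;
		out->time_zero  = pg->time_zero;

		/* Order the field reads before the second counter read. */
		atomic_thread_fence(memory_order_acquire);
		if (!(seq & 1) &&
		    atomic_load_explicit(&pg->lock, memory_order_relaxed) == seq)
			return 0;
	}
	return -1;
}
```

Bounding the number of retries, as the patch does, keeps a reader from spinning forever if the writer never finishes.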

[tip:perf/core] perf/x86: Add ability to calculate TSC from perf sample timestamps

2013-07-23 Thread tip-bot for Adrian Hunter

Commit-ID:  c73deb6aecda2955716f31572516f09d930ef450
Gitweb: http://git.kernel.org/tip/c73deb6aecda2955716f31572516f09d930ef450
Author: Adrian Hunter 
AuthorDate: Fri, 28 Jun 2013 16:22:18 +0300
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 12:17:45 +0200

perf/x86: Add ability to calculate TSC from perf sample timestamps

For modern CPUs, perf clock is directly related to TSC.  TSC
can be calculated from perf clock and vice versa using a simple
calculation.  Two of the three components of that calculation
are already exported in struct perf_event_mmap_page.  This patch
exports the third.

Signed-off-by: Adrian Hunter 
Signed-off-by: Peter Zijlstra 
Cc: "H. Peter Anvin" 
Link: http://lkml.kernel.org/r/1372425741-1676-3-git-send-email-adrian.hun...@intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/tsc.h   |  1 +
 arch/x86/kernel/cpu/perf_event.c |  6 ++
 arch/x86/kernel/tsc.c|  6 ++
 include/uapi/linux/perf_event.h  | 22 --
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index c91e8b9..235be70 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -49,6 +49,7 @@ extern void tsc_init(void);
 extern void mark_tsc_unstable(char *reason);
 extern int unsynchronized_tsc(void);
 extern int check_tsc_unstable(void);
+extern int check_tsc_disabled(void);
 extern unsigned long native_calibrate_tsc(void);
 
 extern int tsc_clocksource_reliable;
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index a7c7305..8355c84 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1884,6 +1884,7 @@ static struct pmu pmu = {
 void arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
 {
userpg->cap_usr_time = 0;
+   userpg->cap_usr_time_zero = 0;
userpg->cap_usr_rdpmc = x86_pmu.attr_rdpmc;
userpg->pmc_width = x86_pmu.cntval_bits;
 
@@ -1897,6 +1898,11 @@ void arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
userpg->time_mult = this_cpu_read(cyc2ns);
userpg->time_shift = CYC2NS_SCALE_FACTOR;
userpg->time_offset = this_cpu_read(cyc2ns_offset) - now;
+
+   if (sched_clock_stable && !check_tsc_disabled()) {
+   userpg->cap_usr_time_zero = 1;
+   userpg->time_zero = this_cpu_read(cyc2ns_offset);
+   }
 }
 
 /*
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 6ff4924..930e5d4 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -89,6 +89,12 @@ int check_tsc_unstable(void)
 }
 EXPORT_SYMBOL_GPL(check_tsc_unstable);
 
+int check_tsc_disabled(void)
+{
+   return tsc_disabled;
+}
+EXPORT_SYMBOL_GPL(check_tsc_disabled);
+
 #ifdef CONFIG_X86_TSC
 int __init notsc_setup(char *str)
 {
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0041aed..efef1d3 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -378,7 +378,8 @@ struct perf_event_mmap_page {
struct {
__u64   cap_usr_time: 1,
cap_usr_rdpmc   : 1,
-   cap_res : 62;
+   cap_usr_time_zero   : 1,
+   cap_res : 61;
};
};
 
@@ -420,12 +421,29 @@ struct perf_event_mmap_page {
__u16   time_shift;
__u32   time_mult;
__u64   time_offset;
+   /*
+* If cap_usr_time_zero, the hardware clock (e.g. TSC) can be calculated
+* from sample timestamps.
+*
+*   time = timestamp - time_zero;
+*   quot = time / time_mult;
+*   rem  = time % time_mult;
+*   cyc = (quot << time_shift) + (rem << time_shift) / time_mult;
+*
+* And vice versa:
+*
+*   quot = cyc >> time_shift;
+*   rem  = cyc & ((1 << time_shift) - 1);
+*   timestamp = time_zero + quot * time_mult +
+*   ((rem * time_mult) >> time_shift);
+*/
+   __u64   time_zero;
 
/*
 * Hole for extension of the self monitor capabilities
 */
 
-   __u64   __reserved[120];/* align to 1k */
+   __u64   __reserved[119];/* align to 1k */
 
/*
 * Control data for the mmap() data buffer.
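
The two formulas in the time_zero comment can be tried out in userspace. The sketch below transcribes them into standalone C; struct tsc_conv, time_to_cyc() and cyc_to_time() are hypothetical names chosen for the example, and the mask is built from a 64-bit constant (a choice of this sketch) so that large shift values stay well defined.

```c
#include <stdint.h>

/* Hypothetical copy of the three conversion parameters. */
struct tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/* Sample timestamp (ns) -> hardware clock cycles. */
static uint64_t time_to_cyc(uint64_t ns, const struct tsc_conv *tc)
{
	uint64_t time = ns - tc->time_zero;
	uint64_t quot = time / tc->time_mult;
	uint64_t rem  = time % tc->time_mult;

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}

/* Hardware clock cycles -> sample timestamp (ns). */
static uint64_t cyc_to_time(uint64_t cyc, const struct tsc_conv *tc)
{
	uint64_t quot = cyc >> tc->time_shift;
	uint64_t rem  = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}
```

As a worked example with made-up parameters time_shift = 10, time_mult = 512 (half a nanosecond per cycle) and time_zero = 1000: 4000 cycles give quot = 3 and rem = 928, so the timestamp is 1000 + 3 * 512 + ((928 * 512) >> 10) = 3000, and converting 3000 back gives (3 << 10) + (464 << 10) / 512 = 4000 cycles.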

[tip:perf/core] sched: Move h_load calculation to task_h_load()

2013-07-23 Thread tip-bot for Vladimir Davydov

Commit-ID:  685207963be973fbb73550db6edaf920a283e1a7
Gitweb: http://git.kernel.org/tip/685207963be973fbb73550db6edaf920a283e1a7
Author: Vladimir Davydov 
AuthorDate: Mon, 15 Jul 2013 17:49:19 +0400
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 12:18:41 +0200

sched: Move h_load calculation to task_h_load()

The bad thing about update_h_load(), which computes the hierarchical load
factor for task groups, is that it is called for each task group in the
system before every load balancer run, and since rebalancing can be
triggered very often, this function can eat a lot of CPU time if
there are many cpu cgroups in the system.

Although the situation was improved significantly by commit a35b646
('sched, cgroup: Reduce rq->lock hold times for large cgroup
hierarchies'), the problem still can arise under some kinds of loads,
e.g. when cpus are switching from idle to busy and back very frequently.

For instance, when I start 1000 processes that wake up every
millisecond on my 8-CPU host, 'top' and 'perf top' show:

Cpu(s): 17.8%us, 24.3%sy,  0.0%ni, 57.9%id,  0.0%wa,  0.0%hi,  0.0%si
Events: 243K cycles
  7.57%  [kernel]   [k] __schedule
  7.08%  [kernel]   [k] timerqueue_add
  6.13%  libc-2.12.so   [.] usleep

Then if I create 1 *idle* cpu cgroups (no processes in them), cpu
usage increases significantly although the 'wakers' are still executing
in the root cpu cgroup:

Cpu(s): 19.1%us, 48.7%sy,  0.0%ni, 31.6%id,  0.0%wa,  0.0%hi,  0.7%si
Events: 230K cycles
 24.56%  [kernel][k] tg_load_down
  5.76%  [kernel][k] __schedule

This happens because this particular kind of load triggers 'new idle'
rebalance very frequently, which requires calling update_h_load(),
which, in turn, calls tg_load_down() for every *idle* cpu cgroup even
though it is absolutely useless, because idle cpu cgroups have no tasks
to pull.

This patch tries to improve the situation by making h_load calculation
proceed only when h_load is really necessary. To achieve this, it
substitutes update_h_load() with update_cfs_rq_h_load(), which computes
h_load only for a given cfs_rq and all its ascendants, and makes the
load balancer call this function whenever it considers if a task should
be pulled, i.e. it moves h_load calculations directly to task_h_load().
So that h_load of the same cfs_rq is not updated multiple times (in case
several tasks in the same cgroup are considered during the same balance
run), the patch keeps the time of the last h_load update for each cfs_rq
and stops the calculation when it finds h_load to be up to date.

The benefit is that h_load is computed only for those cfs_rq's that
really need it; in particular, all idle task groups are skipped.
Although this, in fact, moves h_load calculation under rq lock, it
should not affect latency much, because the amount of work done under rq
lock while trying to pull tasks is limited by sched_nr_migrate.

With the patch applied and the setup described above (1000 wakers in
the root cgroup and 1 idle cgroups), I get:

Cpu(s): 16.9%us, 24.8%sy,  0.0%ni, 58.4%id,  0.0%wa,  0.0%hi,  0.0%si
Events: 242K cycles
  7.57%  [kernel]  [k] __schedule
  6.70%  [kernel]  [k] timerqueue_add
  5.93%  libc-2.12.so  [.] usleep

Signed-off-by: Vladimir Davydov 
Signed-off-by: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1373896159-1278-1-git-send-email-vdavy...@parallels.com
Signed-off-by: Ingo Molnar 
---
 kernel/sched/fair.c  | 58 
 kernel/sched/sched.h |  7 +++
 2 files changed, 30 insertions(+), 35 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bb456f4..765d87a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4171,47 +4171,48 @@ static void update_blocked_averages(int cpu)
 }
 
 /*
- * Compute the cpu's hierarchical load factor for each task group.
+ * Compute the hierarchical load factor for cfs_rq and all its ascendants.
  * This needs to be done in a top-down fashion because the load of a child
  * group is a fraction of its parents load.
  */
-static int tg_load_down(struct task_group *tg, void *data)
+static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
 {
-   unsigned long load;
-   long cpu = (long)data;
-
-   if (!tg->parent) {
-   load = cpu_rq(cpu)->avg.load_avg_contrib;
-   } else {
-   load = tg->parent->cfs_rq[cpu]->h_load;
-   load = div64_ul(load * tg->se[cpu]->avg.load_avg_contrib,
-   tg->parent->cfs_rq[cpu]->runnable_load_avg + 1);
-   }
-
-   tg->cfs_rq[cpu]->h_load = load;
-
-   return 0;
-}
-
-static void update_h_load(long cpu)
-{
-   struct rq *rq = cpu_rq(cpu);
+   struct rq *rq = rq_of(cfs_rq);
+   struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];
unsigned long now = jiffies;
+   unsigned long load;
 
-   if
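
The lazy scheme described in the changelog can be illustrated outside the kernel. The sketch below is not the patch's implementation: it uses recursion for brevity and hypothetical names (struct group, group_h_load(), recalcs), and it assumes the same ratio as the removed tg_load_down(), that is, a child's h_load is the parent's h_load scaled by the child's contribution over the parent's runnable load plus one. Only the queried group and the groups above it are touched, and a timestamp stops repeated work within one balance pass.

```c
#include <stddef.h>

/* Hypothetical, simplified task-group node. */
struct group {
	struct group *parent;	/* NULL for the root group */
	unsigned long contrib;	/* this group's load contribution in its parent */
	unsigned long runnable;	/* runnable load queued directly in this group */
	unsigned long h_load;	/* cached hierarchical load */
	unsigned long stamp;	/* time of the last h_load update */
	int valid;		/* set once h_load has been computed */
};

/* Counts recomputations so the caching effect can be observed. */
static unsigned long recalcs;

static unsigned long group_h_load(struct group *g, unsigned long now)
{
	/* Already up to date for this pass: stop here. */
	if (g->valid && g->stamp == now)
		return g->h_load;

	if (!g->parent) {
		g->h_load = g->runnable;
	} else {
		unsigned long parent_load = group_h_load(g->parent, now);

		g->h_load = parent_load * g->contrib /
			    (g->parent->runnable + 1);
	}
	g->stamp = now;
	g->valid = 1;
	recalcs++;
	return g->h_load;
}
```

Querying one leaf recomputes only the chain from that leaf up to the root; sibling groups with no tasks under consideration are never visited, which is the effect the changelog measures.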

[tip:core/locking] mutex: Do not unnecessarily deal with waiters

2013-07-23 Thread tip-bot for Davidlohr Bueso

Commit-ID:  ec83f425dbca47e19c6737e8e7db0d0924a5de1b
Gitweb: http://git.kernel.org/tip/ec83f425dbca47e19c6737e8e7db0d0924a5de1b
Author: Davidlohr Bueso 
AuthorDate: Fri, 28 Jun 2013 13:13:18 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 11:48:37 +0200

mutex: Do not unnecessarily deal with waiters

Upon entering the slowpath, we immediately attempt to acquire
the lock by checking if it is already unlocked. If we are lucky
enough that this is the case, then we don't need to deal with
any waiter related logic.

Furthermore any checks for an empty wait_list are unnecessary as
we already know that count is non-negative and hence no one is
waiting for the lock.

Move the count check and xchg calls to be done before any
waiters are set up - including waiter debugging. Upon failure to
acquire the lock, the xchg sets the counter to 0, instead of -1
as it was originally. This can be done here since we set it back
to -1 right at the beginning of the loop so other waiters are
woken up when the lock is released.

When tested on an 8-socket (80 core) system against a vanilla
3.10-rc1 kernel, this patch provides some small performance
benefits (+2-6%). While this could be considered noise-level,
the average percentages were stable across multiple runs
and no performance regressions were seen. The two big winners,
for small numbers of users (10-100), were the short and compute
workloads, which gained +19.36% and +15.76% in jobs per minute.

Also change some break statements to 'goto slowpath', which IMO
makes the code a little more intuitive to read.

Signed-off-by: Davidlohr Bueso 
Acked-by: Rik van Riel 
Acked-by: Maarten Lankhorst 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1372450398.2106.1.ca...@buesod1.americas.hpqcorp.net
Signed-off-by: Ingo Molnar 
---
 kernel/mutex.c | 41 ++---
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index 7ff48c5..386ad5d 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -463,7 +463,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 * performed the optimistic spinning cannot be done.
 */
if (ACCESS_ONCE(ww->ctx))
-   break;
+   goto slowpath;
}
 
/*
@@ -474,7 +474,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
owner = ACCESS_ONCE(lock->owner);
if (owner && !mutex_spin_on_owner(lock, owner)) {
mspin_unlock(MLOCK(lock), &node);
-   break;
+   goto slowpath;
}
 
if ((atomic_read(&lock->count) == 1) &&
@@ -489,8 +489,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
mutex_set_owner(lock);
mspin_unlock(MLOCK(lock), &node);
-   preempt_enable();
-   return 0;
+   goto done;
}
mspin_unlock(MLOCK(lock), &node);
 
@@ -501,7 +500,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 * the owner complete.
 */
if (!owner && (need_resched() || rt_task(task)))
-   break;
+   goto slowpath;
 
/*
 * The cpu_relax() call is a compiler barrier which forces
@@ -515,6 +514,10 @@ slowpath:
 #endif
spin_lock_mutex(&lock->wait_lock, flags);
 
+   /* once more, can we acquire the lock? */
+   if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
+   goto skip_wait;
+
debug_mutex_lock_common(lock, &waiter);
debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
 
@@ -522,9 +525,6 @@ slowpath:
list_add_tail(&waiter.list, &lock->wait_list);
waiter.task = task;
 
-   if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
-   goto done;
-
lock_contended(&lock->dep_map, ip);
 
for (;;) {
@@ -538,7 +538,7 @@ slowpath:
 * other waiters:
 */
if (MUTEX_SHOW_NO_WAITER(lock) &&
-  (atomic_xchg(&lock->count, -1) == 1))
+   (atomic_xchg(&lock->count, -1) == 1))
break;
 
/*
@@ -563,24 +563,25 @@ slowpath:
schedule_preempt_disabled();
spin_lock_mutex(&lock->wait_lock, flags);
}
+   mutex_remove_waiter(lock, &waiter, current_thread_info());
+   /* set it to 0 if there are no waiters left: */
+   if (likely(list_empty(&lock->wait_list)))
+   atomic_set(&lock->count, 0);
+   debug_mutex_free_waiter(&waiter);
 
-done:
+skip_wait:
+   /* got the lock - cleanup and rejoice! */
lock_acquired(&lock->dep_map, ip);
-   /* got the lock - rejoice! */
-
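
The early acquisition attempt described in the changelog can be modelled with C11 atomics. This is a minimal sketch rather than the kernel code: try_acquire_before_queueing() is a hypothetical name, and the counter is assumed to follow the classic mutex convention of 1 for unlocked, 0 for locked without waiters, and a negative value for locked with possible waiters.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Assumed counter convention:
 *    1  unlocked
 *    0  locked, no waiters
 *   <0  locked, waiters may be queued
 */
static bool try_acquire_before_queueing(atomic_int *count)
{
	/* A negative count means someone may be waiting: do not barge in. */
	if (atomic_load(count) < 0)
		return false;

	/*
	 * Swap in 0.  Seeing 1 means the lock was free and is now ours.
	 * On failure the counter stays at 0; the wait loop is expected to
	 * set it to -1 before sleeping so that the unlocker wakes waiters.
	 */
	return atomic_exchange(count, 0) == 1;
}
```

Writing 0 instead of -1 at this point is safe only because the caller goes on to store -1 at the top of its wait loop when the attempt fails, which is the argument the changelog makes.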

[tip:perf/core] kprobes/x86: Call out into INT3 handler directly instead of using notifier

2013-07-23 Thread tip-bot for Jiri Kosina

Commit-ID:  17f41571bb2c4a398785452ac2718a6c5d77180e
Gitweb: http://git.kernel.org/tip/17f41571bb2c4a398785452ac2718a6c5d77180e
Author: Jiri Kosina 
AuthorDate: Tue, 23 Jul 2013 10:09:28 +0200
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 10:12:57 +0200

kprobes/x86: Call out into INT3 handler directly instead of using notifier

In fd4363fff3d96 ("x86: Introduce int3 (breakpoint)-based
instruction patching"), the mechanism that was introduced for
notifying alternatives code from the int3 exception handler that
an exception occurred was die_notifier.

This is however problematic, as early code might be using jump
labels even before the notifier registration has been performed,
which will then lead to an oops due to an unhandled exception.
One such occurrence has been encountered by Fengguang:

 int3:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc1-01429-g04bf576 #8
 task: 88000da1b040 ti: 88000da1c000 task.ti: 88000da1c000
 RIP: 0010:[]  [] ttwu_do_wakeup+0x28/0x225
 RSP: :88000dd03f10  EFLAGS: 0006
 RAX:  RBX: 88000dd12940 RCX: 81769c40
 RDX: 0002 RSI:  RDI: 0001
 RBP: 88000dd03f28 R08: 8176a8c0 R09: 0002
 R10: 810ff484 R11: 88000dd129e8 R12: 88000dbc90c0
 R13: 88000dbc90c0 R14: 88000da1dfd8 R15: 88000da1dfd8
 FS:  () GS:88000dd0() knlGS:
 CS:  0010 DS:  ES:  CR0: 8005003b
 CR2:  CR3: 01c88000 CR4: 06e0
 Stack:
  88000dd12940 88000dbc90c0 88000da1dfd8 88000dd03f48
  81109e2b 88000dd12940  88000dd03f68
  81109e9e  00012940 88000dd03f98
 Call Trace:
  
  [] ttwu_do_activate.constprop.56+0x6d/0x79
  [] sched_ttwu_pending+0x67/0x84
  [] scheduler_ipi+0x15a/0x2b0
  [] smp_reschedule_interrupt+0x38/0x41
  [] reschedule_interrupt+0x6d/0x80
  
  [] ? __atomic_notifier_call_chain+0x5/0xc1
  [] ? native_safe_halt+0xd/0x16
  [] default_idle+0x147/0x282
  [] arch_cpu_idle+0x3d/0x5d
  [] cpu_idle_loop+0x46d/0x5db
  [] cpu_startup_entry+0x84/0x84
  [] start_secondary+0x3c8/0x3d5
  [...]

Fix this by directly calling poke_int3_handler() from the int3
exception handler (analogously to what ftrace has been doing
already), instead of relying on notifier, registration of which
might not have yet been finalized by the time of the first trap.

Reported-and-tested-by: Fengguang Wu 
Signed-off-by: Jiri Kosina 
Acked-by: Masami Hiramatsu 
Cc: H. Peter Anvin 
Cc: Fengguang Wu 
Cc: Steven Rostedt 
Link: http://lkml.kernel.org/r/alpine.lnx.2.00.1307231007490.14...@pobox.suse.cz
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/alternative.h |  2 ++
 arch/x86/kernel/alternative.c  | 31 ---
 arch/x86/kernel/traps.c|  4 
 kernel/kprobes.c   |  2 +-
 4 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 4daf8c5..0a3f9c9 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Alternative inline assembly for SMP.
@@ -224,6 +225,7 @@ extern void *text_poke_early(void *addr, const void *opcode, size_t len);
  * inconsistent instruction while you patch.
  */
 extern void *text_poke(void *addr, const void *opcode, size_t len);
+extern int poke_int3_handler(struct pt_regs *regs);
 extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
 
 #endif /* _ASM_X86_ALTERNATIVE_H */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 5d8782e..15e8563 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -605,26 +605,24 @@ static void do_sync_core(void *info)
 static bool bp_patching_in_progress;
 static void *bp_int3_handler, *bp_int3_addr;
 
-static int int3_notify(struct notifier_block *self, unsigned long val, void *data)
+int poke_int3_handler(struct pt_regs *regs)
 {
-   struct die_args *args = data;
-
/* bp_patching_in_progress */
smp_rmb();
 
if (likely(!bp_patching_in_progress))
-   return NOTIFY_DONE;
+   return 0;
 
-   /* we are not interested in non-int3 faults and ring > 0 faults */
-   if (val != DIE_INT3 || !args->regs || user_mode_vm(args->regs)
-   || args->regs->ip != (unsigned long)bp_int3_addr)
-   return NOTIFY_DONE;
+   if (user_mode_vm(regs) || regs->ip != (unsigned long)bp_int3_addr)
+   return 0;
 
/* set up the specified breakpoint handler */
-   args->regs->ip = (unsigned long) bp_int3_handler;
+   regs->ip = (unsigned long) bp_int3_handler;
+
+   return
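
The difference between a registered notifier and a direct call can be shown with a small model. The sketch below is an illustration under stated assumptions, not the kernel code: struct regs, poke_handler() and trap_entry() are hypothetical names, and returning 1 for "handled" is a convention chosen for this sketch. Because the trap path calls the handler directly, there is no registration step that early code could run ahead of.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, minimal register frame. */
struct regs {
	uintptr_t ip;
	bool user_mode;
};

/* State set up by the (not shown) patching code. */
static bool patching_in_progress;
static uintptr_t patch_addr, patch_handler;

/* Returns 1 if the trap belonged to the patching code, 0 otherwise. */
static int poke_handler(struct regs *regs)
{
	if (!patching_in_progress)
		return 0;

	/* Not interested in user-mode traps or unrelated addresses. */
	if (regs->user_mode || regs->ip != patch_addr)
		return 0;

	/* Redirect execution to the temporary handler. */
	regs->ip = patch_handler;
	return 1;
}

/* Trap entry: call the handler directly, then fall through if unclaimed. */
static int trap_entry(struct regs *regs)
{
	if (poke_handler(regs))
		return 1;	/* consumed by the patching code */
	return 0;		/* other breakpoint users would run here */
}
```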

[tip:x86/asm] x86/ia32/asm: Remove unused argument in macro

2013-07-23 Thread tip-bot for Ramkumar Ramachandra

Commit-ID:  d2475b8ff81ebeed88d8fcbc22876aced5a0807a
Gitweb: http://git.kernel.org/tip/d2475b8ff81ebeed88d8fcbc22876aced5a0807a
Author: Ramkumar Ramachandra 
AuthorDate: Wed, 10 Jul 2013 23:34:28 +0530
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 11:23:21 +0200

x86/ia32/asm: Remove unused argument in macro

Commit 3fe26fa ("x86: get rid of pt_regs argument in sigreturn variants",
from 2012-11-12) changed the body of PTREGSCALL to drop arg, and
updated the callsites; unfortunately, it forgot to update the
macro argument list, leaving an unused argument.  Fix this.

Signed-off-by: Ramkumar Ramachandra 
Cc: Al Viro 
Link: http://lkml.kernel.org/r/1373479468-7175-1-git-send-email-artag...@gmail.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/ia32/ia32entry.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 474dc1b..4299eb0 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -452,7 +452,7 @@ ia32_badsys:
 
CFI_ENDPROC

-   .macro PTREGSCALL label, func, arg
+   .macro PTREGSCALL label, func
ALIGN
 GLOBAL(\label)
leaq \func(%rip),%rax

[tip:x86/apic] x86/apic: Enable x2APIC physical mode on native hardware too, when there are fewer than 256 CPUs

2013-07-23 Thread tip-bot for Youquan Song

Commit-ID:  3d1acb49d22fbbae96524040e9e2d4cbbb3adbef
Gitweb: http://git.kernel.org/tip/3d1acb49d22fbbae96524040e9e2d4cbbb3adbef
Author: Youquan Song 
AuthorDate: Thu, 11 Jul 2013 21:22:39 -0400
Committer:  Ingo Molnar 
CommitDate: Tue, 23 Jul 2013 11:15:42 +0200

x86/apic: Enable x2APIC physical mode on native hardware too, when there are fewer than 256 CPUs

x2APIC extends the APIC ID from 8 bits to 32 bits, but device
interrupts routed from the IOAPIC or delivered in MSI mode keep
an 8-bit destination APIC ID.  To support x2APIC, VT-d interrupt
remapping was introduced to translate the destination APIC ID to
32 bits in x2APIC mode, keeping devices compatible in this way.

x2APIC supports both logical and physical destination modes.

In logical destination mode, the 32-bit logical APIC ID has two
sub-fields: a 16-bit cluster ID and a 16-bit logical ID within
the cluster.  x2APIC cluster mode requires VT-d interrupt
remapping.

In physical destination mode, the 8-bit physical ID is
compatible with the 32-bit physical ID when there are fewer than
256 CPUs.

When interrupt remapping initialization fails on platforms with
fewer than 256 CPUs, the current kernel only enables x2APIC
physical mode in virtualized environments, but x2APIC physical
mode can also be enabled on native hardware in this situation.

In this case device interrupts use an 8-bit destination APIC ID
in physical mode, which is compatible with x2APIC physical mode
when there are fewer than 256 CPUs.

So we can benefit from x2APIC over xAPIC MMIO:

 - x2APIC MSR reads/writes are faster than xAPIC MMIO accesses.

 - x2APIC needs only an ICR write to deliver an interrupt, without
   polling the ICR delivery status bit, whereas xAPIC must poll
   the ICR delivery status bit.

 - x2APIC uses a single 64-bit ICR access instead of xAPIC's two
   32-bit accesses.

Signed-off-by: Youquan Song 
Cc: Youquan Song 
Cc: h...@linux.intel.com
Cc: ying...@kernel.org
Link: http://lkml.kernel.org/r/1373592159-459-1-git-send-email-youquan.s...@intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/apic/apic.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index eca89c5..d9dd5a6 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1622,11 +1622,8 @@ void __init enable_IR_x2apic(void)
goto skip_x2apic;
 
if (ret < 0) {
-   /* IR is required if there is APIC ID > 255 even when running
-* under KVM
-*/
-   if (max_physical_apicid > 255 ||
-   !hypervisor_x2apic_available()) {
+   /* IR is required if there is APIC ID > 255 */
+   if (max_physical_apicid > 255) {
if (x2apic_preenabled)
disable_x2apic();
goto skip_x2apic;
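
The change to the condition in the hunk above can be restated as two small predicates. This is only a sketch of the decision, with hypothetical function names; it answers the question "may x2APIC stay enabled when interrupt remapping (IR) failed to initialize?" before and after the patch.

```c
#include <stdbool.h>

/* Before the patch: small APIC IDs were not enough, a hypervisor was needed. */
static bool x2apic_ok_without_ir_old(unsigned int max_physical_apicid,
				     bool hypervisor_x2apic)
{
	return max_physical_apicid <= 255 && hypervisor_x2apic;
}

/* After the patch: IR is only required if some APIC ID exceeds 255. */
static bool x2apic_ok_without_ir_new(unsigned int max_physical_apicid)
{
	return max_physical_apicid <= 255;
}
```

On native hardware with a maximum APIC ID of 63, for example, the old predicate is false and the new one is true, which is the case the patch is after.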

[PATCH] ARM: shmobile: r8a73a4: Remove ->init_machine() special case

2013-07-23 Thread Magnus Damm

From: Magnus Damm 

No need to special case r8a73a4 ->init_machine(),
so get rid of the undesired cpufreq platform device
from the generic long term r8a73a4 DT support code.

For short term support on APE6EVM the DT reference
implementation already adds a "cpufreq-cpu0" platform
device so that can be used for development.

Regarding more long term cpufreq support, perhaps
it makes sense to adjust the cpufreq driver to check
for DT information directly instead of using a
platform device for software configuration and DT
for hardware parameters.

Signed-off-by: Magnus Damm 
---

 arch/arm/mach-shmobile/setup-r8a73a4.c |6 --
 1 file changed, 6 deletions(-)

--- 0001/arch/arm/mach-shmobile/setup-r8a73a4.c
+++ work/arch/arm/mach-shmobile/setup-r8a73a4.c 2013-07-23 16:22:29.0 +0900
@@ -215,11 +215,6 @@ void __init r8a73a4_init_delay(void)
 }
 
 #ifdef CONFIG_USE_OF
-void __init r8a73a4_add_standard_devices_dt(void)
-{
-   platform_device_register_simple("cpufreq-cpu0", -1, NULL, 0);
-   of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
-}
 
 static const char *r8a73a4_boards_compat_dt[] __initdata = {
"renesas,r8a73a4",
@@ -228,7 +223,6 @@ static const char *r8a73a4_boards_compat
 
 DT_MACHINE_START(R8A73A4_DT, "Generic R8A73A4 (Flattened Device Tree)")
.init_early = r8a73a4_init_delay,
-   .init_machine   = r8a73a4_add_standard_devices_dt,
.init_time  = shmobile_timer_init,
.dt_compat  = r8a73a4_boards_compat_dt,
 MACHINE_END

Re: [PATCH 11/21] x86: get pg_data_t's memory from other node

2013-07-23 Thread Tang Chen


On 07/24/2013 04:09 AM, Tejun Heo wrote:

On Fri, Jul 19, 2013 at 03:59:24PM +0800, Tang Chen wrote:

From: Yasuaki Ishimatsu

If the system can create a movable node, in which all of the
node's memory is allocated as ZONE_MOVABLE, setup_node_data()
cannot allocate memory for the node's pg_data_t.
So, use memblock_alloc_try_nid() instead of memblock_alloc_nid()
to retry when the first allocation fails. Otherwise, the system
could fail to boot.

..

-   nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
+   nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
if (!nd_pa) {
-   pr_err("Cannot find %zu bytes in node %d\n",
-  nd_size, nid);
+   pr_err("Cannot find %zu bytes in any node\n", nd_size);


Hmm... we want the node data to be colocated on the same node and I
don't think being hotpluggable necessarily requires the node data to
be allocated on a different node.  Does node data of a hotpluggable
node need to stay around after hotunplug?

I don't think it's a huge issue but it'd be great if we can clarify
where the restriction is coming from.



You are right, the node data could be on a hotpluggable node. And Yinghai
also said pagetable and vmemmap could be on a hotpluggable node.

But for now, doing so will break the memory hot-remove path. I should have
mentioned this in the log, which I didn't do.

A node could have several memory devices, and the device that holds the node
data should be hot-removed last. But at the NUMA level, we don't know which
memory_block (/sys/devices/system/node/nodeX/memoryXXX) belongs to which
memory device. We only have the node. So we can only do node hotplug.

Also, Yinghai's previous patch-set put the pagetable on the local node, and
we met the same problem: when hot-removing memory, we have to ensure the
memory device containing the pagetable is hot-removed last.

But in virtualization, developers are now developing memory hotplug in qemu,
which supports hotplug of a single memory device. So whole-node hotplug will
not satisfy virtualization users.

In the end, we concluded that we'd better do memory hotplug and the local
node things (local node data, pagetable, vmemmap, ...) in two steps.
Please refer to https://lkml.org/lkml/2013/6/19/73

The node data should be local, I agree with that. I'm not saying I won't do
it. It is just that, for now, fixing the memory hot-remove path would be
complicated. So I think we should push this patch for now, and do the local
node things in the next step.

Thanks.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] ARM: shmobile: sh73a0: add PMU information to sh73a0.dtsi

2013-07-23 Thread Magnus Damm

From: Magnus Damm 

Add PMU information to sh73a0.dtsi. With this
included KZM9G DT reference may use the PMU.

Signed-off-by: Magnus Damm 
---

 arch/arm/boot/dts/sh73a0.dtsi |6 ++
 1 file changed, 6 insertions(+)

--- 0001/arch/arm/boot/dts/sh73a0.dtsi
+++ work/arch/arm/boot/dts/sh73a0.dtsi  2013-07-24 04:12:57.0 +0900
@@ -38,6 +38,12 @@
  <0xf100 0x100>;
};
 
+   pmu {
+   compatible = "arm,cortex-a9-pmu";
+   interrupts = <0 55 4>,
+<0 56 4>;
+   };
+
irqpin0: irqpin@e690 {
compatible = "renesas,intc-irqpin";
#interrupt-cells = <2>;

[PATCH] ARM: shmobile: emev2: add PMU information to emev2.dtsi

2013-07-23 Thread Magnus Damm

From: Magnus Damm 

Add PMU information to emev2.dtsi. With this
included KZM9D DT reference may use the PMU.

Signed-off-by: Magnus Damm 
---

 arch/arm/boot/dts/emev2.dtsi |6 ++
 1 file changed, 6 insertions(+)

--- 0009/arch/arm/boot/dts/emev2.dtsi
+++ work/arch/arm/boot/dts/emev2.dtsi   2013-07-02 17:32:45.0 +0900
@@ -46,6 +46,12 @@
  <0xe002 0x0100>;
};
 
+   pmu {
+   compatible = "arm,cortex-a9-pmu";
+   interrupts = <0 120 4>,
+<0 121 4>;
+   };
+
sti@e018 {
compatible = "renesas,em-sti";
reg = <0xe018 0x54>;

Re: [PATCH] ARM: dts: Add USB host node for Exynos4

2013-07-23 Thread Sachin Kamat

On 23 July 2013 23:02, Dongjin Kim  wrote:
> This patch adds EHCI and OHCI host device nodes for Exynos4.
>
> Signed-off-by: Dongjin Kim 
> ---
>  arch/arm/boot/dts/exynos4.dtsi |   20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/arch/arm/boot/dts/exynos4.dtsi b/arch/arm/boot/dts/exynos4.dtsi
> index 3f94fe8..1cdbf89 100644
> --- a/arch/arm/boot/dts/exynos4.dtsi
> +++ b/arch/arm/boot/dts/exynos4.dtsi
> @@ -155,6 +155,26 @@
> status = "disabled";
> };
>
> +   ehci@1258 {
> +   compatible = "samsung,exynos4210-ehci";
> +   reg = <0x1258 0x100>;
> +   interrupts = <0 70 0>;
> +   status = "disabled";
> +
> +   clocks = < 304>;
> +   clock-names = "usbhost";
> +   };
> +
> +   ohci@1259 {
> +   compatible = "samsung,exynos4210-ohci";
> +   reg = <0x1258 0x100>;

Register value and node name do not match. Typo?


> +   interrupts = <0 70 0>;
> +   status = "disabled";
> +
> +   clocks = < 304>;
> +   clock-names = "usbhost";
> +   };
> +
> mfc: codec@1340 {
> compatible = "samsung,mfc-v5";
> reg = <0x1340 0x1>;
> --
> 1.7.9.5
>



-- 
With warm regards,
Sachin

[PATCH] drivers: mfd: mfd-core: disable irq_domain related code when 'HAVE_GENERIC_HARDIRQS' disabled.

2013-07-23 Thread Chen Gang

'irq_domain' depends on hard irqs, so architectures which have no hard
irqs but still need mfd (e.g. s390) need to disable the related code,
or the build fails.

The related commit:

  "c94bb23 mfd: Make MFD core code Device Tree and IRQ domain aware"

The related error: (with allmodconfig under s390)

  ERROR: "irq_create_mapping" [drivers/mfd/mfd-core.ko] undefined!


Signed-off-by: Chen Gang 
---
 drivers/mfd/mfd-core.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
index 7604f4e..8e56a74 100644
--- a/drivers/mfd/mfd-core.c
+++ b/drivers/mfd/mfd-core.c
@@ -129,13 +129,16 @@ static int mfd_add_device(struct device *parent, int id,
res[r].end = mem_base->start +
cell->resources[r].end;
} else if (cell->resources[r].flags & IORESOURCE_IRQ) {
+#ifdef HAVE_GENERIC_HARDIRQS
if (domain) {
/* Unable to create mappings for IRQ ranges. */
WARN_ON(cell->resources[r].start !=
cell->resources[r].end);
res[r].start = res[r].end = irq_create_mapping(
domain, cell->resources[r].start);
-   } else {
+   } else
+#endif
+   {
res[r].start = irq_base +
cell->resources[r].start;
res[r].end   = irq_base +
-- 
1.7.7.6

[PATCH] perf: Add mailing list to MAINTAINERS

2013-07-23 Thread Michael Ellerman

Currently there is no mailing list mentioned for perf, but everyone
sends them to lkml, so document that.

Signed-off-by: Michael Ellerman 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index bf61e04..2aae970 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6306,6 +6306,7 @@ M:Peter Zijlstra 
 M: Paul Mackerras 
 M: Ingo Molnar 
 M: Arnaldo Carvalho de Melo 
+L: linux-kernel@vger.kernel.org
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core
 S: Supported
 F: kernel/events/*
-- 
1.8.1.2


Re: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

2013-07-23 Thread Michael Wang

Hi, Rakib

On 07/24/2013 01:42 AM, Rakib Mullick wrote:
> Currently, update_top_cache_domain() is called whenever schedule domain is 
> built or destroyed. But, the following
> callpath shows that they're at the same callpath and can be avoided 
> update_top_cache_domain() while destroying schedule
> domain and update only at the times of building schedule domains.
> 
>   partition_sched_domains()
>   detach_destroy_domain()
>   cpu_attach_domain()
>   update_top_cache_domain()

IMHO, cpu_attach_domain() and update_top_cache_domain() should be
paired. The patch below will open a window in which 'rq->sd == NULL' while
'sd_llc != NULL', won't it?

I don't think we have any guarantee that no one will use 'sd_llc' before
we rebuild the stuff correctly...

Furthermore, what will happen if the old sd is freed after the next rcu
work cycle while 'sd_llc' still holds the reference for some victims?

Thus I do suggest we leave things untouched, since the benefit we get
is too small to be worth the risk...

Regards,
Michael Wang

>   build_sched_domains()
>   cpu_attach_domain()
>   update_top_cache_domain()
> 
> Changes since v1: use sd to determine when to skip, courtesy PeterZ
> 
> Signed-off-by: Rakib Mullick 
> ---
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index b7c32cb..387fb66 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5138,7 +5138,8 @@ cpu_attach_domain(struct sched_domain *sd, struct 
> root_domain *rd, int cpu)
>   rcu_assign_pointer(rq->sd, sd);
>   destroy_sched_domains(tmp, cpu);
> 
> - update_top_cache_domain(cpu);
> + if (sd)
> + update_top_cache_domain(cpu);
>  }
> 
>  /* cpus with isolated domains */
> 
> 
> 
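The window described in the reply above can be shown with a small user-space
toy model. This is only a sketch: toy_cpu, toy_domain and the toy_* functions
are made-up stand-ins, not the scheduler's real structures, and the cache
update is simplified to mirroring the attached domain.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real structures live in kernel/sched/. */
struct toy_domain {
	int id;
};

struct toy_cpu {
	struct toy_domain *sd;		/* plays the role of rq->sd */
	struct toy_domain *sd_llc;	/* plays the role of the cached sd_llc */
};

/*
 * Simplified: the real update_top_cache_domain() looks up a specific
 * domain level; here the cached pointer simply mirrors sd.
 */
static void toy_update_top_cache_domain(struct toy_cpu *cpu)
{
	cpu->sd_llc = cpu->sd;
}

/* 'skip_when_null' models the proposed "if (sd)" check. */
static void toy_cpu_attach_domain(struct toy_cpu *cpu, struct toy_domain *sd,
				  int skip_when_null)
{
	assert(cpu != NULL);

	cpu->sd = sd;
	if (sd || !skip_when_null)
		toy_update_top_cache_domain(cpu);
}
```

With skip_when_null set, attaching a NULL domain leaves sd == NULL while
sd_llc still points at the old domain, which is the stale state the reply
is worried about. With it cleared, both pointers are reset together.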


Re: Linux 3.11-rc2

2013-07-23 Thread Dave Chinner

On Mon, Jul 22, 2013 at 05:06:01AM +0100, Al Viro wrote:
> On Mon, Jul 22, 2013 at 11:25:17AM +1000, Dave Chinner wrote:
> 
> > I'll just point out that it can make the whole thing worse, too.
> > For example, for ext3/4, the tmpfile being created has to be added
> > to the orphan inode list which is protected by a filesystem global
> > mutex. Hence scalability of O_TMPFILE is massively limited on
> > ext3/ext4 due to architectural issues within ext3/4. Other
> > filesystems will be more efficient, but because they have more
> > scalable/complex orphan inode handling it's going to take longer to
> > implement O_TMPFILE support for them
> 
> Um...  You do realize that the same architectural issues there will
> create exactly the same serialization when you are unlinking the
> sucker?  I.e. with the "pick the name, create and open, unlink" sequence
> ext[34] will insert that inode into the same orphan list, creating
> the same contention...

Yup.

But that is assuming that the unlink of the tmpfile happens
immediately after the open(), and that's not necessarily the case for
all users of tmp files that might get converted to use O_TMPFILE,
so saying it is more efficient than traditional tmpfiles is
not necessarily correct.

Let's set expectations appropriately at the start, rather than have
people complain a year down the track that O_TMPFILE causes them
performance problems because they don't understand the limitations
of the implementations underlying it..
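For reference, the two sequences being compared in this thread look roughly
like this from user space. This is a sketch under assumptions: the helper
names open_tmp_classic() and open_tmp_otmpfile() and the /tmp template are
invented for illustration, and it says nothing about which variant is
faster on a given filesystem.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for O_TMPFILE in <fcntl.h> on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Traditional sequence: pick a name, create and open, then unlink.
 * Returns an open fd, or -1 with errno set.
 */
static int open_tmp_classic(void)
{
	char path[] = "/tmp/sketch-XXXXXX";
	int fd = mkstemp(path);		/* picks the name, creates and opens */

	if (fd < 0)
		return -1;

	/*
	 * Unlinking right away is only one possible pattern; a real user
	 * may keep the name around for much longer.
	 */
	if (unlink(path) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return -1;
	}

	return fd;
}

/*
 * O_TMPFILE: the inode is created without ever having a name in 'dir'.
 * Returns -1 with errno set where the kernel, libc or filesystem does
 * not support it.
 */
static int open_tmp_otmpfile(const char *dir)
{
#ifdef O_TMPFILE
	return open(dir, O_TMPFILE | O_RDWR, 0600);
#else
	(void)dir;
	errno = EOPNOTSUPP;
	return -1;
#endif
}
```

Callers would typically try open_tmp_otmpfile() first and fall back to
open_tmp_classic() on failure, since support depends on the underlying
filesystem.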

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

Re: [PATCH net-next] tuntap: hardware vlan tx support

2013-07-23 Thread Jason Wang

On 07/23/2013 08:12 PM, Sergei Shtylyov wrote:
> Hello.
>
> On 23-07-2013 11:15, Jason Wang wrote:
>
>> Inspired by commit f09e2249c4f5c7c13261ec73f5a7807076af0c8e (macvtap:
>> restore
>> vlan header on user read). This patch adds hardware vlan tx support for
>> tuntap. This is done by copying vlan header directly into userspace in
>> tun_put_user() instead of doing it through __vlan_put_tag() in
>> dev_hard_start_xmit(). This eliminates one unnecessary memove in
>
>s/memove/memmove/?
>

Yes.
>> vlan_insert_tag() for 802.1ad and 802.1q traffic.
>
>> pktgen test shows about 20% improvement for 802.1q traffic:
>
>> Before:
>>662149pps 317Mb/sec (317831520bps) errors: 0
>> After:
>>801033pps 384Mb/sec (384495840bps) errors: 0
>
>> Cc: Basil Gor 
>> Cc: Michael S. Tsirkin 
>> Signed-off-by: Jason Wang 
>> ---
>>   drivers/net/tun.c |   39 +++
>>   1 files changed, 35 insertions(+), 4 deletions(-)
>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index a72d141..66e265d 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
> [...]
>> @@ -1328,11 +1330,39 @@ static ssize_t tun_put_user(struct tun_struct
>> *tun,
>>   total += tun->vnet_hdr_sz;
>>   }
>>
>> -len = min_t(int, skb->len, len);
>> +if (!vlan_tx_tag_present(skb))
>> +len = min_t(int, skb->len, len);
>> +else {
>
>According to Documentation/CodingStyle chapter 3, both arms of an
> *if* statement should have {} if one arm has it.
>

Right.
>> +int copy, ret;
>> +struct {
>> +__be16 h_vlan_proto;
>> +__be16 h_vlan_TCI;
>> +} veth;
>
>Empty line wouldn't hurt here, after declarations...
>

Ok
>> +veth.h_vlan_proto = skb->vlan_proto;
>> +veth.h_vlan_TCI = htons(vlan_tx_tag_get(skb));
> [...]
>
> WBR, Sergei
>

Thanks, will correct them.
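As a reminder of the two style points raised above (braces on both arms, and
an empty line after declarations), here is a small stand-alone sketch. It is
not the tun.c code: clamp_copy_len() and its parameters are hypothetical,
with 'tagged' standing in for vlan_tx_tag_present(skb) and 'hdr_len' for the
extra VLAN header bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, only meant to show the brace and spacing style. */
static size_t clamp_copy_len(size_t pkt_len, size_t buf_len,
			     int tagged, size_t hdr_len)
{
	size_t len;

	if (!tagged) {
		/* One-line arm still gets braces because the else arm has them. */
		len = pkt_len < buf_len ? pkt_len : buf_len;
	} else {
		size_t total = pkt_len + hdr_len;

		/* Empty line above separates declarations from statements. */
		len = total < buf_len ? total : buf_len;
	}

	return len;
}
```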


Re: [PATCH 10/21] earlycpio.c: Fix the confusing comment of find_cpio_data().

2013-07-23 Thread Tang Chen


On 07/24/2013 04:02 AM, Tejun Heo wrote:

> On Fri, Jul 19, 2013 at 03:59:23PM +0800, Tang Chen wrote:
>> - * @offset: When a matching file is found, this is the offset to the
>> - *  beginning of the cpio. It can be used to iterate through
>> - *  the cpio to find all files inside of a directory path
>> + * @offset: When a matching file is found, this is the offset from the
>> + *  beginning of the cpio to the beginning of the next file, not the
>> + *  matching file itself. It can be used to iterate through the cpio
>> + *  to find all files inside of a directory path
>
> Nicely spotted.  I think we can go further and rename it to @nextoff.


OK, followed.

Thanks.
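To show why "offset of the next file" is the useful meaning for iteration,
here is a toy model. It is a sketch only: toy_entry, toy_find() and
toy_count() are invented names, and the toy uses array indexes where the
real find_cpio_data() works with byte offsets into the archive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy archive member; a real cpio has headers, padding and file data. */
struct toy_entry {
	const char *name;
	size_t size;
};

/*
 * Scan entries[start..n) for a name beginning with 'prefix'. On a match,
 * store the position of the entry AFTER the match in *nextoff and return
 * the match; return NULL when nothing matches.
 */
static const struct toy_entry *toy_find(const struct toy_entry *entries,
					size_t n, size_t start,
					const char *prefix, size_t *nextoff)
{
	size_t plen = strlen(prefix);
	size_t i;

	assert(nextoff != NULL);

	for (i = start; i < n; i++) {
		if (strncmp(entries[i].name, prefix, plen) == 0) {
			*nextoff = i + 1;
			return &entries[i];
		}
	}

	return NULL;
}

/* Visit every entry under 'prefix' by resuming from nextoff each time. */
static size_t toy_count(const struct toy_entry *entries, size_t n,
			const char *prefix)
{
	size_t count = 0, start = 0, nextoff = 0;

	while (toy_find(entries, n, start, prefix, &nextoff)) {
		count++;
		start = nextoff;
	}

	return count;
}
```

Because the returned position already points past the match, the caller can
feed it straight back in as the next starting point without finding the same
file again.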

Re: [PATCH 09/21] x86: Make get_ramdisk_{image|size}() global.

2013-07-23 Thread Tang Chen


On 07/24/2013 03:56 AM, Tejun Heo wrote:

> On Fri, Jul 19, 2013 at 03:59:22PM +0800, Tang Chen wrote:
>> In the following patches, we need to call get_ramdisk_{image|size}()
>> to get initrd file's address and size. So make these two functions
>> global.
>>
>> Signed-off-by: Tang Chen
>> ---
>>   arch/x86/include/asm/setup.h |3 +++
>>   arch/x86/kernel/setup.c  |4 ++--
>>   2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
>> index b7bf350..69de7a1 100644
>> --- a/arch/x86/include/asm/setup.h
>> +++ b/arch/x86/include/asm/setup.h
>> @@ -106,6 +106,9 @@ void *extend_brk(size_t size, size_t align);
>> RESERVE_BRK(name, sizeof(type) * entries)
>>
>>   extern void probe_roms(void);
>> +u64 get_ramdisk_image(void);
>> +u64 get_ramdisk_size(void);
>
> Might as well make these accessors inline functions.


Sure, will make them as static inline functions in 
arch/x86/include/asm/setup.h.


Thanks.
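A user-space sketch of what such static inline accessors could look like is
below. It rests on assumptions: mock_boot_params is a cut-down stand-in for
the real boot parameters, and the sketch assumes the address and size are
each split into a low and a high 32-bit field, which is how I understand the
existing helpers in setup.c to work. The mock_* names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down mock; the real structure has many more fields. */
struct mock_boot_params {
	uint32_t ramdisk_image;		/* low 32 bits of the address */
	uint32_t ext_ramdisk_image;	/* high 32 bits of the address */
	uint32_t ramdisk_size;		/* low 32 bits of the size */
	uint32_t ext_ramdisk_size;	/* high 32 bits of the size */
};

static inline uint64_t mock_get_ramdisk_image(const struct mock_boot_params *bp)
{
	assert(bp != 0);

	return (uint64_t)bp->ramdisk_image |
	       ((uint64_t)bp->ext_ramdisk_image << 32);
}

static inline uint64_t mock_get_ramdisk_size(const struct mock_boot_params *bp)
{
	assert(bp != 0);

	return (uint64_t)bp->ramdisk_size |
	       ((uint64_t)bp->ext_ramdisk_size << 32);
}
```

Placed in a header, accessors like these need no extern declarations and no
exported symbols, which is the point of the suggestion.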

[PATCH 3/3] msix restore code optimization for dom0

2013-07-23 Thread Zhenzhong Duan

PHYSDEVOP_restore_msi is used to restore all msix entries in one hypercall
in dom0. But it is called multiple times in the current code.

This patch splits arch_restore_msi_irqs into two functions.
Use arch_restore_msi_irq to deal with one entry and avoid calling the
hypercall multiple times in __pci_restore_msix_state.

Signed-off-by: Zhenzhong Duan 
---
 arch/x86/include/asm/pci.h  |8 
 arch/x86/include/asm/x86_init.h |2 +-
 arch/x86/pci/xen.c  |2 +-
 drivers/pci/msi.c   |   17 -
 4 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index d9e9e6c..40cbea4 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -115,9 +115,9 @@ static inline void x86_teardown_msi_irq(unsigned int irq)
 {
x86_msi.teardown_msi_irq(irq);
 }
-static inline void x86_restore_msi_irqs(struct pci_dev *dev, int irq)
+static inline void x86_restore_msi_irqs(struct pci_dev *dev)
 {
-   x86_msi.restore_msi_irqs(dev, irq);
+   x86_msi.restore_msi_irqs(dev);
 }
 #define arch_setup_msi_irqs x86_setup_msi_irqs
 #define arch_teardown_msi_irqs x86_teardown_msi_irqs
@@ -127,14 +127,14 @@ static inline void x86_restore_msi_irqs(struct pci_dev 
*dev, int irq)
 struct msi_desc;
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 void native_teardown_msi_irq(unsigned int irq);
-void native_restore_msi_irqs(struct pci_dev *dev, int irq);
+void native_restore_msi_irqs(struct pci_dev *dev);
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
  unsigned int irq_base, unsigned int irq_offset);
 /* default to the implementation in drivers/lib/msi.c */
 #define HAVE_DEFAULT_MSI_TEARDOWN_IRQS
 #define HAVE_DEFAULT_MSI_RESTORE_IRQS
 void default_teardown_msi_irqs(struct pci_dev *dev);
-void default_restore_msi_irqs(struct pci_dev *dev, int irq);
+void default_restore_msi_irqs(struct pci_dev *dev);
 #else
 #define native_setup_msi_irqs  NULL
 #define native_teardown_msi_irq NULL
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 828a156..f58a9c7 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -180,7 +180,7 @@ struct x86_msi_ops {
   u8 hpet_id);
void (*teardown_msi_irq)(unsigned int irq);
void (*teardown_msi_irqs)(struct pci_dev *dev);
-   void (*restore_msi_irqs)(struct pci_dev *dev, int irq);
+   void (*restore_msi_irqs)(struct pci_dev *dev);
int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
 };
 
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 48e8461..cdd869f 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -337,7 +337,7 @@ out:
return ret;
 }
 
-static void xen_initdom_restore_msi_irqs(struct pci_dev *dev, int irq)
+static void xen_initdom_restore_msi_irqs(struct pci_dev *dev)
 {
int ret = 0;
 
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 922fb49..d4ccfeb 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -214,7 +214,7 @@ void unmask_msi_irq(struct irq_data *data)
 #endif /* CONFIG_GENERIC_HARDIRQS */
 
 #ifdef HAVE_DEFAULT_MSI_RESTORE_IRQS
-void default_restore_msi_irqs(struct pci_dev *dev, int irq)
+static void default_restore_msi_irq(struct pci_dev *dev, int irq)
 {
int pos;
u16 control;
@@ -244,6 +244,15 @@ void default_restore_msi_irqs(struct pci_dev *dev, int irq)
}
}
 }
+
+void default_restore_msi_irqs(struct pci_dev *dev)
+{
+   struct msi_desc *entry;
+
+   list_for_each_entry(entry, &dev->msi_list, list) {
+   default_restore_msi_irq(dev, entry->irq);
+   }
+}
 #endif
 
 void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
@@ -416,7 +425,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 
pci_intx_for_msi(dev, 0);
msi_set_enable(dev, 0);
-   arch_restore_msi_irqs(dev, dev->irq);
+   arch_restore_msi_irqs(dev);
 
pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
control &= ~PCI_MSI_FLAGS_QSIZE;
@@ -440,9 +449,7 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
control |= PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL;
pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, control);
 
-   list_for_each_entry(entry, &dev->msi_list, list) {
-   arch_restore_msi_irqs(dev, entry->irq);
-   }
+   arch_restore_msi_irqs(dev);
 
control &= ~PCI_MSIX_FLAGS_MASKALL;
pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, control);
-- 
1.7.3


[PATCH 1/3] Refactor msi/msix restore code Part1

2013-07-23 Thread Zhenzhong Duan

Move default_restore_msi_irqs down so that it can reference the static
functions msi_mask_irq and msix_mask_irq.

Tested-by: Sucheta Chakraborty 
Signed-off-by: Zhenzhong Duan 
---
 drivers/pci/msi.c |   40 
 1 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index aca7578..87223ae 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -96,26 +96,6 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
 # define HAVE_DEFAULT_MSI_RESTORE_IRQS
 #endif
 
-#ifdef HAVE_DEFAULT_MSI_RESTORE_IRQS
-void default_restore_msi_irqs(struct pci_dev *dev, int irq)
-{
-   struct msi_desc *entry;
-
-   entry = NULL;
-   if (dev->msix_enabled) {
-   list_for_each_entry(entry, &dev->msi_list, list) {
-   if (irq == entry->irq)
-   break;
-   }
-   } else if (dev->msi_enabled)  {
-   entry = irq_get_msi_desc(irq);
-   }
-
-   if (entry)
-   write_msi_msg(irq, &entry->msg);
-}
-#endif
-
 static void msi_set_enable(struct pci_dev *dev, int enable)
 {
u16 control;
@@ -233,6 +213,26 @@ void unmask_msi_irq(struct irq_data *data)
 
 #endif /* CONFIG_GENERIC_HARDIRQS */
 
+#ifdef HAVE_DEFAULT_MSI_RESTORE_IRQS
+void default_restore_msi_irqs(struct pci_dev *dev, int irq)
+{
+   struct msi_desc *entry;
+
+   entry = NULL;
+   if (dev->msix_enabled) {
+   list_for_each_entry(entry, &dev->msi_list, list) {
+   if (irq == entry->irq)
+   break;
+   }
+   } else if (dev->msi_enabled)  {
+   entry = irq_get_msi_desc(irq);
+   }
+
+   if (entry)
+   write_msi_msg(irq, &entry->msg);
+}
+#endif
+
 void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 {
BUG_ON(entry->dev->current_state != PCI_D0);
-- 
1.7.3


[PATCH 2/3] Refactor msi/msix restore code Part2

2013-07-23 Thread Zhenzhong Duan

xen_initdom_restore_msi_irqs triggers a hypercall to restore addr/data/mask
in dom0. It's better to do the same for default_restore_msi_irqs on bare metal.

Move the restore of the mask into default_restore_msi_irqs; this avoids the
mask being restored twice in dom0, once in the hypercall and once in the kernel.

Without that, the qlcnic driver calling pci_reset_function will lose
interrupts in dom0.

Tested-by: Sucheta Chakraborty 
Signed-off-by: Zhenzhong Duan 
---
 drivers/pci/msi.c |   17 ++---
 1 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 87223ae..922fb49 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -216,6 +216,8 @@ void unmask_msi_irq(struct irq_data *data)
 #ifdef HAVE_DEFAULT_MSI_RESTORE_IRQS
 void default_restore_msi_irqs(struct pci_dev *dev, int irq)
 {
+   int pos;
+   u16 control;
struct msi_desc *entry;
 
entry = NULL;
@@ -228,8 +230,19 @@ void default_restore_msi_irqs(struct pci_dev *dev, int irq)
entry = irq_get_msi_desc(irq);
}
 
-   if (entry)
+   if (entry) {
write_msi_msg(irq, &entry->msg);
+   if (dev->msix_enabled) {
+   msix_mask_irq(entry, entry->masked);
+   readl(entry->mask_base);
+   } else {
+   pos = entry->msi_attrib.pos;
+   pci_read_config_word(dev, pos + PCI_MSI_FLAGS,
+&control);
+   msi_mask_irq(entry, msi_capable_mask(control),
+entry->masked);
+   }
+   }
 }
 #endif
 
@@ -406,7 +419,6 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
arch_restore_msi_irqs(dev, dev->irq);
 
pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
-   msi_mask_irq(entry, msi_capable_mask(control), entry->masked);
control &= ~PCI_MSI_FLAGS_QSIZE;
control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
@@ -430,7 +442,6 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
 
list_for_each_entry(entry, &dev->msi_list, list) {
arch_restore_msi_irqs(dev, entry->irq);
-   msix_mask_irq(entry, entry->masked);
}
 
control &= ~PCI_MSIX_FLAGS_MASKALL;
-- 
1.7.3


Re: hugepage related lockdep trace.

2013-07-23 Thread Minchan Kim

On Tue, Jul 23, 2013 at 01:24:17AM -0600, Hush Bensen wrote:
> On 07/18/2013 06:13 PM, Minchan Kim wrote:
> >On Thu, Jul 18, 2013 at 11:12:24PM +0530, Aneesh Kumar K.V wrote:
> >>Minchan Kim  writes:
> >>
> >>>Ccing people get_maintainer says.
> >>>
> >>>On Wed, Jul 17, 2013 at 11:32:23AM -0400, Dave Jones wrote:
> [128095.470960] =
> [128095.471315] [ INFO: inconsistent lock state ]
> [128095.471660] 3.11.0-rc1+ #9 Not tainted
> [128095.472156] -
> [128095.472905] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> [128095.473650] kswapd0/49 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [128095.474373]  (>i_mmap_mutex){+.+.?.}, at: [] 
> page_referenced+0x87/0x5e3
> [128095.475128] {RECLAIM_FS-ON-W} state was registered at:
> [128095.475866]   [] mark_held_locks+0x81/0xe7
> [128095.476597]   [] lockdep_trace_alloc+0x5e/0xbc
> [128095.477322]   [] __alloc_pages_nodemask+0x8b/0x9b6
> [128095.478049]   [] __get_free_pages+0x20/0x31
> [128095.478769]   [] get_zeroed_page+0x12/0x14
> [128095.479477]   [] __pmd_alloc+0x1c/0x6b
> [128095.480138]   [] huge_pmd_share+0x265/0x283
> [128095.480138]   [] huge_pte_alloc+0x5d/0x71
> [128095.480138]   [] hugetlb_fault+0x7c/0x64a
> [128095.480138]   [] handle_mm_fault+0x255/0x299
> [128095.480138]   [] __do_page_fault+0x142/0x55c
> [128095.480138]   [] do_page_fault+0xd/0x16
> [128095.480138]   [] error_code+0x6c/0x74
> [128095.480138] irq event stamp: 3136917
> [128095.480138] hardirqs last  enabled at (3136917): [] 
> _raw_spin_unlock_irq+0x27/0x50
> [128095.480138] hardirqs last disabled at (3136916): [] 
> _raw_spin_lock_irq+0x15/0x78
> [128095.480138] softirqs last  enabled at (3136180): [] 
> __do_softirq+0x137/0x30f
> [128095.480138] softirqs last disabled at (3136175): [] 
> irq_exit+0xa8/0xaa
> [128095.480138]
> other info that might help us debug this:
> [128095.480138]  Possible unsafe locking scenario:
> 
> [128095.480138]CPU0
> [128095.480138]
> [128095.480138]   lock(>i_mmap_mutex);
> [128095.480138]   
> [128095.480138] lock(>i_mmap_mutex);
> [128095.480138]
>   *** DEADLOCK ***
> 
> [128095.480138] no locks held by kswapd0/49.
> [128095.480138]
> stack backtrace:
> [128095.480138] CPU: 1 PID: 49 Comm: kswapd0 Not tainted 3.11.0-rc1+ #9
> [128095.480138] Hardware name: Dell Inc. Precision 
> WorkStation 490/0DT031, BIOS A08 04/25/2008
> [128095.480138]  c1d32630  ee39fb18 c15b001e ee395780 ee39fb54 
> c15acdcb c1751845
> [128095.480138]  c1751bbf 0031     
> 0001 0001
> [128095.480138]  c1751bbf 0008 ee395c44 0100 ee39fb88 c10a6130 
> 0008 d8fb
> [128095.480138] Call Trace:
> [128095.480138]  [] dump_stack+0x4b/0x79
> [128095.480138]  [] print_usage_bug+0x1d9/0x1e3
> [128095.480138]  [] mark_lock+0x1e0/0x261
> [128095.480138]  [] ? check_usage_backwards+0x109/0x109
> [128095.480138]  [] __lock_acquire+0x623/0x17f2
> [128095.480138]  [] ? sched_clock_cpu+0xcd/0x130
> [128095.480138]  [] ? sched_clock_local+0x42/0x12e
> [128095.480138]  [] lock_acquire+0x7d/0x195
> [128095.480138]  [] ? page_referenced+0x87/0x5e3
> [128095.480138]  [] mutex_lock_nested+0x6c/0x3a7
> [128095.480138]  [] ? page_referenced+0x87/0x5e3
> [128095.480138]  [] ? page_referenced+0x87/0x5e3
> [128095.480138]  [] ? 
> mem_cgroup_charge_statistics.isra.24+0x61/0x9e
> [128095.480138]  [] page_referenced+0x87/0x5e3
> [128095.480138]  [] ? raid0_congested+0x26/0x8a [raid0]
> [128095.480138]  [] shrink_page_list+0x3d9/0x947
> [128095.480138]  [] ? trace_hardirqs_on+0xb/0xd
> [128095.480138]  [] shrink_inactive_list+0x155/0x4cb
> [128095.480138]  [] shrink_lruvec+0x300/0x5ce
> [128095.480138]  [] shrink_zone+0x53/0x14e
> [128095.480138]  [] kswapd+0x517/0xa75
> [128095.480138]  [] ? mem_cgroup_shrink_node_zone+0x280/0x280
> [128095.480138]  [] kthread+0xa8/0xaa
> [128095.480138]  [] ? trace_hardirqs_on+0xb/0xd
> [128095.480138]  [] ret_from_kernel_thread+0x1b/0x28
> [128095.480138]  [] ? insert_kthread_work+0x63/0x63
> >>>IMHO, it's a false positive because i_mmap_mutex was held by kswapd
> >>>while one in the middle of fault path could be never on kswapd context.
> >>>
> >>>It seems lockdep for reclaim-over-fs isn't enough smart to identify
> >>>between background and direct reclaim.
> >>>
> >>>Wait for other's opinion.
> >>Is that reasoning correct ?. We may not deadlock because hugetlb pages
> >>cannot be reclaimed. So the fault path in hugetlb won't end up
> >>reclaiming pages from same inode. But the report is correct right ?
> >>
> >>
> >>Looking at the hugetlb code we have in huge_pmd_share

Re: [PATCH 03/21] x86, acpi, numa, mem-hotplug: Introduce MEMBLK_HOTPLUGGABLE to reserve hotpluggable memory.

2013-07-23 Thread Tang Chen


On 07/24/2013 03:19 AM, Tejun Heo wrote:

> On Fri, Jul 19, 2013 at 03:59:16PM +0800, Tang Chen wrote:
>>   /* Definition of memblock flags. */
>>   #define MEMBLK_FLAGS_DEFAULT  0x0 /* default flag */
>> +#define MEMBLK_HOTPLUGGABLE 0x1 /* hotpluggable region */
>
> Given that all existing APIs are using "memblock", wouldn't it be
> better to use "MEMBLOCK_" prefix?  If it's too long, we can just do
> MEMBLOCK_HOTPLUG.


OK, followed.

Thanks.


Re: PATCH - radeon_atombios_parse_power_table_4_5 - update of 23th july 2013

2013-07-23 Thread Alex Deucher

On Tue, Jul 23, 2013 at 4:08 PM, Michael Schuster
 wrote:
> Dear Mr. Deucher,
>
> I received a reply from Mr. Airlie with your email as developer of the
> radeon kernel driver.
> I don't know if he sent you both emails and the (patch) attachments; if
> not, please send me a short
> note, and I would like to send you some information about me and why I decided
> to try to develop a patch...

I only saw the message he cc'ed me on.  I haven't seen any of the
previous patches or emails.

> Attached is the current version of the patch; I changed some lines e.g. in
> atombios.c and radeon_pm.c (and
> corrected some off/on typos in evergreen.c) to start the asic with
> AUTO/BALANCED power mode using
> LOW clocks if some power modes are available.
> So far I have unsuccessfully tried to autosuspend my muxless card (my real initial
> intention in developing the patch)
> from the start, but did not find a successful way to do it...

You can use vgaswitcheroo to disable the dGPU on your system if you
aren't using it.

> Hope the patch will do some good (almost byte-wise) parsing the
> PowerPlayTable used for
> radeon_atombios_parse_power_table_4_5 (and probably assembling missing power
> modes) for proper
> use in 'static mode' power management.

It's not entirely clear to me what your patch is doing.  I think you
might be overcomplicating things.  The power tables are designed with
the DPM (Dynamic Power Management) hardware in mind, so they don't
match perfectly to the older static levels from older asics.  Now
that we have DPM support (as of 3.11), I'd like to eventually
deprecate the old static profiles on asics that support DPM, so I'd
prefer not to make any major changes to that code to avoid any
regressions.

> By the way, the 'rom' file size from (debian 7, ext4) is 128k, 'rom' is not
> readable after echo 1 > rom
> may be it depends of my Samsung 700Z3A having an intel integrated graphics
> card and a (muxless)
> HD6490M as a second graphics card.

In hybrid laptops the vbios for the discrete card is stored in ACPI,
you can't just dump it from the pci rom.  You'd need to dump it from
the driver itself or provide an interface in the driver to access it
via debugfs or sysfs.

> Corrections and hints for learning C / doing kernel module programming the
> right/code safe way
> are highly appreciated, thanks a lot in advance, best wishes and regards,
>

Best advice is to look at other kernel drivers and see what they are
doing and how the code is written for pointers on style.

Alex

> Michael Schuster
>
> post scriptum: the patch is 66k plain text, I attached it because of the
> size, hoping
> this will be okay for you ... - M.
>

Re: [PATCH 02/21] memblock, numa: Introduce flag into memblock.

2013-07-23 Thread Tang Chen


On 07/24/2013 03:09 AM, Tejun Heo wrote:

Hello,

On Fri, Jul 19, 2013 at 03:59:15PM +0800, Tang Chen wrote:

+#define MEMBLK_FLAGS_DEFAULT   0x0 /* default flag */


Please don't do this.  Just clearing the struct as zero is enough.


@@ -439,12 +449,14 @@ repeat:
  int __init_memblock memblock_add_node(phys_addr_t base, phys_addr_t size,
   int nid)
  {
-   return memblock_add_region(&memblock.memory, base, size, nid);
+   return memblock_add_region(&memblock.memory, base, size,
+  nid, MEMBLK_FLAGS_DEFAULT);


And just use zero for no flag.  Doing something like the above gets
weird with actual flags.  e.g. if you add a flag, say, MEMBLK_HOTPLUG,
should it be MEMBLK_FLAGS_DEFAULT | MEMBLK_HOTPLUG or just
MEMBLK_HOTPLUG?  If latter, the knowledge that DEFAULT is zero is
implicit, and, if so, why do it at all?


OK, will remove MEMBLK_FLAGS_DEFAULT, and use 0 by default.
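
The point about using a plain zero for "no flags" can be sketched in ordinary C. This is only an illustration under stated assumptions: MEMBLK_HOTPLUG is the example name from the discussion above, while MEMBLK_EXAMPLE, struct region and both helpers are hypothetical and are not part of the memblock code.

```c
/* Hypothetical flag bits, for illustration only. */
#define MEMBLK_HOTPLUG  0x1UL   /* example name used in the thread */
#define MEMBLK_EXAMPLE  0x2UL   /* made-up second flag */

struct region {
	unsigned long base;
	unsigned long size;
	unsigned long flags;    /* 0 simply means "no flags set" */
};

/* Returns nonzero if every bit in 'flag' is set on the region. */
static int region_has_flag(const struct region *r, unsigned long flag)
{
	return (r->flags & flag) == flag;
}

/* Plain regions pass 0; flagged regions pass only the bits they need. */
static struct region make_region(unsigned long base, unsigned long size,
				 unsigned long flags)
{
	struct region r = { base, size, flags };
	return r;
}
```

With this layout a caller writes make_region(base, size, 0) or make_region(base, size, MEMBLK_HOTPLUG), and the question of OR-ing in a "default" value never comes up.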




+static int __init_memblock memblock_reserve_region(phys_addr_t base,
+  phys_addr_t size,
+  int nid,
+  unsigned long flags)
  {
struct memblock_type *_rgn = &memblock.reserved;

-   memblock_dbg("memblock_reserve: [%#016llx-%#016llx] %pF\n",
+   memblock_dbg("memblock_reserve: [%#016llx-%#016llx] with flags %#016lx %pF\n",


Let's please drop "with" and do we really need to print full 16
digits?


Sure, will remove "with". But I think printing out the full flags is better.
The output looks tidier.
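
For reference, the difference between the two styles of output can be checked in user space. This is a small sketch that assumes the standard C library snprintf(); the kernel's own formatting code may differ in corner cases such as a zero value, and the helper names are made up for the example.

```c
#include <stdio.h>

/* Formats 'flags' with the zero-padded, fixed-width conversion from the patch. */
static int format_flags_padded(char *buf, size_t len, unsigned long flags)
{
	return snprintf(buf, len, "%#016lx", flags);
}

/* Formats 'flags' without zero padding. */
static int format_flags_short(char *buf, size_t len, unsigned long flags)
{
	return snprintf(buf, len, "%#lx", flags);
}
```

For a flags value of 0x1 the padded form is always 16 characters wide (the "0x" prefix plus 14 hex digits), while the short form is just "0x1". The padded form lines up in columns; the short form keeps log lines compact.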


Thanks.

Re: Recvfile patch used for Samba.

2013-07-23 Thread Dave Chinner

On Tue, Jul 23, 2013 at 02:58:58PM -0700, Jeremy Allison wrote:
> On Tue, Jul 23, 2013 at 05:10:27PM +1000, Dave Chinner wrote:
> > So, we are nesting up to 32 page locks here. That's bad. And we are
> > nesting kmap() calls for all the pages individually - is that even
> > safe to do?
> > 
> > So, what happens when we've got 16 pages in, and the filesystem has
> > allocated space for those 16 blocks, and we get ENOSPC on the 17th?
> > Sure, you undo the state here, but what about the 16 blocks that the
> > filesystem has allocated to this file? There's no notification to
> > the filesystem that they need to be truncated away because the write
> > failed
> > 
> > > +
> > > + /* IOV is ready, receive the date from socket now */
> > > + msg.msg_name = NULL;
> > > + msg.msg_namelen = 0;
> > > + msg.msg_iov = (struct iovec *)[0];
> > > + msg.msg_iovlen = cPagesAllocated ;
> > > + msg.msg_control = NULL;
> > > + msg.msg_controllen = 0;
> > > + msg.msg_flags = MSG_KERNSPACE;
> > > + rcvtimeo = sock->sk->sk_rcvtimeo;
> > > + sock->sk->sk_rcvtimeo = 8 * HZ;
> > 
> > We can hold the inode and the pages locked for 8 seconds?
> > 
> > I'll stop there. This is fundamentally broken. It's an attempt to do
> > a multi-page write operation without any of the supporting
> > structures needed to handle the failure cases properly.  The nested
> > page locking has "deadlock" written all over it, and the lack of
> > partial failure handling shouts "data corruption" and "stale data
> > exposure" to me. The fact it can block for up to 8 seconds waiting
> > for network shenanigans to be completed while holding lots of locks
> > is going to cause all sorts of problems under memory pressure.
> > 
> > Not to mention it means that all memory allocations in the msgrcv
> > path need to be done with GFP_NOFS, because GFP_KERNEL allocations
> > are almost guaranteed to deadlock on the locked pages this path
> > already holds
> > 
> > Need I say more?
> 
> No, that's great ! :-).
> 
> Thanks for the analysis. I'd heard it wasn't
> near production quality, but not being a kernel
> engineer myself I wasn't able to make that assessment.
> 
> Having said that, the OEMs that are using it do
> find it improves write speeds by a large amount (10%
> or more), so it's showing there is room for improvement
> here if the correct code can be created for recvfile.

10% is not a very large gain given the complexity it adds, and I
question whether the gain actually comes from moving the memcpy() into
the kernel.  If this recvfile code enabled zero-copy behaviour into
the page cache, then it would be worth pursuing. But it doesn't, and
so IMO the complexity is not worth the gain right now.

Indeed, I suspect the 10% gain will be from the multi-page write
behaviour that was hacked into the code. I wrote a multi-page
write prototype ~3 years ago that showed write(2) performance gains
of roughly 10% on low CPU power machines running XFS.

$ git branch |grep multi
  multipage-write
$ git checkout multipage-write 
Checking out files: 100% (45114/45114), done.
Switched to branch 'multipage-write'
$ head -4 Makefile 
VERSION = 2
PATCHLEVEL = 6
SUBLEVEL = 37
EXTRAVERSION = -rc6
$

I should probably pick this up again and push it forwards. FWIW,
I've attached the first multipage-write infrastructure patch from
the above branch to show how this sort of operation needs to be done
from a filesystem and page-cache perspective to avoid locking
problems and have sane error handling.

I believe the version that Christoph implemented for a couple of
OEMs around that time de-multiplexed the ->iomap method.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

multipage-write: introduce iomap infrastructure

From: Dave Chinner 

Add infrastructure for multipage writes by defining the mapping interface
that the multipage writes will use and the main multipage write loop.

Signed-off-by: Dave Chinner 

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 76041b6..1196877 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -513,6 +513,7 @@ enum positive_aop_returns {
 struct page;
 struct address_space;
 struct writeback_control;
+struct iomap;
 
 struct iov_iter {
const struct iovec *iov;
@@ -604,6 +605,9 @@ struct address_space_operations {
int (*is_partially_uptodate) (struct page *, read_descriptor_t *,
unsigned long);
int (*error_remove_page)(struct address_space *, struct page *);
+
+   int (*iomap)(struct address_space *mapping, loff_t pos,
+   ssize_t length, struct iomap *iomap, int cmd);
 };
 
 /*
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
new file mode 100644
index 000..7708614
--- /dev/null
+++ b/include/linux/iomap.h
@@ -0,0 +1,45 @@
+#ifndef _IOMAP_H
+#define _IOMAP_H
+
+/* ->iomap a_op command types */
+#define IOMAP_READ 0x01/* read the current mapping starting at the
+  given position,

Re: hugepage related lockdep trace.

2013-07-23 Thread Minchan Kim

On Tue, Jul 23, 2013 at 04:01:20PM +0200, Michal Hocko wrote:
> On Fri 19-07-13 09:13:03, Minchan Kim wrote:
> > On Thu, Jul 18, 2013 at 11:12:24PM +0530, Aneesh Kumar K.V wrote:
> [...]
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 83aff0a..2cb1be3 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -3266,8 +3266,8 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
> > >   put_page(virt_to_page(spte));
> > >   spin_unlock(&mm->page_table_lock);
> > >  out:
> > > - pte = (pte_t *)pmd_alloc(mm, pud, addr);
> > >   mutex_unlock(&mapping->i_mmap_mutex);
> > > + pte = (pte_t *)pmd_alloc(mm, pud, addr);
> > >   return pte;
> > 
> > I am blind on hugetlb but not sure it doesn't break eb48c071.
> > Michal?
> 
> Well, it is some time since I debugged the race and all the details
> vanished in the meantime. But this part of the changelog suggests that
> this indeed breaks the fix:
> "
> This patch addresses the issue by moving pmd_alloc into huge_pmd_share
> which guarantees that the shared pud is populated in the same critical
> section as pmd.  This also means that huge_pte_offset test in
> huge_pmd_share is serialized correctly now which in turn means that the
> success of the sharing will be higher as the racing tasks see the pud
> and pmd populated together.
> "
> 
> Besides that I fail to see how moving pmd_alloc down changes anything.
> Even if pmd_alloc triggered reclaim then we cannot trip over the same
> i_mmap_mutex as hugetlb pages are not reclaimable because they are not
> on the LRU.

I thought we could map some part of a binary with normal pages and another
part with MAP_HUGETLB, so that an address space could have both normal
pages and HugeTLB pages. Okay, that's impossible, so HugeTLB pages are not on
the LRU. Then the above lockdep warning is a total false positive.
The best solution is to avoid calling pmd_alloc while holding i_mmap_mutex,
but we need that to fix eb48c071, so how about this if we can't come up with
a better idea?


diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 83aff0a..e7c3a15 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3240,7 +3240,15 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
if (!vma_shareable(vma, addr))
return (pte_t *)pmd_alloc(mm, pud, addr);
 
-   mutex_lock(&mapping->i_mmap_mutex);
+   /*
+* It annotates to shut lockdep's warning up casued by i_mmap_mutex
+* Below pmd_alloc try to allocate memory with GFP_KERNEL while
+* holding i_mmap_mutex so that it could enter direct reclaim path
+* that rmap try to hold i_mmap_mutex again. But it's no problem
+* for hugetlb because pages on hugetlb never could live in LRU so
+* it's false positive. I hope someone fixes it with avoiding pmd_alloc
+* with holding i_mmap_mutex rather than nesting annotation.
+*/
+   mutex_lock_nested(&mapping->i_mmap_mutex, SINGLE_DEPTH_NESTING);
vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
if (svma == vma)
continue;

> -- 
> Michal Hocko
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org

-- 
Kind regards,
Minchan Kim

Re: [PATCH 1/1] TX throttling bug-fixing patch of AX88179_178A

2013-07-23 Thread Grant Grundler

On Tue, Jul 23, 2013 at 7:29 PM, Grant Grundler  wrote:
> On Tue, Jul 23, 2013 at 4:46 PM, David Miller  wrote:
> ...
>> A quick scan shows that smsc75xx, smsc95xx, and ax88179_178a all have
>> this problem.
>>
>> Instead of the patch starting this thread, I'd like to see one that
>> hits all three drivers and removes all SG and TSO features bits from
>> both the ->features _and_ ->hw_features settings.
>
> Since you are asking to remove TSO, do you also want skb_linearize()
> calls in ax88179_178a.c and smsc75xx.c removed as well?

Nevermind...Eric already removed skb_linearize calls in his patch.

cheers,
grant

>
> Not part of the original patch - but based on this thread, Eric seems
> to think calling skb_linearize isn't necessary or helpful either.
>
> cheers,
> grant

Re: [PATCH 1/1] TX throttling bug-fixing patch of AX88179_178A

2013-07-23 Thread Grant Grundler

On Tue, Jul 23, 2013 at 4:46 PM, David Miller  wrote:
...
> A quick scan shows that smsc75xx, smsc95xx, and ax88179_178a all have
> this problem.
>
> Instead of the patch starting this thread, I'd like to see one that
> hits all three drivers and removes all SG and TSO features bits from
> both the ->features _and_ ->hw_features settings.

Since you are asking to remove TSO, do you also want skb_linearize()
calls in ax88179_178a.c and smsc75xx.c removed as well?

Not part of the original patch - but based on this thread, Eric seems
to think calling skb_linearize isn't necessary or helpful either.

cheers,
grant

[PATCH] fs/bio-integrity: fix a potential mem leak

2013-07-23 Thread Gu Zheng

Free the bio_integrity_pool in the fail path of biovec_create_pool
in function bioset_integrity_create().

Signed-off-by: Gu Zheng 
---
 fs/bio-integrity.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
index 8fb4291..6025084 100644
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
@@ -716,13 +716,14 @@ int bioset_integrity_create(struct bio_set *bs, int pool_size)
return 0;
 
bs->bio_integrity_pool = mempool_create_slab_pool(pool_size, bip_slab);
-
-   bs->bvec_integrity_pool = biovec_create_pool(bs, pool_size);
-   if (!bs->bvec_integrity_pool)
+   if (!bs->bio_integrity_pool)
return -1;
 
-   if (!bs->bio_integrity_pool)
+   bs->bvec_integrity_pool = biovec_create_pool(bs, pool_size);
+   if (!bs->bvec_integrity_pool) {
+   mempool_destroy(bs->bio_integrity_pool);
return -1;
+   }
 
return 0;
 }
-- 
1.7.7
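
The ordering that the patch above establishes (check the first allocation, then release it if the second one fails) can be sketched in user-space C. This is a minimal illustration, not the kernel code: struct pools, pool_create() and the fail_second switch are hypothetical, and malloc()/free() stand in for the mempool calls.

```c
#include <stdlib.h>

struct pools {
	void *bio_pool;    /* stands in for bio_integrity_pool */
	void *bvec_pool;   /* stands in for bvec_integrity_pool */
};

/* Hypothetical allocator: returns NULL when 'fail' is nonzero. */
static void *pool_create(size_t size, int fail)
{
	return fail ? NULL : malloc(size);
}

/*
 * Create both pools or neither. Returns 0 on success, -1 on failure.
 * On failure no allocation is left behind.
 */
static int pools_create(struct pools *p, size_t size, int fail_second)
{
	p->bio_pool = pool_create(size, 0);
	if (!p->bio_pool)
		return -1;

	p->bvec_pool = pool_create(size, fail_second);
	if (!p->bvec_pool) {
		free(p->bio_pool);      /* undo the first step */
		p->bio_pool = NULL;     /* extra step in this sketch only */
		return -1;
	}
	return 0;
}

/* Release both pools and clear the pointers. */
static void pools_destroy(struct pools *p)
{
	free(p->bvec_pool);
	free(p->bio_pool);
	p->bvec_pool = NULL;
	p->bio_pool = NULL;
}
```

Clearing the pointer after the free is an addition in this sketch; it makes a later pools_destroy() on the failed structure harmless.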


[PATCH] perf tools: Add support for pinned modifier

2013-07-23 Thread Michael Ellerman

This commit adds support for a new modifier "P", which requests that the
event, or group of events, be pinned to the PMU.

This is an oft-requested feature from our HW folks, who want to be able
to run a large number of events, but also want 100% accurate results for
instructions per cycle.

Comparison of results with and without pinning:

$ perf stat -e '{cycles,instructions}:P' -e cycles,instructions,...

  79,590,480,683 cycles #  0.000 GHz
 166,123,716,524 instructions   #  2.09  insns per cycle
#  0.11  stalled cycles per insn
  79,352,134,463 cycles #  0.000 GHz [11.11%]
 165,178,301,818 instructions   #  2.08  insns per cycle
#  0.11  stalled cycles per insn [11.13%]

As you can see, although perf does a very good job of scaling the values
in the non-pinned case, there is some small discrepancy.

The patch is fairly straightforward; the one detail is that we need to
make sure we only request pinning for the group leader when we have a
group.

Signed-off-by: Michael Ellerman 
---

I would have used "p" obviously, but that's taken. Are folks happy that
"P" is sufficiently different from "p"? I couldn't think of anything
better.
---
 tools/perf/Documentation/perf-list.txt | 1 +
 tools/perf/util/parse-events.c | 9 +
 tools/perf/util/parse-events.l | 2 +-
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 826f3d6..7ecf655 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -29,6 +29,7 @@ counted. The following modifiers exist:
  G - guest counting (in KVM guests)
  H - host counting (not in KVM guests)
  p - precise level
+ P - pin the event to the PMU
 
 The 'p' modifier can be used for specifying how precise the instruction
 address should be. The 'p' modifier can be specified multiple times:
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 2c460ed..962093a 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -687,6 +687,7 @@ struct event_modifier {
int eG;
int precise;
int exclude_GH;
+   int pinned;
 };
 
 static int get_event_modifier(struct event_modifier *mod, char *str,
@@ -698,6 +699,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
int eH = evsel ? evsel->attr.exclude_host : 0;
int eG = evsel ? evsel->attr.exclude_guest : 0;
int precise = evsel ? evsel->attr.precise_ip : 0;
+   int pinned = evsel ? evsel->attr.pinned : 0;
 
int exclude = eu | ek | eh;
int exclude_GH = evsel ? evsel->exclude_GH : 0;
@@ -730,6 +732,8 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
/* use of precise requires exclude_guest */
if (!exclude_GH)
eG = 1;
+   } else if (*str == 'P') {
+   pinned = 1;
} else
break;
 
@@ -756,6 +760,8 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
mod->eG = eG;
mod->precise = precise;
mod->exclude_GH = exclude_GH;
+   mod->pinned = pinned;
+
return 0;
 }
 
@@ -806,6 +812,9 @@ int parse_events__modifier_event(struct list_head *list, char *str, bool add)
evsel->attr.exclude_host   = mod.eH;
evsel->attr.exclude_guest  = mod.eG;
evsel->exclude_GH  = mod.exclude_GH;
+
+   if (evsel->leader == evsel)
+   evsel->attr.pinned = mod.pinned;
}
 
return 0;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index e9d1134..587dac0 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -82,7 +82,7 @@ num_hex   0x[a-fA-F0-9]+
 num_raw_hex[a-fA-F0-9]+
 name   [a-zA-Z_*?][a-zA-Z0-9_*?]*
 name_minus [a-zA-Z_*?][a-zA-Z0-9\-_*?]*
-modifier_event [ukhpGH]+
+modifier_event [ukhpGHP]+
 modifier_bp[rwx]{1,3}
 
 %%
-- 
1.8.1.2
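
The leader-only rule described in the changelog can be shown with a much reduced stand-alone sketch. It assumes simplified, hypothetical types (struct modifier, struct event) and handles only the 'p' and 'P' characters; the real parser in tools/perf deals with more modifiers and with its own error handling.

```c
#include <stddef.h>

struct modifier {
	int precise;   /* number of 'p' characters seen */
	int pinned;    /* set by 'P' */
};

struct event {
	struct event *leader;  /* points to itself for a group leader */
	int pinned;
};

/* Parse a modifier string; returns 0 on success, -1 on an unknown character. */
static int parse_modifier(const char *str, struct modifier *mod)
{
	mod->precise = 0;
	mod->pinned = 0;
	for (; *str; str++) {
		if (*str == 'p')
			mod->precise++;
		else if (*str == 'P')
			mod->pinned = 1;
		else
			return -1;
	}
	return 0;
}

/* Apply the modifier; only a group leader takes the pinned request. */
static void apply_modifier(struct event *ev, const struct modifier *mod)
{
	if (ev->leader == ev)
		ev->pinned = mod->pinned;
}
```

Applying the same parsed modifier to every event of a group then leaves the pinned bit set on the leader only, which is the behaviour the changelog asks for.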


Re: spinlock lockup, rcu stalls etc.

2013-07-23 Thread Linus Torvalds

[ Added Thomas and Ingo due to that timer/nmi thing. ]

On Tue, Jul 23, 2013 at 9:28 AM, Dave Jones  wrote:
> Woke up to a box that I could log into, but would hang as soon as I tried to
> do any disk IO.  This is what hit the logs before that.
>
> [28853.503179] hrtimer: interrupt took 4847 ns

Ugh. There's some nasty timer congestion there..

> [28932.599607] BUG: spinlock lockup suspected on CPU#0, trinity-child2/6990
> [28932.600419]  lock: inode_sb_list_lock+0x0/0x80, .magic: dead4ead, .owner: trinity-child1/6763, .owner_cpu: 1

So the current owner of the lock is cpu 1. The other CPU's agree:

> [28932.60] BUG: spinlock lockup suspected on CPU#2, trinity-child2/6764
> [28932.606669]  lock: inode_sb_list_lock+0x0/0x80, .magic: dead4ead, .owner: trinity-child1/6763, .owner_cpu: 1
> [28932.617231] sending NMI to all CPUs:
> [28932.635092] BUG: spinlock lockup suspected on CPU#3, trinity-child3/6975
> [28932.635095]  lock: inode_sb_list_lock+0x0/0x80, .magic: dead4ead, .owner: trinity-child1/6763, .owner_cpu: 1

.. and their backtrace all points to them trying to take that
spinlock. So that is all consistent.

And here's cpu1, edited down a bit:

> [28932.777623] NMI backtrace for cpu 1
> [28932.777625] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 91.230 msecs

Whee. 91 msec? We have something going on there too. irq entry locks?

> [28932.779440] CPU: 1 PID: 6763 Comm: trinity-child1 Not tainted 3.11.0-rc2+ #54
> [28932.782283] RIP: 0010:[]  [] 
> add_preempt_count+0x25/0xf0
> [28932.797761] Call Trace:
> [28932.798737]  
> [28932.799715]  [] delay_tsc+0x61/0xe0
> [28932.800693]  [] __const_udelay+0x29/0x30
> [28932.801674]  [] __rcu_read_unlock+0x54/0xa0
> [28932.802657]  [] cpuacct_account_field+0xf1/0x200
> [28932.804613]  [] account_system_time+0xb0/0x1b0
> [28932.805561]  [] __vtime_account_system+0x35/0x40
> [28932.806506]  [] vtime_account_system.part.2+0x2d/0x50
> [28932.807445]  [] vtime_account_irq_enter+0x55/0x80
> [28932.808365]  [] irq_enter+0x4f/0x90
> [28932.809269]  [] smp_apic_timer_interrupt+0x35/0x60
> [28932.810156]  [] apic_timer_interrupt+0x6f/0x80
> [28932.812756]  [] irq_exit+0xcd/0xe0
> [28932.813618]  [] smp_apic_timer_interrupt+0x45/0x60
> [28932.814483]  [] apic_timer_interrupt+0x6f/0x80
> [28932.815345]  
> [28932.816207]  [] ? retint_restore_args+0xe/0xe
> [28932.817069]  [] ? lock_acquired+0x105/0x3f0
> [28932.817924]  [] ? sync_inodes_sb+0x1c2/0x2a0
> [28932.818767]  [] _raw_spin_lock+0x6c/0x80
> [28932.819616]  [] ? sync_inodes_sb+0x1c2/0x2a0
> [28932.820468]  [] sync_inodes_sb+0x1c2/0x2a0
> [28932.821310]  [] ? wait_for_completion+0xdf/0x110
> [28932.823819]  [] sync_inodes_one_sb+0x19/0x20
> [28932.824649]  [] iterate_supers+0xb2/0x110
> [28932.825477]  [] sys_sync+0x35/0x90
> [28932.826300]  [] tracesys+0xdd/0xe2
> [28932.827119]  [] ? 0x9fff

.. and again, it actually looks like the time is not necessarily spent
inside the spinlock in sync_inodes_sb(), but in a timer interrupt that
just happened to go off during that. I wonder if this is the same
issue that caused that earlier hrtimer long delay.. I'm not
necessarily seeing 91 msecs worth, but..

You seem to have CONFIG_PROVE_RCU_DELAY enabled, which explains that
delay_tsc() call in there. I wonder how much things like that make it
worse. Together with the (crazy bad) back-off in __spin_lock_debug(),
there might be a *lot* of these delays.

That said, the fact that your machine is dead after all this implies
that there is something else wrong than just things being very slow.
But I suspect at least *part* of your problems may be due to these
kinds of debugging options that make things much much worse.

  Linus

Re: ptrace(PTRACE_ATTACH) [no intervering wait] ptrace(PTRACE_DETACH) may leave tracee stuck

2013-07-23 Thread Mike Galbraith

On Tue, 2013-07-23 at 17:58 +0200, Oleg Nesterov wrote: 
> On 07/23, Mike Galbraith wrote:
> >
> > I received a report that glibc:elf/pldd hangs occasionally, and indeed..
> >
> >   for i in `seq 1 1000`; do taskset -c 3 pldd $$ > /dev/null 2>&1; done
> >
> > ..will do so.  Rummage.
> >
> > ptrace(PTRACE_DETACH) returns -ESRCH when the trap hasn't happened yet,
> > which happens because pldd doesn't wait() before ptrace(PTRACE_DETACH).
> >
> > pldd source:
> >
> [...snip...]
> >
> > Seems this usually works only because cycles expended between attach and
> > detach is usually enough to let trap happen so tracee can set its state
> > to TASK_TRACED as PTRACE_DETACH expects it to be.
> >
> > Is this expected behavior?
> 
> Yes. PTRACE_ATTACH + PTRACE_DETACH is not correct without wait() in
> between, this is expected.

Thanks for confirmation.  The man page was pretty clear (read it after
slogging through source/traces, oh well, educational;) that -ESRCH was
expected, but I wanted to be sure about tracee state thereafter.

-Mike
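
For completeness, the sequence confirmed above (attach, wait for the stop, then detach) might look like the following in C. This is a minimal sketch, assuming a Linux system where the caller is allowed to ptrace its own child; the helper names attach_wait_detach() and demo_on_child() are made up for the example and have nothing to do with the pldd source.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Attach to 'pid', wait for the attach stop, then detach.
 * Returns 0 on success, -1 on failure.
 */
static int attach_wait_detach(pid_t pid)
{
	int status;

	if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
		return -1;

	/* Without this wait the tracee may not have stopped yet. */
	if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
		return -1;

	if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1)
		return -1;
	return 0;
}

/* Fork a child that sleeps, run the sequence on it, then reap it. */
static int demo_on_child(void)
{
	pid_t child = fork();
	int rc;

	if (child == -1)
		return -1;
	if (child == 0) {
		pause();        /* sleep until the parent kills us */
		_exit(0);
	}
	rc = attach_wait_detach(child);
	kill(child, SIGKILL);
	waitpid(child, NULL, 0);
	return rc;
}
```

The waitpid() call between the two ptrace() requests is the step whose absence was reported above to make PTRACE_DETACH return -ESRCH.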


Re: [PATCH 0/30] ACPI / hotplug / PCI: Major rework + Thunderbolt workarounds

2013-07-23 Thread Yinghai Lu

On Tue, Jul 23, 2013 at 2:39 PM, Rafael J. Wysocki  wrote:
>
> Ugh, stupid bug, sorry about it.  We try to unregister something that may have
> not been registered.
>
> Can you please check if the appended patch helps (on top of
> linux-pm.git/linux-next)?
>
> Rafael
>
>
> ---
>  drivers/pci/hotplug/acpiphp_glue.c |4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
> ===
> --- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
> +++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
> @@ -340,6 +340,7 @@ static acpi_status register_slot(acpi_ha
>
> retval = acpiphp_register_hotplug_slot(slot, sun);
> if (retval) {
> +   slot->slot = NULL;
> bridge->nr_slots--;
> if (retval == -EBUSY)
>                         warn("Slot %llu already registered by another "
> @@ -429,7 +430,8 @@ static void cleanup_bridge(struct acpiph
>                                         err("failed to remove notify handler\n");
> }
> }
> -   acpiphp_unregister_hotplug_slot(slot);
> +   if (slot->slot)
> +   acpiphp_unregister_hotplug_slot(slot);
> }
>
> mutex_lock(_mutex);
>

yes, that fixes the problem. Thanks
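
The idea behind the fix (mark a slot as unregistered when registration fails, and skip the unregister call for such slots) can be reduced to a few lines of plain C. This is only a sketch under stated assumptions: struct slot, slot_register() and slot_cleanup() are hypothetical stand-ins, not the acpiphp functions.

```c
#include <stddef.h>

struct slot {
	void *hotplug_slot;    /* NULL when nothing is registered */
	int unregister_calls;  /* counts unregister calls, for illustration */
};

/* Hypothetical registration: fails when 'fail' is nonzero. */
static int slot_register(struct slot *s, void *handle, int fail)
{
	s->hotplug_slot = handle;
	if (fail) {
		s->hotplug_slot = NULL;  /* leave a clear "not registered" mark */
		return -1;
	}
	return 0;
}

/* Unregister only what was actually registered. */
static void slot_cleanup(struct slot *s)
{
	if (s->hotplug_slot) {
		s->unregister_calls++;
		s->hotplug_slot = NULL;
	}
}
```

Because the cleanup path checks the pointer, it can run over every slot without caring which registrations succeeded.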

10:~ # echo "PCI0 3" > /sys/kernel/debug/acpi/sci_notify
[  102.231645] ACPI: ACPI device name is , event code is <3>
[  102.233189] ACPI: Notify event is queued
[  102.234326] ACPI: \_SB_.PCI0: Device eject notify on _handle_hotplug_event_root
10:~ # [  102.357749] ACPI: Device :00:03.0 -x-> \_SB_.PCI0.S03_
[  102.359902] ACPI: Device :00:02.0 -x-> \_SB_.PCI0.VGA_
[  102.362188] ACPI: Device :00:01.3 -x-> \_SB_.PCI0.PX13
[  102.364752] ata1.00: disabled
[  102.372154] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  102.374523] sd 0:0:0:0: [sda]
[  102.375759] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  102.378173] sd 0:0:0:0: [sda] Stopping disk
[  102.380248] sd 0:0:0:0: [sda] START_STOP FAILED
[  102.381983] sd 0:0:0:0: [sda]
[  102.383167] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  102.387588] ata2.00: disabled
[  102.395254] ACPI: Device :00:01.0 -x-> \_SB_.PCI0.ISA_
[  102.396943] ACPI: Device pci:00 -x-> \_SB_.PCI0
[  102.398162]   acpi_pci_iommu_remove is called for \_SB_.PCI0 88007ab3f1e0
[  102.400253]   acpi_pci_ioapic_remove is called for \_SB_.PCI0 88007ab3f1e0
[  102.402176] pci :00:00.0: freeing pci_dev info
[  102.404247] pci :00:01.0: freeing pci_dev info
[  102.406611] pci :00:01.1: freeing pci_dev info
[  102.408401] pci :00:01.3: freeing pci_dev info
[  102.410276] pci :00:02.0: freeing pci_dev info
[  102.411378] pci :00:03.0: freeing pci_dev info
[  102.412485] pci_bus :00: busn_res: [bus 00-ff] is released
[  102.413945] acpiphp: Slot [3] unregistered
[  102.415189] pci_hotplug: pci_hp_deregister: Removed slot 3 from the list
[  102.418224] acpiphp: release_slot - physical_slot = 3
[  102.420439] pci_bus :00: dev 03, dec refcount to 1
[  102.422689] acpiphp: Slot [4] unregistered
[  102.424592] pci_hotplug: pci_hp_deregister: Removed slot 4 from the list
[  102.427484] acpiphp: release_slot - physical_slot = 4
[  102.429679] pci_bus :00: dev 04, dec refcount to 1
[  102.431492] acpiphp: Slot [5] unregistered
[  102.433169] pci_hotplug: pci_hp_deregister: Removed slot 5 from the list
[  102.435486] acpiphp: release_slot - physical_slot = 5
[  102.436963] pci_bus :00: dev 05, dec refcount to 1
[  102.438140] acpiphp: Slot [6] unregistered
[  102.439116] pci_hotplug: pci_hp_deregister: Removed slot 6 from the list
[  102.440922] acpiphp: release_slot - physical_slot = 6
[  102.442079] pci_bus :00: dev 06, dec refcount to 1
[  102.443280] acpiphp: Slot [7] unregistered
[  102.444286] pci_hotplug: pci_hp_deregister: Removed slot 7 from the list
[  102.445840] acpiphp: release_slot - physical_slot = 7
[  102.447024] pci_bus :00: dev 07, dec refcount to 1
[  102.448272] acpiphp: Slot [8] unregistered
[  102.449236] pci_hotplug: pci_hp_deregister: Removed slot 8 from the list
[  102.450770] acpiphp: release_slot - physical_slot = 8
[  102.451632] pci_bus :00: dev 08, dec refcount to 1
[  102.452870] acpiphp: Slot [9] unregistered
[  102.453848] pci_hotplug: pci_hp_deregister: Removed slot 9 from the list
[  102.455400] acpiphp: release_slot - physical_slot = 9
[  102.456594] pci_bus :00: dev 09, dec refcount to 1
[  102.457557] acpiphp: Slot [10] unregistered
[  102.458542] pci_hotplug: pci_hp_deregister: Removed slot 10 from the list
[  102.460124] acpiphp: release_slot - physical_slot = 10
[  102.461295] pci_bus :00: dev 0a, dec refcount to 1
[  102.462482] acpiphp: Slot [11] unregistered
[  102.463464] pci_hotplug: pci_hp_deregister: Removed slot 11 from the list
[

Re: [PATCH 1/1] ext4: Fix Opts: (null)

2013-07-23 Thread Eric Sandeen

On 7/22/13 5:24 PM, Jóhann B. Guðmundsson wrote:
> null null null no more Opts: (null) but something that actually makes sense to
> human beings...

It's not clear to me how this changes the (null) output...
Have you tested it?  What's the difference in output?

-Eric

> Signed-off-by: Jóhann B. Guðmundsson 
> ---
>  fs/ext4/super.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 85b3dd6..ef141b7 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4088,8 +4088,8 @@ no_journal:
>"the device does not support discard");
>   }
>  
> - ext4_msg(sb, KERN_INFO, "mounted filesystem with%s. "
> -  "Opts: %s%s%s", descr, sbi->s_es->s_mount_opts,
> + ext4_msg(sb, KERN_INFO, "mounted filesystem with%s "
> +  "%s%s%s mount option(s)", descr, sbi->s_es->s_mount_opts,
>*sbi->s_es->s_mount_opts ? "; " : "", orig_data);
>  
>   if (es->s_error_count)
> @@ -4866,7 +4866,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
>   }
>  #endif
>  
> - ext4_msg(sb, KERN_INFO, "re-mounted. Opts: %s", orig_data);
> + ext4_msg(sb, KERN_INFO, "re-mounted %s mount option(s)", orig_data);

>   kfree(orig_data);
>   return 0;
>  
> 
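
As a side note on the question above: rewording the text around a %s conversion does not change what is printed for the string argument itself. A user-space sketch of one way to avoid printing a missing option string is shown below; it assumes the option pointer can be NULL or empty, and format_remount_msg() and the "none" placeholder are hypothetical, not taken from ext4.

```c
#include <stdio.h>

/*
 * Builds a remount message. A missing or empty option string is shown
 * as "none" instead of being handed to the %s conversion as-is.
 */
static int format_remount_msg(char *buf, size_t len, const char *opts)
{
	const char *shown = (opts && *opts) ? opts : "none";

	return snprintf(buf, len, "re-mounted. Opts: %s", shown);
}
```

The guard sits on the argument, not in the format string, which is why it changes the output where a reworded message would not.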


Re: [PATCH] x86, apic: Enable x2APIC physical when cpu < 256 native

2013-07-23 Thread Youquan Song

On Tue, Jul 23, 2013 at 11:17:29AM +0200, Ingo Molnar wrote:
> 
> * Youquan Song  wrote:
> 
> > x2APIC extends APICID from 8 bits to 32 bits, but the device interrupt 
> > routed from IOAPIC or delivered in MSI mode will keep 8 bits destination 
> > APICID. In order to support x2APIC, the VT-d interrupt remapping is 
> > introduced to translate the destination APICID to 32 bits in x2APIC mode 
> > and keep the device compatible in this way.
> > 
> > x2APIC supports both logical and physical destination modes.  In 
> > logical destination mode, the 32-bit logical APICID has 2 sub-fields:
> > a 16-bit cluster ID and a 16-bit logical ID within the cluster, and 
> > VT-d interrupt remapping is required in x2APIC cluster mode. In physical 
> > destination mode, the 8-bit physical ID is compatible with the 32-bit 
> > physical ID when the CPU number is < 256. When interrupt remapping 
> > initialization fails on a platform with CPU number < 256, the current kernel 
> > only enables x2APIC physical mode in a virtualization environment, while 
> > we can also enable x2APIC physical mode in a native kernel in this situation; 
> > the device interrupt will use an 8-bit destination APICID in physical 
> > mode and be compatible with x2APIC physical mode when there are < 256 CPUs.
> >  
> > So we can benefit from x2APIC vs xAPIC MMIO:
> >  - x2APIC MSR read/write is faster than xAPIC MMIO.
> >  - x2APIC needs only an ICR write to deliver an interrupt, without polling
> >    the ICR delivery status bit, whereas xAPIC needs to poll the ICR
> >    delivery status bit.
> >  - x2APIC uses one 64-bit ICR access instead of xAPIC's two 32-bit accesses.
> 
> That looks interesting. How many systems are affected by this change in 
> practice? Have you tested it on affected hardware?

Thanks Ingo!
The affected machines are those where the CPU supports x2APIC, the CPU
number is < 256, and the chipset does not support VT-d2 or VT-d is disabled
in BIOS.

I have tested on one affected machine, and it works.
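
The ID layout mentioned in the changelog can be written down as a short C sketch. It assumes the usual x2APIC derivation of the logical ID from the x2APIC ID (bits 19:4 select the cluster, bits 3:0 select one bit within it). The helper names are hypothetical, and excluding 0xff in fits_8bit_dest() is an assumption of this sketch (that value acts as broadcast in xAPIC physical mode), not something taken from the patch.

```c
#include <stdint.h>

/* Logical x2APIC ID: cluster in bits 31:16, one-hot position in bits 15:0. */
static uint32_t x2apic_logical_id(uint32_t x2apic_id)
{
	uint32_t cluster = (x2apic_id >> 4) & 0xffffu;
	uint32_t bit = 1u << (x2apic_id & 0xfu);

	return (cluster << 16) | bit;
}

/* Extracts the 16-bit cluster ID from a logical ID. */
static uint32_t logical_cluster(uint32_t ldr)
{
	return ldr >> 16;
}

/* Extracts the 16-bit logical ID within the cluster. */
static uint32_t logical_mask(uint32_t ldr)
{
	return ldr & 0xffffu;
}

/* Nonzero when a physical APIC ID fits an 8-bit destination field. */
static int fits_8bit_dest(uint32_t apic_id)
{
	return apic_id < 0xffu;   /* 0xff is left out as the broadcast value */
}
```

For example, x2APIC ID 0x25 falls in cluster 2 with logical bit 5 set, and any ID that passes fits_8bit_dest() can be carried unchanged in the 8-bit destination field used when interrupt remapping is unavailable.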

Thanks
-Youquan

Re: [PATCH] kernel/irq/devres.c: move "kernel/irq/devres.c" to "drivers/base/devres_irq.c"

2013-07-23 Thread Chen Gang

On 07/23/2013 11:31 PM, Greg KH wrote:
> On Tue, Jul 23, 2013 at 03:36:04PM +0800, Chen Gang wrote:
>> "kernel/irq/devres.c" is a driver extension tool for irq (with devres)
>> which is independent on 'GENERIC_HARDIRQS', so it is not suitable to
>> still be in "kernel/irq/" which depends on 'GENERIC_HARDIRQS'.
>>
>> It is a basic tool for drivers, so can move it to "drivers/base/" to be
>> independent on 'GENERIC_HARDIRQS'.
>>
>> It is about irq features, so if can not find other more suitable place,
>> can still let their declaration in "include/linux/interrupts.h".
>>
>> The related error (with randconfig which disable 'GENERIC_HARDIRQS')
>>
>>   drivers/built-in.o: In function `dw_dma_probe':
>>   (.text+0x3747a): undefined reference to `devm_request_threaded_irq'
> 
> Don't fix problems when you are moving files around, that makes it
> _very_ hard to review.
> 

OK, thanks, I will keep that in mind next time (I really did not know
about it before).

Hmm... but in our case, moving the file also solves the related issue,
so it is only one action.

> Remember, one thing per patch please.
> 

OK, thanks, I will try (that may let me make more patches, which is not
bad for me ;-)).


>> Signed-off-by: Chen Gang 
>> ---
>>  drivers/base/Makefile     |    2 +-
>>  drivers/base/devres_irq.c |   94 +
>>  kernel/irq/Makefile       |    2 +-
>>  kernel/irq/devres.c       |   94 -
>>  4 files changed, 96 insertions(+), 96 deletions(-)
>>  create mode 100644 drivers/base/devres_irq.c
>>  delete mode 100644 kernel/irq/devres.c
> 
> Please use git when renaming files so that the move is shown in the git
> patch.  As it is, I would have to verify this by hand, and I don't want
> to ever have to do that.
> 

Oh, thanks, I need to use git to do that (I should get more familiar
with git).

> Also, I have no problem with the file being where it is.  This is for
> irqs, which are handled by the interrupt maintainers, no need to put it
> in the driver core, just because it happens to deal with "resources".
> We arrange things for ease of maintainability, not always logically :)
> 

Hmm... normally, 'maintainability' does not conflict with 'logically'; if
we feel they conflict, that means both of them need improvement.

For 'logically', would it be suitable to move "resources" from "drivers/base"
to "lib/", since they are no longer only driver wide but also kernel
wide? (In particular, some of "devm*" are already in "lib/".)

For 'maintainability', would it be suitable to make "kernel/irq" independent of
'GENERIC_HARDIRQS', or to "mv kernel/irq/devres.c kernel/devres_irq.c"?

:-)

Thanks.
-- 
Chen Gang

Re: [PATCH 2/3 v6] cpufreq: stats: Add 'load_table' debugfs file to show accumulated data of CPUs

2013-07-23 Thread Chanwoo Choi

Hi Viresh,

On 07/22/2013 08:05 PM, Viresh Kumar wrote:
> On 18 July 2013 16:47, Chanwoo Choi  wrote:
>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> 
>> +static int cpufreq_stats_reset_debugfs(struct cpufreq_policy *policy)
>> +{
>> +   struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
>> +   int size;
>> +
>> +   if (!stat)
>> +   return -EINVAL;
>> +
>> +   if (stat->load_table)
>> +   kfree(stat->load_table);
>> +   stat->load_last_index = 0;
>> +
>> +   size = sizeof(*stat->load_table) * stat->load_max_index;
>> +   stat->load_table = kzalloc(size, GFP_KERNEL);
>> +   if (!stat->load_table)
>> +   return -ENOMEM;
> 
> Why are you freeing memory and allocating it again ??

The purpose is to reset the data in stat->load_table.
If you don't agree with this, I'll initialize the stat->load_table array
to zero with a loop instead.
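
The in-place reset proposed here can be sketched in userspace C as follows. This is a minimal sketch, not the patch itself: struct load_entry, struct load_stats and load_stats_reset() are hypothetical stand-ins, and memset() is used in place of the explicit loop mentioned above, since it clears the same memory.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for one row of the load table. */
struct load_entry {
	unsigned long time;
	unsigned int old_freq;
	unsigned int new_freq;
};

/* Hypothetical stand-in for the relevant fields of struct cpufreq_stats. */
struct load_stats {
	struct load_entry *load_table;
	unsigned int load_max_index;
	unsigned int load_last_index;
};

/*
 * Reset the table in place. There is no free/allocate cycle, so the
 * reset cannot fail with an out-of-memory error once the table exists.
 * Returns 0 on success, -1 if there is nothing to reset.
 */
static int load_stats_reset(struct load_stats *stat)
{
	if (!stat || !stat->load_table)
		return -1;

	memset(stat->load_table, 0,
	       sizeof(*stat->load_table) * stat->load_max_index);
	stat->load_last_index = 0;
	return 0;
}
```

Compared with kfree() followed by kzalloc(), this keeps the same buffer, so a reader holding the lock never observes a missing table.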

> 
>> +   return 0;
>> +}
>> +
>> +static int cpufreq_stats_create_debugfs(struct cpufreq_policy *policy)
>> +{
>> +   struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
>> +   unsigned int idx, size;
>> +   int ret = 0;
>> +
>> +   if (!stat)
>> +   return -EINVAL;
>> +
>> +   if (!policy->cpu_debugfs)
>> +   return -EINVAL;
>> +
>> +   stat->load_last_index = 0;
>> +   stat->load_max_index = CONFIG_NR_CPU_LOAD_STORAGE;
>> +
>> +   /* Allocate memory for storage of CPUs load */
>> +   size = sizeof(*stat->load_table) * stat->load_max_index;
>> +   stat->load_table = kzalloc(size, GFP_KERNEL);
>> +   if (!stat->load_table)
>> +   return -ENOMEM;
>> +
>> +   /* Create debugfs directory and file for cpufreq */
>> +   idx = cpumask_weight(policy->cpus) > 1 ? policy->cpu : 0;
> 
> idx is broken again..

OK, I'll fix it.

> 
>> +   stat->debugfs_load_table = debugfs_create_file("load_table", S_IWUSR,
>> +   policy->cpu_debugfs[idx],
>> +   policy, _table_fops);
>> +   if (!stat->debugfs_load_table) {
>> +   ret = -ENOMEM;
>> +   goto err;
>> +   }
>> +
>> +   pr_debug("Creating debugfs file for CPU%d \n", policy->cpu);
> 
> s/Creating/Created

OK, I'll fix it.

> 
>> +
>> +   return 0;
>> +err:
>> +   kfree(stat->load_table);
>> +   return ret;
>> +}
>> +
>> +/* should be called late in the CPU removal sequence so that the stats
>> + * memory is still available in case someone tries to use it.
>> + */
> 
> Please write multiline comment correctly..

OK.

> 
>> +static void cpufreq_stats_free_load_table(unsigned int cpu)
>> +{
>> +   struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, cpu);
>> +
>> +   if (stat) {
>> +   pr_debug("Free memory of load_table\n");
>> +   kfree(stat->load_table);
>> +   }
>> +}
>> +
>> +/* must be called early in the CPU removal sequence (before
>> + * cpufreq_remove_dev) so that policy is still valid.
>> + */
>> +static void cpufreq_stats_free_debugfs(unsigned int cpu)
>> +{
>> +   struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, cpu);
>> +
>> +   if (stat) {
>> +   pr_debug("Remove load_table debugfs file\n");
>> +   debugfs_remove(stat->debugfs_load_table);
>> +   }
>> +}
>> +
>> +static void cpufreq_stats_store_load_table(struct cpufreq_freqs *freq,
>> +  unsigned long val)
>> +{
>> +   struct cpufreq_stats *stat;
>> +   int cpu, last_idx;
>> +
>> +   stat = per_cpu(cpufreq_stats_table, freq->cpu);
>> +   if (!stat)
>> +   return;
>> +
>> +   spin_lock(&cpufreq_stats_lock);
>> +
>> +   switch (val) {
>> +   case CPUFREQ_POSTCHANGE:
>> +   if (!stat->load_last_index)
>> +   last_idx = stat->load_max_index;
>> +   else
>> +   last_idx = stat->load_last_index - 1;
>> +
>> +   stat->load_table[last_idx].new = freq->new;
>> +   break;
>> +   case CPUFREQ_LOADCHECK:
>> +   last_idx = stat->load_last_index;
>> +
>> +   stat->load_table[last_idx].time = freq->time;
>> +   stat->load_table[last_idx].old = freq->old;
>> +   stat->load_table[last_idx].new = freq->old;
>> +   for_each_present_cpu(cpu)
>> +   stat->load_table[last_idx].load[cpu] = freq->load[cpu];
>> +
>> +   if (++stat->load_last_index == stat->load_max_index)
>> +   stat->load_last_index = 0;
>> +   break;
>> +   }
>> +
>> +   spin_unlock(&cpufreq_stats_lock);
>> +}
>> +
>>  static int freq_table_get_index(struct cpufreq_stats *stat, unsigned int freq)
>>  {
>> int index;
>> @@ -204,7 +386,7 @@ static int cpufreq_stats_create_table(struct 
>>
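
The circular indexing used by cpufreq_stats_store_load_table() in the quoted patch can be sketched in isolation. The helpers ring_prev() and ring_next() are hypothetical names, and the sketch assumes the table holds exactly `size` entries, so the slot before index 0 is size - 1. Note that the quoted code uses load_max_index itself in that case, which under this assumption would point one entry past the end of the table.

```c
#include <assert.h>

/*
 * Index of the most recently written slot, given the index of the next
 * slot to be written. Wraps from 0 back to the last valid slot.
 */
static unsigned int ring_prev(unsigned int next, unsigned int size)
{
	assert(size > 0 && next < size);
	return next ? next - 1 : size - 1;
}

/* Index of the slot after `cur`, wrapping to 0 at the end of the table. */
static unsigned int ring_next(unsigned int cur, unsigned int size)
{
	assert(size > 0 && cur < size);
	return (cur + 1 == size) ? 0 : cur + 1;
}
```

Both helpers always return a value in [0, size), which is the property the table accesses depend on.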

Re: [Ksummit-2013-discuss] [ATTEND] How to act on LKML

2013-07-23 Thread Steven Rostedt

On Tue, 2013-07-23 at 21:48 -0400, Paul Gortmaker wrote:
> C'mon folks.  This is beyond silly.  Let us look at the things that we
> can really change, or at least influence change within.  Things that
> really matter to linux today and tomorrow.

Ah, so there is middle ground between creationism and evolution!

-- Steve



Re: [PATCH 10/27] drivers/memory: don't check resource with devm_ioremap_resource

2013-07-23 Thread Joe Perches

Hi again Wolfram

The next time you submit a patch series
please use a [PATCH 0/N] cover-letter
with a description of all the patches
and cc all the various email lists
on that [PATCH 0/N].

This cover letter can be created via
git format-patch --cover-letter

That way, general replies to the series
can be to the 0/N cover letter and all
mailing lists can receive it.



Re: [Ksummit-2013-discuss] [ATTEND] How to act on LKML

2013-07-23 Thread Paul Gortmaker

On Tue, Jul 23, 2013 at 9:26 PM, James Bottomley
 wrote:
> On Tue, 2013-07-23 at 19:51 -0500, Felipe Contreras wrote:
>> On Sat, Jul 20, 2013 at 8:02 PM, Daniel Phillips
>>  wrote:
>> > On 07/20/2013 12:36 PM, Felipe Contreras wrote:
>> >> I think you need more than "hope" to change one of the fundamental
>> >> rules of LKML; be open and honest, even if that means expressing your
>> >> opinion in a way that others might consider offensive and colorful.
>> >
>> > Logical fallacy type: bifurcation. You can be open and honest without
>> > being offensive or abusive.
>>
>> You are mistaken, that is not what the false dichotomy fallacy means.
>> I'm not saying you have to be A (open and honest), or B (polite), and
>> that you can't be both, if that's what you arguing (which seems to be
>> the case), you are wrong, and to argue against that position would be
>> a straw man fallacy.
>>
>> Your mistaken fallacy seems to be that you think one can *always* be
>> both A (open and honest), and B (polite), I'm not sure if there's a
>> name for that fallacy, but you don't provide any evidence for that
>> claim.
>
> It's not actually one of the original logical fallacies, but it's called
> argument to moderation or false compromise: The fallacy is the
> assumption that the original statements represent extremal positions of
> a continuum so there must always be middle ground which represents the
> correct statement.  To those accepting the fallacy making the middle
> ground statement by that fact alone demonstrates the invalidity of the
> previous proposition.

And when so many of us had convinced ourselves that this thread could
not possibly descend any further into the off-topic weeds...  Good job.
That assumption has now been shattered by bringing in ancient Greece.

Given that, I'd like to propose a KS topic that covers Adam Smith, and
John Stuart Mill,  Leviathan by Hobbes, and The Politics by Aristotle.

C'mon folks.  This is beyond silly.  Let us look at the things that we
can really change, or at least influence change within.  Things that
really matter to linux today and tomorrow.

P.
---

>
> I think it's not in the original fallacies because they come from Greek
> rhetoric and the Greeks believed dialectic: the taking opposite
> positions and arguing them thoroughly.  It's only with the advent of
> Western European political systems that we're conditioned to seek
> compromise without rigorous examination.  This actually makes argument
> to moderation one of the most effective rhetorical tools in use today
> for discrediting an opponent's argument without actually addressing it.
>
> James
>
>

Re: RFC: device thermal limits represented in device tree nodes

2013-07-23 Thread Stephen Warren

On 07/22/2013 07:25 AM, Eduardo Valentin wrote:
> Hello Grant and Rob,
> 
> (Resending, as I got a message saying: 
> : Recipient address rejected:
> User has moved to devicetree at vger.kernel.org)
> 
> I am writing this email to you specifically to ask your technical 
> assessment with respect to representing device thermal limits as
> device tree nodes. I am proposing to introduce device tree nodes to
> describe these limits as thermal zones, their composition and their
> relations with cooling devices and other thermal zones (thermal
> data).

Given:
https://lkml.org/lkml/2013/7/20/69
[PATCH 3/3] MAINTAINERS: Refactor device tree maintainership

I'm explicitly CCing a few people besides Grant/Rob, and quoting the
whole email.

From my perspective, the concept of including thermal limits in DT
seems reasonable, although I haven't looked at the proposed binding
itself in detail yet.

> As you should know, device thermal limits are part of hardware 
> specification. Considering your board layout, mechanics, power 
> dissipation and composition of ICs, etc, that will impose thermal 
> requirements on your system, and infringing these limits can lead
> to device damage, device life time reduction or even end user harm.
> Thus, the thermal data help to describe the hardware limits and
> what needs to be done if those limits are crossed, as part of your
> board design and non-functional requirements. Obviously that is
> very dependent on your hardware, and not all of them will have
> these non-functional requirements. Besides, describing these limits
> has *nothing* to do with how you actually find these limits.
> 
> In any case, there is a need to properly represent these
> requirements and I am proposing to have this representation in
> device tree. There were already couple of counter-arguments
> claiming this is actually about configuration and performance
> profile description. But I still stand against these two readings
> of this proposal and again state that if one interprets it as
> configuration or performance profile, that is a mis-understanding
> [0]. Let me state it clear (again [1]), my proposal is to describe
> hardware thermal limits, because these limits are part of a 
> hardware specification; representing in device tree would not
> infringe the original purpose of this data structure  ("The Device
> Tree is a data structure for describing hardware."[2]).
> 
> Before I explain my proposal, I want to highlight also that these
> data is represented elsewhere already and it is reused across
> different OS's. Thermal data is described using ACPI [3] and
> operating systems ACPI-aware do support the interpretation of
> thermal data. Linux is one example of such systems (I believe I do
> not need to enlist here all systems supporting ACPI). On the other
> hand, not all systems have ACPI or are specified to use ACPI.
> Thus, here is another reason to represent properly thermal data, so
> that we can scale across systems.
> 
> In the specific case of Linux, the common thermal concepts between
> ACPI systems and non-ACPI systems have been represented in the
> thermal framework (CONFIG_THERMAL). Today, on ACPI systems, thermal
> data is fetched from bootloader with help from the common ACPI
> parser. For non-ACPI systems, the thermal data is actually coded as
> part of device drivers.
> 
> So, to the point, a brief explanation of my proposal goes as
> follows:
>
> i   - trip points: a node to describe a point in the temperature
>       domain in which the system has to take an action. This node
>       describes just the point, not the action. Properties here are
>       temperature, hysteresis, and type (critical, hot, passive,
>       active, etc).
> ii  - binding parameters: the bind_param node is a node to describe
>       how actions (cooling devices) get assigned to trip points.
>       Cooling devices are expected to be loaded in the target system.
>       Properties here are: cooling device name, weight, trip_mask and
>       limits.
> iii - thermal zones: the thermal_zone node is the node containing all
>       the required info for describing a thermal zone with hardware
>       thermal limitation, including its bindings with cooling devices.
>       Properties here are: type, passive_delay, polling_delay,
>       governor. The thermal_zone node must contain, apart from its own
>       properties, one node containing trip nodes and one node
>       containing all the zone bind parameters.
>
> Here is an example (on OMAP4430):
>
> thermal_zone {
>         type = "CPU";
>         mask = <0x03>; /* trips writability */
>         passive_delay = <250>; /* milliseconds */
>         polling_delay = <1000>; /* milliseconds */
>         governor = "step_wise";
>         trips {
>                 alert@10{
>                         temperature = <10>; /* milliCelsius */
>                         hysteresis = <2000>; /* milliCelsius */
>                         type = ;
>                 };
>                 crit@125000{
>                         temperature = <125000>; /* milliCelsius */
>                         hysteresis = <2000>; /* milliCelsius */
>                         type = ;
>                 };
>         };
>         bind_params {
>                 action@0{
>                         cooling_device = "thermal-cpufreq";
>                         weight = <100>; /* percentage */
>                         mask = <0x01>;
>                         /* no limits, using defaults */
>                 };

Re: [PATCH 01/21] acpi: Print Hot-Pluggable Field in SRAT.

2013-07-23 Thread Tang Chen


On 07/24/2013 02:48 AM, Tejun Heo wrote:

> On Fri, Jul 19, 2013 at 03:59:14PM +0800, Tang Chen wrote:
>> The Hot-Pluggable field in SRAT suggests if the memory could be
>> hotplugged while the system is running. Print it as well when
>> parsing SRAT will help users to know which memory is hotpluggable.
>>
>> Signed-off-by: Tang Chen
>> Reviewed-by: Wanpeng Li
>
> Acked-by: Tejun Heo
>
> But a nit below
>
>> +   pr_info("SRAT: Node %u PXM %u [mem %#010Lx-%#010Lx] %s\n",
>> +   node, pxm,
>> +   (unsigned long long) start, (unsigned long long) end - 1,
>> +   hotpluggable ? "Hot Pluggable" : "");
>
> The following would be more conventional.
>
>    "...10Lx]%s\n", ..., hotpluggable ? " Hot Pluggable" : ""
>
> Also, isn't "Hot Pluggable" a bit too verbose?  "hotplug" should be
> fine, I think.
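
The difference between the two format strings can be shown with a small userspace sketch. The functions fmt_original() and fmt_suggested() are hypothetical, and "[mem range]" is a shortened placeholder for the real SRAT message; only the placement of the separating space matters here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Form used in the patch: the space is part of the format string, so a
 * range that is not hotpluggable is printed with a trailing space.
 */
static int fmt_original(char *buf, size_t len, bool hotpluggable)
{
	return snprintf(buf, len, "[mem range] %s",
			hotpluggable ? "Hot Pluggable" : "");
}

/*
 * Form suggested in the review: the space travels with the optional
 * word, so nothing extra is printed when the flag is clear.
 */
static int fmt_suggested(char *buf, size_t len, bool hotpluggable)
{
	return snprintf(buf, len, "[mem range]%s",
			hotpluggable ? " Hot Pluggable" : "");
}
```

Both forms produce the same text when the flag is set; they differ only in the trailing space when it is clear.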



Hi tj, Joe,

OK, will change it as you guys said.
Thank you very much.

Thanks.


Re: [Ksummit-2013-discuss] [ATTEND] How to act on LKML

2013-07-23 Thread Steven Rostedt

On Tue, 2013-07-23 at 18:26 -0700, James Bottomley wrote:

> I think it's not in the original fallacies because they come from Greek
> rhetoric and the Greeks believed dialectic: the taking opposite
> positions and arguing them thoroughly.  It's only with the advent of
> Western European political systems that we're conditioned to seek
> compromise without rigorous examination.  This actually makes argument
> to moderation one of the most effective rhetorical tools in use today
> for discrediting an opponent's argument without actually addressing it.

What? Really? You mean the truth doesn't lie in the middle between
evolution and creationism?

-- Steve



Re: [PATCH 10/27] drivers/memory: don't check resource with devm_ioremap_resource

2013-07-23 Thread Joe Perches

On Tue, 2013-07-23 at 18:27 -0700, Stephen Warren wrote:
> On 07/23/2013 11:25 AM, Joe Perches wrote:
> > On Tue, 2013-07-23 at 20:01 +0200, Wolfram Sang wrote:
> >> devm_ioremap_resource does sanity checks on the given resource. No need to
> >> duplicate this in the driver.
[]
> > This is the first and only one of the patch series I looked at.
> > 
> >> diff --git a/drivers/memory/tegra20-mc.c b/drivers/memory/tegra20-mc.c
> > []
> >> @@ -218,8 +218,6 @@ static int tegra20_mc_probe(struct platform_device *pdev)
> >>    struct resource *res;
> >>  
> >>    res = platform_get_resource(pdev, IORESOURCE_MEM, i);
> >> -  if (!res)
> >> -          return -ENODEV;
> >>    mc->regs[i] = devm_ioremap_resource(&pdev->dev, res);
> > 
> > I'm not so sure this is appropriate.
> > 
> > devm_ioremap_resource returns ERR_PTR(-EINVAL) for
> > null resource so this changes the return.
> 
> I think the exact return value is probably pretty arbitrary here.

I think so as well, but it takes code inspection to
determine whether or not there's any code impact.

I want to make sure Wolfram has done that inspection.

> > devm_ioremap_resource also emits a noisy dev_err
> > message when resource is NULL.
> > 
> > It's a probe and before the message log would be silent
> > but now there's a new dmesg.
> 
> I think those changes are fine, at least for this driver. It's a bug if
> the required resources are missing, and having probe() actively point
> out why it's failing can only be a good thing in my book.

Again, I haven't looked at _all_ the paths for all
of these patches, I just picked one at random.

Extra dmesg output with some device probes that are
expected to fail is not good.
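
The change in return value under discussion can be sketched in userspace C. This is an illustration under stated assumptions: ERR_PTR(), PTR_ERR() and IS_ERR() are re-created here to mirror the kernel helpers, fake_ioremap_resource() is a hypothetical stand-in for devm_ioremap_resource() that only models the NULL check, and the probe functions assume the driver simply propagates the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct resource { unsigned long start, end; };

/* Hypothetical stand-in: rejects a NULL resource with -EINVAL. */
static void *fake_ioremap_resource(struct resource *res)
{
	if (!res)
		return ERR_PTR(-EINVAL);
	return res; /* pretend the mapping succeeded */
}

/* Before the patch: the driver checks for NULL itself. */
static int probe_before(struct resource *res)
{
	void *regs;

	if (!res)
		return -ENODEV;
	regs = fake_ioremap_resource(res);
	return IS_ERR(regs) ? (int)PTR_ERR(regs) : 0;
}

/* After the patch: the helper's error code is what probe returns. */
static int probe_after(struct resource *res)
{
	void *regs = fake_ioremap_resource(res);

	return IS_ERR(regs) ? (int)PTR_ERR(regs) : 0;
}
```

For a missing resource the first form returns -ENODEV and the second -EINVAL, which is the difference that needs code inspection of the callers.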



Re: [PATCH 10/27] drivers/memory: don't check resource with devm_ioremap_resource

2013-07-23 Thread Stephen Warren

On 07/23/2013 11:25 AM, Joe Perches wrote:
> On Tue, 2013-07-23 at 20:01 +0200, Wolfram Sang wrote:
>> devm_ioremap_resource does sanity checks on the given resource. No need to
>> duplicate this in the driver.
> 
> Hi Wolfram:
> 
> This is the first and only one of the patch series I looked at.
> 
>> diff --git a/drivers/memory/tegra20-mc.c b/drivers/memory/tegra20-mc.c
> []
>> @@ -218,8 +218,6 @@ static int tegra20_mc_probe(struct platform_device *pdev)
>>  struct resource *res;
>>  
>>  res = platform_get_resource(pdev, IORESOURCE_MEM, i);
>> -if (!res)
>> -return -ENODEV;
>>  mc->regs[i] = devm_ioremap_resource(&pdev->dev, res);
> 
> I'm not so sure this is appropriate.
> 
> devm_ioremap_resource returns ERR_PTR(-EINVAL) for
> null resource so this changes the return.

I think the exact return value is probably pretty arbitrary here.

> devm_ioremap_resource also emits a noisy dev_err
> message when resource is NULL.
> 
> It's a probe and before the message log would be silent
> but now there's a new dmesg.

I think those changes are fine, at least for this driver. It's a bug if
the required resources are missing, and having probe() actively point
out why it's failing can only be a good thing in my book.

Re: [PATCH 02/27] drivers/amba: don't check resource with devm_ioremap_resource

2013-07-23 Thread Stephen Warren

On 07/23/2013 11:01 AM, Wolfram Sang wrote:
> devm_ioremap_resource does sanity checks on the given resource. No need to
> duplicate this in the driver.
> 
> Signed-off-by: Wolfram Sang 
> ---
> Please apply via the subsystem-tree.

Russell, I assume you'll take this patch?

Re: [PATCH 1/2] tools, perf: Add a precise event qualifier v2

2013-07-23 Thread Andi Kleen

On Tue, Jul 23, 2013 at 08:39:09PM -0400, Sasha Levin wrote:
> On 07/23/2013 06:51 PM, Andi Kleen wrote:
> >On Tue, Jul 23, 2013 at 05:27:43PM -0400, Vince Weaver wrote:
> >>>
> >>>I hate having to justify why breaking the ABI is unacceptable.
> >Well it's a testing ABI, so we can do changes to it.
> 
> The testing ABI has a simple policy about changes:
> 
>   The interface can be changed to add new features, but the
>   current interface will not break by doing this, unless grave
>   errors or security problems are found in them.
> 
> It's probably fine to change a testing ABI once in a while, but when things
> like trinity start breaking that often due to ABI changes in the same exact
> place, that's too much IMO.

It sounds like trinity is breaking (well, printing a message, not really
breaking) on any addition. If we followed that, the perf sysfs interface
would be completely frozen and could never be extended beyond what it is today.

I don't think it's a big problem that a test tool needs to be extended
when the software it's testing changes.

If there are enough other widely used programs that actually break from
additions, we would probably need a v2 of the sysfs interface for extensions
(with new file or directory names), and keep v1 frozen for
compatibility.

But I don't think that's the case today?

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only.

Re: [PATCH 10/27] drivers/memory: don't check resource with devm_ioremap_resource

2013-07-23 Thread Stephen Warren

On 07/23/2013 11:01 AM, Wolfram Sang wrote:
> devm_ioremap_resource does sanity checks on the given resource. No need to
> duplicate this in the driver.
> 
> Signed-off-by: Wolfram Sang 
> ---
> Please apply via the subsystem-tree.

Greg KH usually commits patches to this tree. I Cc'd him here.

Re: [PATCH 08/27] drivers/iommu: don't check resource with devm_ioremap_resource

2013-07-23 Thread Stephen Warren

On 07/23/2013 11:01 AM, Wolfram Sang wrote:
> devm_ioremap_resource does sanity checks on the given resource. No need to
> duplicate this in the driver.
> 
> Signed-off-by: Wolfram Sang 
> ---
> Please apply via the subsystem-tree.

You probably want to Cc the usual committer (Joerg Roedel). I've done
so here.

Re: [Ksummit-2013-discuss] [ATTEND] How to act on LKML

2013-07-23 Thread James Bottomley

On Tue, 2013-07-23 at 19:51 -0500, Felipe Contreras wrote:
> On Sat, Jul 20, 2013 at 8:02 PM, Daniel Phillips
>  wrote:
> > On 07/20/2013 12:36 PM, Felipe Contreras wrote:
> >> I think you need more than "hope" to change one of the fundamental
> >> rules of LKML; be open and honest, even if that means expressing your
> >> opinion in a way that others might consider offensive and colorful.
> >
> > Logical fallacy type: bifurcation. You can be open and honest without
> > being offensive or abusive.
> 
> You are mistaken, that is not what the false dichotomy fallacy means.
> I'm not saying you have to be A (open and honest), or B (polite), and
> that you can't be both, if that's what you arguing (which seems to be
> the case), you are wrong, and to argue against that position would be
> a straw man fallacy.
> 
> Your mistaken fallacy seems to be that you think one can *always* be
> both A (open and honest), and B (polite), I'm not sure if there's a
> name for that fallacy, but you don't provide any evidence for that
> claim.

It's not actually one of the original logical fallacies, but it's called
argument to moderation or false compromise: The fallacy is the
assumption that the original statements represent extremal positions of
a continuum so there must always be middle ground which represents the
correct statement.  To those accepting the fallacy making the middle
ground statement by that fact alone demonstrates the invalidity of the
previous proposition.

I think it's not in the original fallacies because they come from Greek
rhetoric and the Greeks believed dialectic: the taking opposite
positions and arguing them thoroughly.  It's only with the advent of
Western European political systems that we're conditioned to seek
compromise without rigorous examination.  This actually makes argument
to moderation one of the most effective rhetorical tools in use today
for discrediting an opponent's argument without actually addressing it.

James


Re: [PATCH 1/3 v6] cpufreq: Add debugfs directory for cpufreq

2013-07-23 Thread Chanwoo Choi

Hi Viresh,

On 07/22/2013 07:11 PM, Viresh Kumar wrote:
> On 18 July 2013 16:47, Chanwoo Choi  wrote:
>> +#ifdef CONFIG_CPU_FREQ_STAT
>> +/* The cpufreq_debugfs is used to create debugfs root directory for CPUFreq. */
>> +static struct dentry *cpufreq_debugfs;
>> +
>> +static int cpufreq_create_debugfs_dir(struct cpufreq_policy *policy,
>> + struct device *dev)
>> +{
>> +   char name[CPUFREQ_NAME_LEN];
>> +   unsigned int cpus, size, idx;
>> +
>> +   if (!cpufreq_debugfs)
>> +   return -EINVAL;
>> +
>> +   cpus = cpumask_weight(policy->cpus);
> 
> I remember I told you not to use policy->cpus for this purpose?? But
> related_cpus.

You're right. I'll use policy->related_cpus instead of policy->cpus.

> 
>> +   idx = cpus > 1 ? policy->cpu : 0;
> 
>> +   policy->cpu_debugfs[idx] = debugfs_create_dir(name, cpufreq_debugfs);
> 
> This is broken. A policy may contain cpus 9,10 only.. You will allocate array
> for 2 cpus and try to access cpu_debugfs[9] :)

Right, I'll consider another method to resolve the array index issue.
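
One possible way to avoid the out-of-range access described above is to index the array by a CPU's position within the policy's mask rather than by its CPU number. The sketch below is only an illustration: cpu_slot() is a hypothetical helper, and a plain 64-bit mask stands in for the kernel's cpumask type.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the position of `cpu` among the set bits
 * of `mask` (0 for the lowest-numbered CPU in the policy), or -1 if the
 * CPU is not part of the mask.
 */
static int cpu_slot(uint64_t mask, unsigned int cpu)
{
	unsigned int i;
	int slot = 0;

	assert(cpu < 64);
	if (!(mask & (UINT64_C(1) << cpu)))
		return -1;

	/* Count the CPUs in the policy that are numbered below `cpu`. */
	for (i = 0; i < cpu; i++)
		if (mask & (UINT64_C(1) << i))
			slot++;

	return slot;
}
```

For a policy containing only CPUs 9 and 10, this maps them to slots 0 and 1, so an array sized for two entries is never indexed with 9.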

> 
>> +   if (!policy->cpu_debugfs[idx]) {
>> +   pr_err("creating debugfs directory failed\n");
>> +   return -ENODEV;
>> +   }
>> +
>> +   return 0;
>> +}
>> +
>> +static int cpufreq_create_debugfs_symlink(struct cpufreq_policy *policy,
>> +  unsigned int src_cpu,
>> +  unsigned int dest_cpu)
> 
> Only use policy and cpu for which symlink has to be created as param
> to this routine. And create link to policy->cpu.
> 

OK, I'll simplify the prototype of cpufreq_create_debugfs_symlink() by
removing the unnecessary parameter.

>> +{
>> +   char symlink_name[CPUFREQ_NAME_LEN];
>> +   char target_name[CPUFREQ_NAME_LEN];
>> +
>> +   if (!cpufreq_debugfs)
>> +   return -EINVAL;
>> +
>> +   if (!policy->cpu_debugfs[src_cpu])
>> +   return -EINVAL;
>> +
>> +   sprintf(symlink_name, "cpu%d", dest_cpu);
>> +   sprintf(target_name, "./cpu%d", src_cpu);
>> +   policy->cpu_debugfs[dest_cpu] = debugfs_create_symlink(
>> +   symlink_name,
>> +   cpufreq_debugfs,
>> +   target_name);
>> +   if (!policy->cpu_debugfs[dest_cpu]) {
>> +   pr_err("creating debugfs symlink failed\n");
>> +   return -ENODEV;
>> +   }
>> +
>> +   return 0;
>> +}
>> +
>> +static void cpufreq_remove_debugfs_dir(struct cpufreq_policy *policy,
>> +  unsigned int cpu)
>> +{
>> +   unsigned int idx = cpumask_weight(policy->cpus) > 1 ? cpu : 0;
>> +
>> +   if (!policy->cpu_debugfs[idx])
>> +   return;
>> +
>> +   debugfs_remove_recursive(policy->cpu_debugfs[idx]);
> 
> Why do we need recursive here? And what exactly will recursive
> do?
> 

If the cpu is the last user of the policy, __cpufreq_remove_dev() has to
remove the debugfs directory and the child files/directories of the root
debugfs directory. So, I used the debugfs_remove_recursive() function.

>> +}
>> +
> 
> same problem here too.
>> +static void cpufreq_move_debugfs_dir(struct cpufreq_policy *policy,
>> +unsigned int new_cpu)
>> +{
>> +   struct dentry *old_entry, *new_entry;
>> +   char new_dir_name[CPUFREQ_NAME_LEN];
>> +   unsigned int j, old_cpu = policy->cpu;
>> +
>> +   if (!policy->cpu_debugfs[new_cpu])
>> +   return;
>> +
>> +   /*
>> +* Remove symbolic link of debugfs directory except for debugfs
>> +* directory of old_cpu.
>> +*/
>> +   for_each_present_cpu(j) {
>> +   if (old_cpu == j)
>> +   continue;
>> +
>> +   debugfs_remove(policy->cpu_debugfs[j]);
> 
> Why you need this? We aren't removing the earlier dentry at all here.
> 
>> +   }
>> +
>> +   /*
>> +* Change debugfs directory name from as following:
>> +* - old debugfs dir name : /sys/kernel/debugfs/cpufreq/cpu${old_cpu}
>> +* - new debugfs dir name : /sys/kernel/debugfs/cpufreq/cpu${new_cpu}
>> +*/
>> +   sprintf(new_dir_name, "cpu%d", new_cpu);
>> +   old_entry = policy->cpu_debugfs[old_cpu];
>> +   new_entry = debugfs_rename(cpufreq_debugfs, old_entry,
>> +  cpufreq_debugfs, new_dir_name);
> 
> This routine returns old_entry only, so you can simply use a
> single variable named dentry.

I used the 'new_entry' variable to improve readability by distinguishing between
old_entry and new_entry.
But, as you suggest, I'll simplify this statement to remove the unnecessary code.

> 
>> +   if (!new_entry) {
>> +   pr_err("changing debugfs directory name failed\n");
>> +   goto err_rename;
>> +   }
>> +
>> +

Re: [REVIEW][PATCH] vfs: Lock in place mounts from more privileged users

2013-07-23 Thread Andy Lutomirski

On Tue, Jul 23, 2013 at 11:30 AM, Eric W. Biederman
 wrote:
>
> When creating a less privileged mount namespace or propagating mounts
> from a more privileged to a less privileged mount namespace lock the
> submounts so they may not be unmounted individually in the child mount
> namespace revealing what is under them.

I would propose a different rule: if vfsmount b is mounted on vfsmount
a, then to unmount b, you must be ns_capable(CAP_SYS_MOUNT) on either
a's namespace or b's namespace.  The idea is that you should be able
to see under a mount if you own the parent (because it's yours) or if
you own the child (because you, or someone no more privileged than
you, put it there).  This may result in a simpler patch and should do
much the same thing.
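
As a rough sketch of that rule (a userspace model only, not a proposed kernel patch): the structures and the capable_in() helper below are hypothetical stand-ins, and "privileged over a namespace" is reduced to a simple owner-uid comparison rather than a real ns_capable() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mount namespace; not the kernel's layout. */
struct mnt_ns_model {
	int owner_uid;	/* uid treated as privileged over this namespace */
};

/* Hypothetical stand-in for a vfsmount. */
struct mount_model {
	const struct mnt_ns_model *ns;		/* namespace the mount belongs to */
	const struct mount_model *parent;	/* mount it sits on, or NULL */
};

/* Simplified privilege check (assumption): uid must own the namespace. */
static bool capable_in(int uid, const struct mnt_ns_model *ns)
{
	return ns != NULL && ns->owner_uid == uid;
}

/*
 * The proposed rule: b may be unmounted by someone who is privileged
 * over the parent mount's namespace or over b's own namespace.
 */
static bool may_unmount(int uid, const struct mount_model *b)
{
	assert(b != NULL);
	if (b->parent != NULL && capable_in(uid, b->parent->ns))
		return true;
	return capable_in(uid, b->ns);
}
```

Under this model, a user who owns only the child namespace can unmount what was mounted from that namespace, but not a mount that both sits on and belongs to the more privileged namespace.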

>
> This enforces the reasonable expectation that it is not possible to
> see under a mount point.  Most of the time mounts are on empty
> directories and revealing that does not matter, however I have seen an
> occasionally sloppy configuration where there were interesting things
> concealed under a mount point that probably should not be revealed.
>
> Expirable submounts are not locked because they will eventually
> unmount automatically so whatever is under them already needs
> to be safe for unprivileged users to access.
>
> From a practical standpoint these restrictions do not appear to be
> significant for unprivileged users of the mount namespace.  Recursive
> bind mounts and pivot_root continue to work, and mounts that are
> created in a mount namespace may be unmounted there.  All of which
> means that the common idiom of keeping a directory of interesting
> files and using pivot_root to throw everything else away continues to
> work just fine.

Is there some kind of recursive unmount that will get rid of the
pivot_root result and everything under it?

In any case, I think that something like this patch is probably
-stable material: I suspect that things like seunshare and systemd's
instance directories are currently insecure.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Panic - Bisected] f1a18a10566081abfce1649c2f3884b28fff7372 cases panic on boot

2013-07-23 Thread Zhang Rui

On Tue, 2013-07-23 at 21:56 -0300, Kevin Winchester wrote:
> On 22 July 2013 23:11, Linus Torvalds  wrote:
> > On 22 July 2013 21:45, Kevin Winchester  wrote:
> >> I have found that the new CPU Package temperature thermal driver introduced
> >> in this merge window causes my HP laptop to panic on boot.
> >
> > I just merged Zhang's pull request that should contain a fix for this,
> > but was planning on the allmodconfig build finishing before pushing it
> > out. Give me a few minutes..
> >
> 
> Yes, I have tested with your latest tree, and it works even with the
> driver enabled.
> 
Good to know. Thanks for testing. :)

-rui



Re: [PATCH] fs: bio-integrity: fix possible segmentation fault

2013-07-23 Thread Gu Zheng

On 07/24/2013 07:12 AM, Andi Shyti wrote:

> free bvec_integrity_pool if it's allocated, not bio_integrity_pool
> 
> Signed-off-by: Andi Shyti 


Reviewed-by: Gu Zheng 

> ---
>  fs/bio-integrity.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
> index 8fb4291..45e944f 100644
> --- a/fs/bio-integrity.c
> +++ b/fs/bio-integrity.c
> @@ -734,7 +734,7 @@ void bioset_integrity_free(struct bio_set *bs)
>   mempool_destroy(bs->bio_integrity_pool);
>  
>   if (bs->bvec_integrity_pool)
> - mempool_destroy(bs->bio_integrity_pool);
> + mempool_destroy(bs->bvec_integrity_pool);
>  }
>  EXPORT_SYMBOL(bioset_integrity_free);
>  



Re: [Panic - Bisected] f1a18a10566081abfce1649c2f3884b28fff7372 cases panic on boot

2013-07-23 Thread Kevin Winchester

On 22 July 2013 23:11, Linus Torvalds  wrote:
> On 22 July 2013 21:45, Kevin Winchester  wrote:
>> I have found that the new CPU Package temperature thermal driver introduced
>> in this merge window causes my HP laptop to panic on boot.
>
> I just merged Zhang's pull request that should contain a fix for this,
> but was planning on the allmodconfig build finishing before pushing it
> out. Give me a few minutes..
>

Yes, I have tested with your latest tree, and it works even with the
driver enabled.

Thanks!

Kevin

Re: [Ksummit-2013-discuss] [ATTEND] How to act on LKML

2013-07-23 Thread Felipe Contreras

On Sat, Jul 20, 2013 at 8:02 PM, Daniel Phillips
 wrote:
> On 07/20/2013 12:36 PM, Felipe Contreras wrote:
>> I think you need more than "hope" to change one of the fundamental
>> rules of LKML; be open and honest, even if that means expressing your
>> opinion in a way that others might consider offensive and colorful.
>
> Logical fallacy type: bifurcation. You can be open and honest without
> being offensive or abusive.

You are mistaken; that is not what the false dichotomy fallacy means.
I'm not saying you have to be either A (open and honest) or B (polite),
and that you can't be both. If that's what you are arguing (which seems
to be the case), you are wrong, and arguing against that position would
be a straw man fallacy.

Your mistake seems to be that you think one can *always* be
both A (open and honest) and B (polite). I'm not sure if there's a
name for that fallacy, but you don't provide any evidence for that
claim.

And even supposing that such an obvious fallacy (that one can *always*
be both open and honest, and polite) was true, the fact that something
*can* be done, doesn't mean it *should* be done.

-- 
Felipe Contreras
