date:20130825

RE: [PATCH 4/4] Documentation: Add device tree bindings for Freescale FTM PWM

2013-08-25 Thread Xiubo Li-B47053

Hi Thierry,


> Subject: Re: [PATCH 4/4] Documentation: Add device tree bindings for
> Freescale FTM PWM
> 
> On Thu, Aug 22, 2013 at 08:26:10AM +0200, Sascha Hauer wrote:
> > On Thu, Aug 22, 2013 at 02:55:42AM +, Xiubo Li-B47053 wrote:
> > > Hi Tomasz,
> > >
> > > Thanks for your comments.
> > >
> > >
> > > > Could you explain meaning of this property more precisely? I'm
> > > > interested especially how is this related to the PWM IP block and
> > > > boards.
> > > >
> > >
> > > Yes.
> > > There are 8 channels most. While the pinctrls of 4th and 5th
> > > channels could be used by uart's Rx and Tx, then these 2 channels
> > > won't be used for pwm output, so there will be 6 channels available
> > > by the pwm.
> > > Thus, the pwm chip will register only 6 pwms(6 channels)
> > > most("fsl,pwm-channel-orders = {0 1 2 3 6 7}"). And also the
> > > "fsl,pwm-channel-number" will be 6.
> >
> > If the chip has eight PWMs I would register all of them. If some of
> > them are not routed out by the pinmux then just nothing happens if you
> > use them. In a sane devicetree they won't be referenced anyway when
> > they are not routed out of the SoC.
> 
> In that case, shouldn't this be hooked up to the pinctrl subsystem as
> well? As I understand the above, the logical thing would be for each PWM
> channel's .request() operation to configure the pinmuxing appropriately.
> And if it can't be configured as necessary then .request() should return
> an error (or propagate the error from the pinctrl subsystem).
> 

That may be better. If so, the pinctrl configuration must be split into two
steps:
1. Get the channel's pinctrl "active" and "idle" states by calling
   pinctrl_lookup_state() in .request().
2. Select the proper state in .enable()/.disable().
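
To make the two steps concrete, here is a rough userspace model of what I
have in mind. It is only a sketch: struct pwm_channel, enum pin_state and the
channel_*() helpers are hypothetical stand-ins, and a real driver would call
pinctrl_lookup_state() and pinctrl_select_state() where the comments say so.

```c
/* Hypothetical model of one PWM channel and its pinmux state. */
enum pin_state { PIN_STATE_NONE, PIN_STATE_IDLE, PIN_STATE_ACTIVE };

struct pwm_channel {
    int has_active;          /* board describes an "active" pinctrl state */
    int has_idle;            /* board describes an "idle" pinctrl state */
    int requested;           /* set once channel_request() has succeeded */
    enum pin_state current;  /* state currently applied to the pin */
};

/*
 * Step 1 (.request()): look up both states. A real driver would call
 * pinctrl_lookup_state() here and propagate its error.
 */
static int channel_request(struct pwm_channel *ch)
{
    if (!ch->has_active || !ch->has_idle)
        return -1;           /* pin cannot be muxed for PWM on this board */
    ch->requested = 1;
    ch->current = PIN_STATE_IDLE;
    return 0;
}

/*
 * Step 2 (.enable()/.disable()): select the matching state. A real driver
 * would call pinctrl_select_state() here.
 */
static int channel_enable(struct pwm_channel *ch)
{
    if (!ch->requested)
        return -1;
    ch->current = PIN_STATE_ACTIVE;
    return 0;
}

static int channel_disable(struct pwm_channel *ch)
{
    if (!ch->requested)
        return -1;
    ch->current = PIN_STATE_IDLE;
    return 0;
}
```

With this split, a channel whose pins are taken by the UART fails at
request time, and each LED pin only leaves its idle state while its own
channel is enabled.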




Thanks very much.

--
Best Regards.
Xiubo



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH tip/core/rcu 8/9] nohz_full: Add full-system-idle state machine

2013-08-25 Thread Lai Jiangshan

On 08/20/2013 10:47 AM, Paul E. McKenney wrote:
> From: "Paul E. McKenney" 
> 
> This commit adds the state machine that takes the per-CPU idle data
> as input and produces a full-system-idle indication as output.  This
> state machine is driven out of RCU's quiescent-state-forcing
> mechanism, which invokes rcu_sysidle_check_cpu() to collect per-CPU
> idle state and then rcu_sysidle_report() to drive the state machine.
> 
> The full-system-idle state is sampled using rcu_sys_is_idle(), which
> also drives the state machine if RCU is idle (and does so by forcing
> RCU to become non-idle).  This function returns true if all but the
> timekeeping CPU (tick_do_timer_cpu) are idle and have been idle long
> enough to avoid memory contention on the full_sysidle_state state
> variable.  The rcu_sysidle_force_exit() may be called externally
> to reset the state machine back into non-idle state.
> 
> For large systems the state machine is driven out of RCU's
> force-quiescent-state logic, which provides good scalability at the price
> of millisecond-scale latencies on the transition to full-system-idle
> state.  This is not so good for battery-powered systems, which are usually
> small enough that they don't need to care about scalability, but which
> do care deeply about energy efficiency.  Small systems therefore drive
> the state machine directly out of the idle-entry code.  The number of
> CPUs in a "small" system is defined by a new NO_HZ_FULL_SYSIDLE_SMALL
> Kconfig parameter, which defaults to 8.  Note that this is a build-time
> definition.
> 
> Signed-off-by: Paul E. McKenney 
> Cc: Frederic Weisbecker 
> Cc: Steven Rostedt 
> Cc: Lai Jiangshan 
> [ paulmck: Use true and false for boolean constants per Lai Jiangshan. ]
> Reviewed-by: Josh Triplett 
> ---
>  include/linux/rcupdate.h |  18 +++
>  kernel/rcutree.c |  16 ++-
>  kernel/rcutree.h |   5 +
>  kernel/rcutree_plugin.h  | 284 ++-
>  kernel/time/Kconfig  |  27 +
>  5 files changed, 343 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 30bea9c..f1f1bc3 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -1011,4 +1011,22 @@ static inline bool rcu_is_nocb_cpu(int cpu) { return false; }
>  #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
>  
>  
> +/* Only for use by adaptive-ticks code. */
> +#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
> +extern bool rcu_sys_is_idle(void);
> +extern void rcu_sysidle_force_exit(void);
> +#else /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
> +
> +static inline bool rcu_sys_is_idle(void)
> +{
> + return false;
> +}
> +
> +static inline void rcu_sysidle_force_exit(void)
> +{
> +}
> +
> +#endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
> +
> +
>  #endif /* __LINUX_RCUPDATE_H */
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 7b5be56..eca70f44 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -734,6 +734,7 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp,
>bool *isidle, unsigned long *maxj)
>  {
>   rdp->dynticks_snap = atomic_add_return(0, &rdp->dynticks->dynticks);
> + rcu_sysidle_check_cpu(rdp, isidle, maxj);
>   return (rdp->dynticks_snap & 0x1) == 0;
>  }
>  
> @@ -1373,11 +1374,17 @@ int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
>   rsp->n_force_qs++;
>   if (fqs_state == RCU_SAVE_DYNTICK) {
>   /* Collect dyntick-idle snapshots. */
> + if (is_sysidle_rcu_state(rsp)) {
> + isidle = 1;
> + maxj = jiffies - ULONG_MAX / 4;
> + }
>   force_qs_rnp(rsp, dyntick_save_progress_counter,
>                &isidle, &maxj);
> + rcu_sysidle_report_gp(rsp, isidle, maxj);
>   fqs_state = RCU_FORCE_QS;
>   } else {
>   /* Handle dyntick-idle and offline CPUs. */
> + isidle = 0;
>   force_qs_rnp(rsp, rcu_implicit_dynticks_qs, &isidle, &maxj);
>   }
>   /* Clear flag to prevent immediate re-entry. */
> @@ -2103,9 +2110,12 @@ static void force_qs_rnp(struct rcu_state *rsp,
>   cpu = rnp->grplo;
>   bit = 1;
>   for (; cpu <= rnp->grphi; cpu++, bit <<= 1) {
> - if ((rnp->qsmask & bit) != 0 &&
> - f(per_cpu_ptr(rsp->rda, cpu), isidle, maxj))
> - mask |= bit;
> + if ((rnp->qsmask & bit) != 0) {
> + if ((rnp->qsmaskinit & bit) != 0)
> + *isidle = 0;
> + if (f(per_cpu_ptr(rsp->rda, cpu), isidle, maxj))
> + mask |= bit;
> + }
>   }
>   if (mask != 0) {
>  
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index 9dd8b17..6fd3659 100644
> ---

Re: [PATCH v6 01/10] tracing: Add support for SOFT_DISABLE to syscall events

2013-08-25 Thread Masami Hiramatsu

(2013/08/23 8:27), Tom Zanussi wrote:
> The original SOFT_DISABLE patches didn't add support for soft disable
> of syscall events; this adds it and paves the way for future patches
> allowing triggers to be added to syscall events, since triggers are
> built on top of SOFT_DISABLE.
> 
> Add an array of ftrace_event_file pointers indexed by syscall number
> to the trace array and remove the existing enabled bitmaps, which as a
> result are now redundant.  The ftrace_event_file structs in turn
> contain the soft disable flags we need for per-syscall soft disable
> accounting; later patches add additional 'trigger' flags and
> per-syscall triggers and filters.
> 

This looks good to me.

Reviewed-by: Masami Hiramatsu 
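
For reference, the check on the enter path can be modelled in plain C as
below. This is a simplified userspace sketch, not the kernel code: the
demo types, NR_SYSCALLS_DEMO and EVENT_FL_SOFT_DISABLED are hypothetical
names, and the RCU dereference is left out.

```c
#include <stddef.h>

#define NR_SYSCALLS_DEMO 8           /* hypothetical table size */
#define EVENT_FL_SOFT_DISABLED 0x1u  /* hypothetical flag bit */

struct event_file {
    unsigned int flags;
};

struct trace_array_demo {
    /* One pointer per syscall; NULL means the event is not enabled. */
    struct event_file *enter_files[NR_SYSCALLS_DEMO];
};

/* Returns 1 if the enter event for syscall nr should be recorded. */
static int should_trace_enter(const struct trace_array_demo *tr, int nr)
{
    const struct event_file *file;

    if (nr < 0 || nr >= NR_SYSCALLS_DEMO)
        return 0;
    file = tr->enter_files[nr];
    if (!file)                                 /* replaces the bitmap test */
        return 0;
    if (file->flags & EVENT_FL_SOFT_DISABLED)  /* enabled but muted */
        return 0;
    return 1;
}
```

The pointer doubles as the "enabled" bit, which is why the bitmaps become
redundant, and the per-file flags carry the soft-disable state.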

> Signed-off-by: Tom Zanussi 
> ---
>  kernel/trace/trace.h  |  4 ++--
>  kernel/trace/trace_syscalls.c | 36 ++--
>  2 files changed, 32 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index fe39acd..b1227b9 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -192,8 +192,8 @@ struct trace_array {
>  #ifdef CONFIG_FTRACE_SYSCALLS
>   int sys_refcount_enter;
>   int sys_refcount_exit;
> - DECLARE_BITMAP(enabled_enter_syscalls, NR_syscalls);
> - DECLARE_BITMAP(enabled_exit_syscalls, NR_syscalls);
> + struct ftrace_event_file *enter_syscall_files[NR_syscalls];
> + struct ftrace_event_file *exit_syscall_files[NR_syscalls];
>  #endif
>   int stop_count;
>   int clock_id;
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 559329d..230cdb6 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -302,6 +302,7 @@ static int __init syscall_exit_define_fields(struct ftrace_event_call *call)
>  static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
>  {
>   struct trace_array *tr = data;
> + struct ftrace_event_file *ftrace_file;
>   struct syscall_trace_enter *entry;
>   struct syscall_metadata *sys_data;
>   struct ring_buffer_event *event;
> @@ -314,7 +315,13 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
>   syscall_nr = trace_get_syscall_nr(current, regs);
>   if (syscall_nr < 0)
>   return;
> - if (!test_bit(syscall_nr, tr->enabled_enter_syscalls))
> +
> + /* Here we're inside the tp handler's rcu_read_lock (__DO_TRACE()) */
> + ftrace_file = rcu_dereference_raw(tr->enter_syscall_files[syscall_nr]);
> + if (!ftrace_file)
> + return;
> +
> + if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &ftrace_file->flags))
>   return;
>  
>   sys_data = syscall_nr_to_meta(syscall_nr);
> @@ -345,6 +352,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
>  static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
>  {
>   struct trace_array *tr = data;
> + struct ftrace_event_file *ftrace_file;
>   struct syscall_trace_exit *entry;
>   struct syscall_metadata *sys_data;
>   struct ring_buffer_event *event;
> @@ -356,7 +364,13 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
>   syscall_nr = trace_get_syscall_nr(current, regs);
>   if (syscall_nr < 0)
>   return;
> - if (!test_bit(syscall_nr, tr->enabled_exit_syscalls))
> +
> + /* Here we're inside the tp handler's rcu_read_lock (__DO_TRACE()) */
> + ftrace_file = rcu_dereference_raw(tr->exit_syscall_files[syscall_nr]);
> + if (!ftrace_file)
> + return;
> +
> + if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &ftrace_file->flags))
>   return;
>  
>   sys_data = syscall_nr_to_meta(syscall_nr);
> @@ -397,7 +411,7 @@ static int reg_event_syscall_enter(struct ftrace_event_file *file,
>   if (!tr->sys_refcount_enter)
>   ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
>   if (!ret) {
> - set_bit(num, tr->enabled_enter_syscalls);
> + rcu_assign_pointer(tr->enter_syscall_files[num], file);
>   tr->sys_refcount_enter++;
>   }
>   mutex_unlock(&syscall_trace_lock);
> @@ -415,9 +429,14 @@ static void unreg_event_syscall_enter(struct ftrace_event_file *file,
>   return;
>   mutex_lock(&syscall_trace_lock);
>   tr->sys_refcount_enter--;
> - clear_bit(num, tr->enabled_enter_syscalls);
> + rcu_assign_pointer(tr->enter_syscall_files[num], NULL);
>   if (!tr->sys_refcount_enter)
>   unregister_trace_sys_enter(ftrace_syscall_enter, tr);
> + /*
> +  * Callers expect the event to be completely disabled on
> +  * return, so wait for current handlers to finish.
> +  */
> + synchronize_sched();
>   mutex_unlock(&syscall_trace_lock);
>  }
>  
> @@ -435,7 +454,7 @@ static int

[PATCH] ARM: DTS: DRA7: Add TPS659038 PMIC nodes

2013-08-25 Thread Keerthy

This patch adds nodes for the TPS659038 PMIC on DRA7 boards.

It is based on top of:
http://comments.gmane.org/gmane.linux.ports.arm.omap/102459.

Documentation:  Documentation/devicetree/bindings/mfd/palmas.txt
                Documentation/devicetree/bindings/regulator/palmas-pmic.txt


Boot Tested on DRA7 d1 Board.

Signed-off-by: Keerthy 
---
 arch/arm/boot/dts/dra7-evm.dts |  118 
 1 file changed, 118 insertions(+)

Index: linux/arch/arm/boot/dts/dra7-evm.dts
===
--- linux.orig/arch/arm/boot/dts/dra7-evm.dts   2013-08-26 09:57:52.496173554 +0530
+++ linux/arch/arm/boot/dts/dra7-evm.dts        2013-08-26 10:38:44.995414695 +0530
@@ -93,6 +93,119 @@
pinctrl-names = "default";
pinctrl-0 = <_pins>;
clock-frequency = <40>;
+
+   tps659038: tps659038@58 {
+   compatible = "ti,tps659038";
+   reg = <0x58>;
+
+   tps659038_pmic {
+   compatible = "ti,tps659038-pmic";
+
+   regulators {
+   smps123_reg: smps123 {
+   /* VDD_MPU */
+   regulator-name = "smps123";
+   regulator-min-microvolt = < 85>;
+   regulator-max-microvolt = <125>;
+   regulator-always-on;
+   regulator-boot-on;
+   };
+
+   smps45_reg: smps45 {
+   /* VDD_DSPEVE */
+   regulator-name = "smps45";
+   regulator-min-microvolt = < 85>;
+   regulator-max-microvolt = <115>;
+   regulator-boot-on;
+   };
+
+   smps6_reg: smps6 {
+   /* VDD_GPU - over VDD_SMPS6 */
+   regulator-name = "smps6";
+   regulator-min-microvolt = <85>;
+   regulator-max-microvolt = <1250>;
+   regulator-boot-on;
+   };
+
+   smps7_reg: smps7 {
+   /* CORE_VDD */
+   regulator-name = "smps7";
+   regulator-min-microvolt = <85>;
+   regulator-max-microvolt = <103>;
+   regulator-always-on;
+   regulator-boot-on;
+   };
+
+   smps8_reg: smps8 {
+   /* VDD_IVAHD */
+   regulator-name = "smps8";
+   regulator-min-microvolt = < 85>;
+   regulator-max-microvolt = <125>;
+   regulator-boot-on;
+   };
+
+   smps9_reg: smps9 {
+   /* VDDS1V8 */
+   regulator-name = "smps9";
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <180>;
+   regulator-always-on;
+   regulator-boot-on;
+   };
+
+   ldo1_reg: ldo1 {
+   /* LDO1_OUT --> SDIO  */
+   regulator-name = "ldo1";
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <330>;
+   regulator-boot-on;
+   };
+
+   ldo2_reg: ldo2 {
+   /* VDD_RTCIO */
+   /* LDO2 -> VDDSHV5, LDO2 also goes to CAN_PHY_3V3 */
+   regulator-name = "ldo2";
+

Re: [PATCH v7 7/7] cpufreq:exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ

2013-08-25 Thread Viresh Kumar

On 13 August 2013 15:38, Lukasz Majewski  wrote:
> Special driver data flag (CPUFREQ_BOOST_FREQ) has been added to indicate
> frequency, which can be only enabled for BOOST mode.
> This frequency shall be used only for limited time, since it might cause
> target device to overheat.
>
> Signed-off-by: Lukasz Majewski 
> Signed-off-by: Myungjoo Ham 
>
> ---
> Changes for v7:
> - None

Acked-by: Viresh Kumar 

Re: [PATCH v7 6/7] Documentation:cpufreq:boost: Update BOOST documentation

2013-08-25 Thread Viresh Kumar

On 13 August 2013 15:38, Lukasz Majewski  wrote:
> Since the support for software and hardware controlled boosting has been
> added, the corresponding Documentation entry had been updated.
>
> Signed-off-by: Lukasz Majewski 
> Signed-off-by: Myungjoo Ham 
>
> ---
> Changes for v7:
> - None

Acked-by: Viresh Kumar 

Re: [PATCH v7 5/7] cpufreq:exynos:Extend Exynos cpufreq driver to support boost framework

2013-08-25 Thread Viresh Kumar

On 13 August 2013 15:38, Lukasz Majewski  wrote:
> The cpufreq_driver's boost_supported flag is true only when boost
> support is explicitly enabled. Boost related attributes are exported only
> under the same condition.
>
> Signed-off-by: Lukasz Majewski 
> Signed-off-by: Myungjoo Ham 
>
> ---
> Changes for v7:
> - Replace CONFIG_CPU_FREQ_BOOST_SW with CONFIG_ARM_EXYNOS_CPU_FREQ_BOOST_SW
> - Move boost_supported initialization to struct cpufreq_driver exynos_driver

Acked-by: Viresh Kumar 

RE: [PATCH 4/4] Documentation: Add device tree bindings for Freescale FTM PWM

2013-08-25 Thread Xiubo Li-B47053

Hi Stephen,


> Subject: Re: [PATCH 4/4] Documentation: Add device tree bindings for
> Freescale FTM PWM
> 
> On 08/23/2013 01:36 AM, Thierry Reding wrote:
> > On Thu, Aug 22, 2013 at 08:26:10AM +0200, Sascha Hauer wrote:
> >> On Thu, Aug 22, 2013 at 02:55:42AM +, Xiubo Li-B47053 wrote:
> >>> Hi Tomasz,
> >>>
> >>> Thanks for your comments.
> >>>
> >>>
> >>>> Could you explain meaning of this property more precisely?
> >>>> I'm interested especially how is this related to the PWM IP block
> >>>> and boards.
> >>>>
> >>>
> >>> Yes. There are 8 channels most. While the pinctrls of 4th and 5th
> >>> channels could be used by uart's Rx and Tx, then these 2 channels
> >>> won't be used for pwm output, so there will be 6 channels available
> >>> by the pwm. Thus, the pwm chip will register only 6 pwms(6 channels)
> >>> most("fsl,pwm-channel-orders = {0 1 2 3 6 7}"). And also the
> >>> "fsl,pwm-channel-number" will be 6.
> >>
> >> If the chip has eight PWMs I would register all of them. If some of
> >> them are not routed out by the pinmux then just nothing happens if
> >> you use them. In a sane devicetree they won't be referenced anyway
> >> when they are not routed out of the SoC.
> >
> > In that case, shouldn't this be hooked up to the pinctrl subsystem as
> > well? As I understand the above, the logical thing would be for each
> > PWM channel's .request() operation to configure the pinmuxing
> > appropriately. And if it can't be configured as necessary then
> > .request() should return an error (or propagate the error from the
> > pinctrl subsystem).
> 
> I think the pin-muxing should be static, i.e. set up when the PWM device
> as a whole probe()s, rather than being twiddled at request/free time.
> Certainly the pinmux support in the device core is now set up to acquire
> the default state right before probe(). I don't see a need to do anything
> custom here.

As we discussed before in [PATCH 1/4]:
"
> > Why do you need to manipulate the pinctrl to en/disable a channel?
> >
> 
> This is because on the Vybrid VF610 TOWER board there are 4 LEDs. One
> terminal of each LED (the diode's anode) is connected to 3.3V, and the
> other terminal is connected to one PWM channel. When the 4 pins are all
> configured as enabled at the same time, the 4 pins are at low voltage and
> the 4 LEDs are lit by default, so enabling or disabling one LED affects
> the others.
> 
> These pins belong to the PWM, and I don't think the LED driver or any
> other consumer can control them directly.
> So here I let each channel control its own pin.
>
"
For the reason above, I have to control the pinctrls separately.

If all the pinctrls are set in the default state, the 8 pins must be
controlled together, and the 4 LEDs will all be lit by default and will
influence each other.

Thanks very much.

--
Best Regards.
Xiubo





Re: [PATCH v7 4/7] cpufreq:boost:Kconfig: Provide support for software managed BOOST

2013-08-25 Thread Viresh Kumar

On 13 August 2013 15:38, Lukasz Majewski  wrote:
> For safety reasons new flag - CONFIG_CPU_FREQ_BOOST_SW has been added.
> Only after selecting "EXYNOS Frequency Overclocking - Software" Kconfig
> option the software managed boost is enabled. It also selects thermal
> subsystem to be compiled in. Thermal is necessary for disabling boost
> and cooling down the device when overheating detected.
>
> Boost _MUST_NOT_ work without thermal subsystem with properly defined
> overheating temperatures.
>
> This option doesn't affect x86's ACPI hardware managed boost support
> (i.e. Intel, AMD). In this situation boost management is embedded at
> hardware.
>
> Signed-off-by: Lukasz Majewski 
> Signed-off-by: Myungjoo Ham 
>
> ---
> Changes for v7:
> - Remove superfluous "default n" definition
> - Generic CPU_FREQ_BOOST_SW depends on THERMAL

Acked-by: Viresh Kumar 

Re: [PATCH v7 3/7] thermal:boost: Automatic enable/disable of BOOST feature

2013-08-25 Thread Viresh Kumar

On 13 August 2013 15:38, Lukasz Majewski  wrote:
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index b5defd4..ec19da9 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -408,10 +408,24 @@ int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
>
>  void cpufreq_frequency_table_update_policy_cpu(struct cpufreq_policy *policy);
>  ssize_t cpufreq_show_cpus(const struct cpumask *mask, char *buf);
> +#ifdef CONFIG_CPU_FREQ
>  int cpufreq_boost_trigger_state(int state);
>  int cpufreq_boost_supported(void);
>  int cpufreq_boost_enabled(void);
> -
> +#else
> +static inline int cpufreq_boost_trigger_state(int state)
> +{
> +   return 0;
> +}
> +static inline int cpufreq_boost_supported(void)
> +{
> +   return 0;
> +}
> +static inline int cpufreq_boost_enabled(void)
> +{
> +   return 0;
> +}
> +#endif

I almost gave an Ack before I saw this :)
This should be moved to the first patch, I believe.

Re: [PATCH] cpuset: mm: Reduce large amounts of memory barrier related damage v3

2013-08-25 Thread Rik van Riel

On 08/23/2013 02:15 PM, Peter Zijlstra wrote:

> So I guess the quick and ugly solution is something like the below. 

This still crashes :)

> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1762,19 +1762,21 @@ unsigned slab_node(void)
>  static unsigned offset_il_node(struct mempolicy *pol,
>   struct vm_area_struct *vma, unsigned long off)
>  {
> - unsigned nnodes = nodes_weight(pol->v.nodes);
> - unsigned target;
> - int c;
> - int nid = -1;
> + unsigned nnodes, target;
> + int c, nid;
>  
> +again:
> + nnodes = nodes_weight(pol->v.nodes);
>   if (!nnodes)
>   return numa_node_id();
> +
>   target = (unsigned int)off % nnodes;
> - c = 0;
> - do {
> + for (c = 0, nid = -1; c <= target; c++)
>   nid = next_node(nid, pol->v.nodes);
> - c++;
> - } while (c <= target);
> +
> + if (unlikely((unsigned)nid >= MAX_NUMNODES))
> + goto again;

I'll go kick off a compile that replaces the conditional above with:

if (unlikely(!node_online(nid)))
goto again;

>   return nid;
>  }


-- 
All rights reversed

Re: [PATCH v7 2/7] cpufreq:acpi:x86: Adjust the acpi-cpufreq.c code to work with common boost solution

2013-08-25 Thread Viresh Kumar

On 13 August 2013 15:38, Lukasz Majewski  wrote:
> The Intel's hardware based boost solution driver has been changed to
> cooperate with common cpufreq boost framework.
>
> The global sysfs boost attribute entry code
> (/sys/devices/system/cpu/cpufreq/boost) has been moved to a core cpufreq
> code. This attribute is now only visible, when cpufreq driver supports it.
>
> The _store_boost() function has been redesigned to be used as set_boost
> callback.
>
> Signed-off-by: Lukasz Majewski 
> Signed-off-by: Myungjoo Ham 
>
> ---
> Changes for v7:
> - Remove superfluous acpi_cpufreq_driver.boost_supported = false at
>   acpi_cpufreq_boost_init()

Acked-by: Viresh Kumar 

Re: [PATCH v7 1/7] cpufreq: Add boost frequency support in core

2013-08-25 Thread Viresh Kumar

Some minor nitpicking, nothing much :)

On 13 August 2013 15:38, Lukasz Majewski  wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> +static int cpufreq_boost_set_sw(int state)
> +{
> +   struct cpufreq_frequency_table *freq_table;
> +   struct cpufreq_policy *policy;
> +   int ret = -EINVAL;
> +
> +   list_for_each_entry(policy, &cpufreq_policy_list, policy_list) {
> +   freq_table = cpufreq_frequency_get_table(policy->cpu);
> +   if (freq_table) {
> +   ret = cpufreq_frequency_table_cpuinfo(policy,
> +   freq_table);
> +   if (!ret) {
> +   policy->user_policy.max = policy->max;
> +   __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
> +   }

If ret isn't 0 (i.e. we failed), shouldn't we print an error message and
break out of the loop?
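
Roughly what I mean, as a standalone userspace sketch rather than a patch
(boost_update_all(), update_fn and the two demo callbacks are made-up
names; the real loop walks the policy list):

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical per-policy update: returns 0 on success, negative on error. */
typedef int (*update_fn)(int cpu);

static int calls;  /* counts callback invocations for the demo */

static int update_ok(int cpu)
{
    (void)cpu;
    calls++;
    return 0;
}

static int update_fail_cpu2(int cpu)
{
    calls++;
    return cpu == 2 ? -EIO : 0;
}

/* Walk all policies, but report and stop at the first failure. */
static int boost_update_all(const int *cpus, int n, update_fn update)
{
    int ret = -EINVAL;  /* nothing to update counts as an error */
    int i;

    for (i = 0; i < n; i++) {
        ret = update(cpus[i]);
        if (ret) {
            fprintf(stderr, "boost: cannot update policy of CPU %d (%d)\n",
                    cpus[i], ret);
            break;
        }
    }
    return ret;
}
```

This way a later success cannot overwrite an earlier error in ret, and the
log says which policy failed.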

> +   }
> +   }
> +
> +   return ret;
> +}
> +
> +int cpufreq_boost_trigger_state(int state)
> +{
> +   unsigned long flags;
> +   int ret = 0;
> +
> +   if (cpufreq_driver->boost_enabled == state)
> +   return 0;
> +
> +   write_lock_irqsave(&cpufreq_driver_lock, flags);
> +   cpufreq_driver->boost_enabled = state;
> +   write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +
> +   ret = cpufreq_driver->set_boost(state);
> +   if (ret) {
> +   write_lock_irqsave(&cpufreq_driver_lock, flags);
> +   cpufreq_driver->boost_enabled = !state;
> +   write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +
> +   pr_err("%s: Cannot %s BOOST\n", __func__,
> +  state ? "enabled" : "disabled");

s/enabled/enable and s/disabled/disable

> +   }
> +
> +   return ret;
> +}
> +

Re: Commit 9a11899 (USB: OHCI: add missing PCI PM callbacks to ohci-pci.c) breaks several builds

2013-08-25 Thread Guenter Roeck


On 08/25/2013 09:26 PM, Greg Kroah-Hartman wrote:
> On Sun, Aug 25, 2013 at 09:02:15PM -0700, Guenter Roeck wrote:
>> On 08/25/2013 08:30 PM, Guenter Roeck wrote:
>>> Broken builds:
>>>   mips:ath79_defconfig
>>>   parisc:defconfig
>>>   sparc32:defconfig
>>>   sparc64:defconfig
>>>   tile:defconfig
>>>
>>
>> Add:
>> powerpc:ppc64e_defconfig
>> powerpc:cell_defconfig
>> powerpc:maple_defconfig
>>
>> That makes it 8 out of 82 builds, or roughly 10% of all builds.
>>
>> My qemu test build for powerpc (which has its own config file) fails as well :(.
>>
>
> Ugh, I got no reports of this from linux-next or the 0-day build system,
> odd.
>



Maybe the build system isn't as comprehensive as mine, or it just takes a bit
longer.
My system caught it because you updated linux-stable to 3.11-rc7, which
triggers all builds.

The functions are only declared if CONFIG_PM is defined, yet are called
unconditionally.
This means the patch breaks in all configurations where CONFIG_PM is undefined
and CONFIG_USB_OHCI_HCD_PCI is defined.
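
One common way to avoid this class of breakage is to provide empty stubs
when CONFIG_PM is off, so that callers need no #ifdef of their own. A
minimal standalone sketch follows; demo_suspend(), demo_resume() and
demo_power_cycle() are made-up names, and this is not necessarily how the
OHCI code should be fixed.

```c
/* Build with -DCONFIG_PM for the "real" callbacks; otherwise stubs are used. */
#ifdef CONFIG_PM
static int demo_suspend(void) { return 1; }  /* pretend work was done */
static int demo_resume(void)  { return 1; }
#else
/* Stubs keep unconditional callers compiling when CONFIG_PM is undefined. */
static inline int demo_suspend(void) { return 0; }
static inline int demo_resume(void)  { return 0; }
#endif

/* The caller is identical in both configurations. */
static int demo_power_cycle(void)
{
    return demo_suspend() + demo_resume();
}
```

The alternative is to wrap the call sites themselves in #ifdef CONFIG_PM;
either way, both configurations have to be build-tested.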

Add this to the agenda at the kernel summit: We need more automated test
coverage.
It is a bit scary that my little pc-based server farm catches problems like this
faster than everything else out there. And it isn't even the kind of problem
I am looking for, but rather a side effect of the builds I am running.

Guenter


Re: [RFC PATCH 01/14] cpufreq: cpufreq-cpu0: add dt node parsing for 'cooling-zones'

2013-08-25 Thread Viresh Kumar

On 24 August 2013 04:45, Eduardo Valentin  wrote:
> diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
> index ad1fde2..ede6487 100644
> --- a/drivers/cpufreq/cpufreq-cpu0.c
> +++ b/drivers/cpufreq/cpufreq-cpu0.c
> @@ -20,6 +20,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 

In alphabetical order please..

> @@ -268,6 +272,13 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
> goto out_free_table;
> }
>
> +   /*
> +* For now, just loading the cooling device;
> +* thermal DT code takes care of matching them.
> +*/
> +   if (of_find_property(np, "cooling-zones", NULL))
> +   cdev = cpufreq_cooling_register(cpu_present_mask);

Should we check whether it passed or failed? And if it failed, at least print
an appropriate message?
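
Something along these lines, shown as a userspace model. The demo_*()
helpers below are look-alikes written for this sketch, not the kernel's
ERR_PTR()/IS_ERR()/PTR_ERR(), and demo_cooling_register() is a hypothetical
stand-in for the registration call:

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_MAX_ERRNO 4095

/* Userspace look-alikes of the kernel's error-pointer helpers. */
static inline void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

static int demo_cdev;  /* stands in for a registered cooling device */

/* Hypothetical registration that can fail with an encoded errno. */
static void *demo_cooling_register(int fail)
{
    return fail ? demo_err_ptr(-ENOMEM) : (void *)&demo_cdev;
}

/* Probe carries on without cooling, but says so. Returns 1 if registered. */
static int demo_probe(int fail)
{
    void *cdev = demo_cooling_register(fail);

    if (demo_is_err(cdev)) {
        fprintf(stderr, "failed to register cooling device: %ld\n",
                demo_ptr_err(cdev));
        return 0;
    }
    return 1;
}
```

Whether probe should fail outright or just warn is a separate question; the
point is that the return value gets looked at.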

Re: [PATCH 03/10] sched: Clean-up struct sd_lb_stat

2013-08-25 Thread Paul Turner

On Sun, Aug 25, 2013 at 7:56 PM, Lei Wen  wrote:
> On Tue, Aug 20, 2013 at 12:01 AM, Peter Zijlstra  wrote:
>> From: Joonsoo Kim 
>>
>> There is no reason to maintain separate variables for this_group
>> and busiest_group in sd_lb_stat, except saving some space.
>> But this structure is always allocated on the stack, so this saving
>> isn't really beneficial [peterz: reducing stack space is good; in this
>> case readability increases enough that I think it's still beneficial]
>>
>> This patch unifies these variables, so IMO, readability may be improved.
>>
>> Signed-off-by: Joonsoo Kim 
>> [peterz: lots of style edits, a few fixes and a rename]
>> Signed-off-by: Peter Zijlstra 
>> Link: http://lkml.kernel.org/r/1375778203-31343-4-git-send-email-iamjoonsoo@lge.com
>> ---
>>  kernel/sched/fair.c |  225 +---
>>  1 file changed, 112 insertions(+), 113 deletions(-)
>>
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4277,36 +4277,6 @@ static unsigned long task_h_load(struct
>>
> [snip]...
>> -   env->imbalance = DIV_ROUND_CLOSEST(
>> -   sds->max_load * sds->busiest->sgp->power, SCHED_POWER_SCALE);
>> +   env->imbalance = DIV_ROUND_CLOSEST(sds->busiest_stat.avg_load *
>> +   sds->busiest->sgp->power, SCHED_POWER_SCALE);
>>
>
> I am wondering whether changing this line as below would be more appropriate,
> since it would avoid the division here:
>    env->imbalance = (sds->busiest_stat.avg_load * sds->busiest->sgp->power)
>                     >> SCHED_POWER_SHIFT;
>
> I am not sure whether the compiler would be smart enough to convert this into
> a >> operation if it sees that SCHED_POWER_SCALE is 1024 here.

This would change the rounding.  Fortunately, gcc is smart enough to
handle this.
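
To make the rounding difference concrete, a small standalone sketch. The
helpers div_round_closest() and shift_down() and the DEMO_* constants are
written for this example; the first uses the usual (x + d/2) / d rule for
non-negative operands, with the divisor passed as a constant power of two.

```c
#define DEMO_SCALE 1024UL  /* stands in for SCHED_POWER_SCALE */
#define DEMO_SHIFT 10      /* stands in for SCHED_POWER_SHIFT */

/* Round to nearest: add half the divisor before dividing. */
static unsigned long div_round_closest(unsigned long x, unsigned long d)
{
    return (x + d / 2) / d;
}

/* Plain shift: always truncates toward zero. */
static unsigned long shift_down(unsigned long x)
{
    return x >> DEMO_SHIFT;
}
```

For x = 1536 the first gives (1536 + 512) / 1024 = 2 while the shift gives
1, so the two are not interchangeable. With an unsigned operand and a
constant power-of-two divisor, the division itself can still be compiled
to a shift after the add.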

RE: [PATCH v6 2/3] mmc: dw_mmc: Honor requests to set the clock to 0 (turn off clock)

2013-08-25 Thread Seungwon Jeon

On Fri, August 23, 2013, Doug Anderson wrote:
> Previously the dw_mmc driver would ignore any requests to disable the
> card's clock.  This doesn't seem like a good thing in general, but had
> one extra bad side effect in the following situation:
> * mmc core would set clk to 400kHz at boot time while initting
> * mmc core would set clk to 0 since no card, but it would be ignored.
> * suspend to ram and resume; clocks in the dw_mmc IP block are now 0
>   but dw_mmc thinks that they're 400kHz (it ignored the set to 0).
> * insert card
> * mmc core would set clk to 400kHz which would be considered a no-op.
> 
> Note that if there is no card in the slot and we do a suspend/resume
> cycle, we _do_ still end up with differences in a dw_mmc register
> dump, but the differences are clock related and we've got the clock
> disabled both before and after, so this should be OK.
> 
> Signed-off-by: Doug Anderson 
> ---
> Changes in v6:
> - Replaces previous patches that ensured saving/restoring clocks.
> 
>  drivers/mmc/host/dw_mmc.c | 21 +++--
>  1 file changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
> index ee5f167..f6c7545 100644
> --- a/drivers/mmc/host/dw_mmc.c
> +++ b/drivers/mmc/host/dw_mmc.c
> @@ -635,7 +635,11 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit)
>   u32 div;
>   u32 clk_en_a;
> 
> - if (slot->clock != host->current_speed || force_clkinit) {
> + if (slot->clock == 0) {
> + mci_writel(host, CLKENA, 0);
> + mci_send_cmd(slot,
> +  SDMMC_CMD_UPD_CLK | SDMMC_CMD_PRV_DAT_WAIT, 0);
Basically, the dw_mmc driver uses the host's low-power mode (auto clock gating).
So how about keeping the original code rather than programming the clock setting to '0'?
 
> + } else if (slot->clock != host->current_speed || force_clkinit) {
And then, if a condition ('slot->clock is not zero') is added to allow setting
the clock, the print messages that Jaehoon pointed out would be solved.

Thanks,
Seungwon Jeon

>   div = host->bus_hz / slot->clock;
>   if (host->bus_hz % slot->clock && host->bus_hz > slot->clock)
>   /*
> @@ -675,9 +679,8 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit)
>   /* inform CIU */
>   mci_send_cmd(slot,
>SDMMC_CMD_UPD_CLK | SDMMC_CMD_PRV_DAT_WAIT, 0);
> -
> - host->current_speed = slot->clock;
>   }
> + host->current_speed = slot->clock;
> 
>   /* Set the current slot bus width */
>   mci_writel(host, CTYPE, (slot->ctype << slot->id));
> @@ -807,13 +810,11 @@ static void dw_mci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
> 
>   mci_writel(slot->host, UHS_REG, regs);
> 
> - if (ios->clock) {
> - /*
> -  * Use mirror of ios->clock to prevent race with mmc
> -  * core ios update when finding the minimum.
> -  */
> - slot->clock = ios->clock;
> - }
> + /*
> +  * Use mirror of ios->clock to prevent race with mmc
> +  * core ios update when finding the minimum.
> +  */
> + slot->clock = ios->clock;
> 
>   if (drv_data && drv_data->set_ios)
>   drv_data->set_ios(slot->host, ios);
> --
> 1.8.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/16] cpufreq: create & use cpufreq_generic_get() routine

2013-08-25 Thread Viresh Kumar

On 24 August 2013 20:20, Rafael J. Wysocki  wrote:
> OK, let me rephrase that more directly: Please, slow down.  Allow your previous
> changes to be integrated before you throw more of them at people.

Okay, I will try :)

Btw, Are you going to take any of my patches for 3.12?

[PATCH 3/3] arcmsr: Fix bug of updating adapter firmware through ioctl(ARCHTTP) interface

2013-08-25 Thread 黃清隆


From: Ching 

Fix bug of updating adapter firmware through ioctl(ARCHTTP) interface.
Signed-off-by: Ching 
---

patch3
Description: Binary data

Re: Unusually high system CPU usage with recent kernels

2013-08-25 Thread Paul E. McKenney

On Sun, Aug 25, 2013 at 09:50:21PM +0200, Tibor Billes wrote:
> From: Paul E. McKenney Sent: 08/24/13 11:03 PM
> > On Sat, Aug 24, 2013 at 09:59:45PM +0200, Tibor Billes wrote:
> > > From: Paul E. McKenney Sent: 08/22/13 12:09 AM
> > > > On Wed, Aug 21, 2013 at 11:05:51PM +0200, Tibor Billes wrote:
> > > > > > From: Paul E. McKenney Sent: 08/21/13 09:12 PM
> > > > > > On Wed, Aug 21, 2013 at 08:14:46PM +0200, Tibor Billes wrote:
> > > > > > > > From: Paul E. McKenney Sent: 08/20/13 11:43 PM
> > > > > > > > On Tue, Aug 20, 2013 at 10:52:26PM +0200, Tibor Billes wrote:
> > > > > > > > > > From: Paul E. McKenney Sent: 08/20/13 04:53 PM
> > > > > > > > > > > On Tue, Aug 20, 2013 at 08:01:28AM +0200, Tibor Billes wrote:
> > > > > > > > > > > > Hi,
> > > > > > > > > > > > 
> > > > > > > > > > > > I was using the 3.9.7 stable release and tried to upgrade to the 3.10.x series.
> > > > > > > > > > > > The 3.10.x series was showing unusually high (>75%) system CPU usage in some
> > > > > > > > > > > > situations, making things really slow. The latest stable I tried is 3.10.7.
> > > > > > > > > > > > I also tried 3.11-rc5, they both show this behaviour. This behaviour doesn't
> > > > > > > > > > > > show up when the system is idling, only when doing some CPU intensive work,
> > > > > > > > > > > > like compiling with multiple threads. Compiling with only one thread seems not
> > > > > > > > > > > > to trigger this behaviour.
> > > > > > > > > > > > 
> > > > > > > > > > > > To be more precise I did a `perf record -a` while compiling a large C++ program
> > > > > > > > > > > > with scons using 4 threads, the result is appended at the end of this email.
> > > > > > > > > > 
> > > > > > > > > > New one on me! You are running a mainstream system 
> > > > > > > > > > (x86_64), so I am
> > > > > > > > > > surprised no one else noticed.
> > > > > > > > > > 
> > > > > > > > > > Could you please send along your .config file?
> > > > > > > > > 
> > > > > > > > > Here it is
> > > > > > > > 
> > > > > > > > > Interesting. I don't see RCU stuff all that high on the list, but
> > > > > > > > > the items I do see lead me to suspect RCU_FAST_NO_HZ, which has some
> > > > > > > > > relevance to the otherwise inexplicable group of commits you located
> > > > > > > > > with your bisection. Could you please rerun with CONFIG_RCU_FAST_NO_HZ=n?
> > > > > > > > 
> > > > > > > > If that helps, there are some things I could try.
> > > > > > > 
> > > > > > > > It did help. I didn't notice anything unusual when running with CONFIG_RCU_FAST_NO_HZ=n.
> > > > > > 
> > > > > > Interesting. Thank you for trying this -- and we at least have a
> > > > > > short-term workaround for this problem. I will put a patch together
> > > > > > for further investigation.
> > > > > 
> > > > > I don't specifically need this config option so I'm fine without it in
> > > > > the long term, but I guess it's not supposed to behave like that.
> > > > 
> > > > > OK, good, we have a long-term workaround for your specific case,
> > > > even better. ;-)
> > > > 
> > > > But yes, there are situations where RCU_FAST_NO_HZ needs to work
> > > > a bit better. I hope you will bear with me with a bit more
> > > > testing...
> > > >
> > > > > > In the meantime, could you please tell me how you were measuring
> > > > > > > performance for your kernel builds? Wall-clock time required to complete
> > > > > > > one build? Number of builds completed per unit time? Something else?
> > > > > 
> > > > > Actually, I wasn't all this sophisticated. I have a system monitor
> > > > > applet on my top panel (using MATE, Linux Mint), four little graphs,
> > > > > one of which shows CPU usage. Different colors indicate different kind
> > > > > of CPU usage. Blue shows user space usage, red shows system usage, and
> > > > > two more for nice and iowait. During a normal compile it's almost
> > > > > completely filled with blue user space CPU usage, only the top few
> > > > > pixels show some iowait and system usage. With CONFIG_RCU_FAST_NO_HZ
> > > > > set, about 3/4 of the graph was red system CPU usage, the rest was
> > > > > blue. So I just looked for a pile of red on my graphs when I tested
> > > > > > different kernel builds. But compile speed was also so horrible that I couldn't
> > > > > > wait for the build to finish. Even the UI got unresponsive.
> > > > 
> > > > We have been having problems with CPU accounting, but this one looks
> > > > quite real.
> > > > 
> > > > > Now I did some measuring. In the normal case a compile finished in 36
> > > > > seconds, compiled 315 object files. Here are some output lines from
> > > > > dstat -tasm --vm during compile:
> > > > > system total-cpu-usage -dsk/total- -net/total- 
> > > >

Re: Commit 9a11899 (USB: OHCI: add missing PCI PM callbacks to ohci-pci.c) breaks several builds

2013-08-25 Thread Greg Kroah-Hartman

On Sun, Aug 25, 2013 at 09:02:15PM -0700, Guenter Roeck wrote:
> On 08/25/2013 08:30 PM, Guenter Roeck wrote:
> > Broken builds:
> >  mips:ath79_defconfig
> >  parisc:defconfig
> >  sparc32:defconfig
> >  sparc64:defconfig
> >  tile:defconfig
> >
> Add:
>   powerpc:ppc64e_defconfig
>   powerpc:cell_defconfig
>   powerpc:maple_defconfig
> 
> That makes it 8 out of 82 builds, or roughly 10% of all builds.
> 
> My qemu test build for powerpc (which has its own config file) fails as well :(.

Ugh, I got no reports of this from linux-next or the 0-day build system,
odd.

Alan, can you send a follow-on patch to fix this?

thanks,

greg k-h

[PATCH 2/3] arcmsr: Fix command throttling for ARC188x series

2013-08-25 Thread 黃清隆


From: Ching 

Fix command throttling for ARC188x series adapter.
Signed-off-by: Ching  
---

patch2
Description: Binary data

Re: [PATCH 3/3] raid6/test: replace echo -e with printf

2013-08-25 Thread NeilBrown

On Thu, 22 Aug 2013 17:09:11 +0200 "H. Peter Anvin"  wrote:

> Acked-by: H. Peter Anvin 

Applied with the Ack - thanks.

NeilBrown

> 
> Max Filippov  wrote:
> >-e is a non-standard echo option, echo output is
> >implementation-dependent when it is used. Replace echo -e with printf
> >as suggested by POSIX echo manual.
> >
> >Cc: NeilBrown 
> >Cc: Jim Kukunas 
> >Cc: "H. Peter Anvin" 
> >Cc: Yuanhan Liu 
> >Signed-off-by: Max Filippov 
> >---
> > lib/raid6/test/Makefile |2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> >diff --git a/lib/raid6/test/Makefile b/lib/raid6/test/Makefile
> >index 087332d..73b0151 100644
> >--- a/lib/raid6/test/Makefile
> >+++ b/lib/raid6/test/Makefile
> >@@ -28,7 +28,7 @@ ifeq ($(IS_X86),yes)
> > gcc -c -x assembler - >&/dev/null &&\
> > rm ./-.o && echo -DCONFIG_AS_AVX2=1)
> > else
> >-HAS_ALTIVEC := $(shell echo -e '\#include <altivec.h>\nvector int a;' |\
> >+HAS_ALTIVEC := $(shell printf '\#include <altivec.h>\nvector int a;\n' |\
> >  gcc -c -x c - >&/dev/null && \
> >  rm ./-.o && echo yes)
> > ifeq ($(HAS_ALTIVEC),yes)
> 




[PATCH 1/3] arcmsr: Support Areca new SATA Raid Adapter ARC1214/1224/1264/1284

2013-08-25 Thread 黃清隆


From: Ching 


Support Areca new SATA Raid adapter ARC1214/1224/1264/1284.
Modify the maximum outstanding command number and notify command completion
with auto request sense.

Signed-off-by: Ching  


[PATCH] vme: vme_ca91cx42.c: fix to pass correct device identity to free_irq()

2013-08-25 Thread Wei Yongjun

From: Wei Yongjun 

free_irq() expects the same device identity that was passed to
corresponding request_irq(), otherwise the IRQ is not freed.

Signed-off-by: Wei Yongjun 
---
 drivers/vme/bridges/vme_ca91cx42.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vme/bridges/vme_ca91cx42.c b/drivers/vme/bridges/vme_ca91cx42.c
index 64bfea3..f24e234 100644
--- a/drivers/vme/bridges/vme_ca91cx42.c
+++ b/drivers/vme/bridges/vme_ca91cx42.c
@@ -243,6 +243,8 @@ static int ca91cx42_irq_init(struct vme_bridge *ca91cx42_bridge)
 static void ca91cx42_irq_exit(struct ca91cx42_driver *bridge,
struct pci_dev *pdev)
 {
+   struct vme_bridge *ca91cx42_bridge;
+
/* Disable interrupts from PCI to VME */
iowrite32(0, bridge->base + VINT_EN);
 
@@ -251,7 +253,9 @@ static void ca91cx42_irq_exit(struct ca91cx42_driver *bridge,
/* Clear Any Pending PCI Interrupts */
iowrite32(0x00FF, bridge->base + LINT_STAT);
 
-   free_irq(pdev->irq, pdev);
+   ca91cx42_bridge = container_of((void *)bridge, struct vme_bridge,
+  driver_priv);
+   free_irq(pdev->irq, ca91cx42_bridge);
 }
 
 static int ca91cx42_iack_received(struct ca91cx42_driver *bridge, int level)
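
The point of the commit message, that freeing an IRQ only works with the
device identity used at request time, can be modeled in user space. This is
a rough sketch and not kernel code: model_request_irq() and model_free_irq()
are hypothetical names, and unlike the kernel's free_irq() the model returns
a status so that a mismatch is observable.

```c
#include <stddef.h>

#define MODEL_MAX_ACTIONS 4

/* Hypothetical stand-in for the kernel's per-IRQ action list. */
struct model_action {
	int irq;
	void *dev_id;
	int in_use;
};

static struct model_action model_actions[MODEL_MAX_ACTIONS];

/* Register a handler slot for (irq, dev_id); returns 0 on success. */
static int model_request_irq(int irq, void *dev_id)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_ACTIONS; i++) {
		if (!model_actions[i].in_use) {
			model_actions[i].irq = irq;
			model_actions[i].dev_id = dev_id;
			model_actions[i].in_use = 1;
			return 0;
		}
	}
	return -1;
}

/* Release only the slot whose irq AND dev_id both match; -1 if none does. */
static int model_free_irq(int irq, void *dev_id)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_ACTIONS; i++) {
		if (model_actions[i].in_use &&
		    model_actions[i].irq == irq &&
		    model_actions[i].dev_id == dev_id) {
			model_actions[i].in_use = 0;
			return 0;
		}
	}
	return -1;
}
```

Requesting with one pointer and freeing with another leaves the slot
registered in this model, which mirrors the situation the patch addresses.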


Re: Commit 9a11899 (USB: OHCI: add missing PCI PM callbacks to ohci-pci.c) breaks several builds

2013-08-25 Thread Guenter Roeck


On 08/25/2013 08:30 PM, Guenter Roeck wrote:

Broken builds:
 mips:ath79_defconfig
 parisc:defconfig
 sparc32:defconfig
 sparc64:defconfig
 tile:defconfig


Add:
powerpc:ppc64e_defconfig
powerpc:cell_defconfig
powerpc:maple_defconfig

That makes it 8 out of 82 builds, or roughly 10% of all builds.

My qemu test build for powerpc (which has its own config file) fails as well :(.

Guenter


Error log:
drivers/usb/host/ohci-pci.c: In function 'ohci_pci_init':
drivers/usb/host/ohci-pci.c:309:35: error: 'ohci_suspend' undeclared (first use in this function)
drivers/usb/host/ohci-pci.c:309:35: note: each undeclared identifier is reported only once for each function it appears in
drivers/usb/host/ohci-pci.c:310:34: error: 'ohci_resume' undeclared (first use in this function)
make[3]: *** [drivers/usb/host/ohci-pci.o] Error 1

Guenter



Re: unused swap offset / bad page map.

2013-08-25 Thread Hillf Danton

On Fri, Aug 23, 2013 at 11:53 AM, Dave Jones  wrote:
>
> It actually seems worse, seems I can trigger it even easier now, as if
> there's a leak.
>
Can you please try the new fix for TLB flush?

commit  2b047252d087be7f2ba
Fix TLB gather virtual address range invalidation corner cases

Re: [PATCH v6 2/3] mmc: dw_mmc: Honor requests to set the clock to 0 (turn off clock)

2013-08-25 Thread Doug Anderson

Jaehoon,

On Sun, Aug 25, 2013 at 6:31 PM, Jaehoon Chung  wrote:
> Hi Doug,
>
> On 08/24/2013 05:40 AM, Doug Anderson wrote:
>> Jaehoon,
>>
>> On Fri, Aug 23, 2013 at 6:21 AM, Jaehoon Chung wrote:
>>> Hi Doug,
>>>
>>> If clock gating is enabled, then the kernel message for Bus_speed may be
>>> printed continuously.
>>
>> Can you explain?  I don't think dw_mmc has support for clock gating
>> right now.  ...or are there some patches that I'm not aware of?  I
>> could believe that if you've got some non-upstream clock gating
>> patches that these would need to be modified to handle it...  ...but
>> unless those are slated to land upstream it seems like I can't take
>> them into account, can I?
> If I enable CONFIG_MMC_CLK_GATE, then I see the message below whenever
> some operation is run.
> I will test more with your patch.

Ah, sorry!  I wasn't aware of that config option.  I was thinking of
automatic clock gating based on something like the common clock
framework.  When there are no more users of a gate clock it will get
turned off.  To have that work dw_mmc would need to release its biu /
ciu clocks at some point.

If I had to guess, I'd speculate that perhaps we should just change
the printout to a dev_dbg(), though I do find that printout
incredibly useful.  If I had to guess I'd say that the mmc core is
switching between a clock of 0 and a full speed clock constantly.  If
that's true then it means that dw_mmc used to treat that like a no-op.
 Now it actually gates the clock.  If you comment out the printout, do
things still work?  Does your power consumption go down?

Let me know if you find anything.  Otherwise I can try to reproduce this week.

-Doug

[PATCH] net: xilinx: fix memleak

2013-08-25 Thread Libo Chen


Decrease the refcount of device_node np1 in the error case.

Signed-off-by: Libo Chen 
---
 drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
index e90e1f4..64b4639 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
@@ -175,6 +175,7 @@ int axienet_mdio_setup(struct axienet_local *lp, struct device_node *np)
printk(KERN_WARNING "Setting MDIO clock divisor to "
   "default %d\n", DEFAULT_CLOCK_DIVISOR);
clk_div = DEFAULT_CLOCK_DIVISOR;
+   of_node_put(np1);
goto issue;
}

-- 
1.7.1
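
The leak fixed above follows the usual get/put pattern: every reference
taken must be dropped on every exit path, including early error exits. A
minimal user-space sketch of that pattern is shown here. struct model_node
and the model_* functions are hypothetical and only imitate the refcount
aspect of a device_node; they are not the OF API.

```c
/* Hypothetical stand-in for struct device_node; only the refcount is modeled. */
struct model_node {
	int refcount;
	int has_clock_frequency;
};

/* Take a reference, as a node lookup would. */
static struct model_node *model_node_get(struct model_node *n)
{
	n->refcount++;
	return n;
}

/* Drop a reference. */
static void model_node_put(struct model_node *n)
{
	n->refcount--;
}

/* Returns 0 on success, -1 on fallback; the reference is dropped either way. */
static int model_setup(struct model_node *cpu)
{
	struct model_node *np1 = model_node_get(cpu);

	if (!np1->has_clock_frequency) {
		model_node_put(np1); /* the put that was missing on the error path */
		return -1;
	}
	model_node_put(np1);
	return 0;
}
```

Without the put on the early-exit branch, the count in this model would grow
by one on every failed call.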


Commit 9a11899 (USB: OHCI: add missing PCI PM callbacks to ohci-pci.c) breaks several builds

2013-08-25 Thread Guenter Roeck


Broken builds:
mips:ath79_defconfig
parisc:defconfig
sparc32:defconfig
sparc64:defconfig
tile:defconfig

Error log:
drivers/usb/host/ohci-pci.c: In function 'ohci_pci_init':
drivers/usb/host/ohci-pci.c:309:35: error: 'ohci_suspend' undeclared (first use in this function)
drivers/usb/host/ohci-pci.c:309:35: note: each undeclared identifier is reported only once for each function it appears in
drivers/usb/host/ohci-pci.c:310:34: error: 'ohci_resume' undeclared (first use in this function)
make[3]: *** [drivers/usb/host/ohci-pci.o] Error 1

Guenter

Re: [PATCH] driver core / ACPI: Avoid device removal locking problems

2013-08-25 Thread Gu Zheng

Hi Rafael,

On 08/26/2013 04:09 AM, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki 
> 
> There are two mutexes, device_hotplug_lock and acpi_scan_lock, held
> around the acpi_bus_trim() call in acpi_scan_hot_remove() which
> generally removes devices (it removes ACPI device objects at least,
> but it may also remove "physical" device objects through .detach()
> callbacks of ACPI scan handlers).  Thus, potentially, device sysfs
> attributes are removed under these locks and to remove those
> attributes it is necessary to hold the s_active references of their
> directory entries for writing.
> 
> On the other hand, the execution of a .show() or .store() callback
> from a sysfs attribute is carried out with that attribute's s_active
> reference held for reading.  Consequently, if any device sysfs
> attribute that may be removed from within acpi_scan_hot_remove()
> through acpi_bus_trim() has a .store() or .show() callback which
> acquires either acpi_scan_lock or device_hotplug_lock, the execution
> of that callback may deadlock with the removal of the attribute.
> [Unfortunately, the "online" device attribute of CPUs and memory
> blocks and the "eject" attribute of ACPI device objects are affected
> by this issue.]
> 
> To avoid those deadlocks introduce a new protection mechanism that
> can be used by the device sysfs attributes in question.  Namely,
> if a device sysfs attribute's .store() or .show() callback routine
> is about to acquire device_hotplug_lock or acpi_scan_lock, it can
> first execute read_lock_device_remove() and return an error code if
> that function returns false.  If true is returned, the lock in
> question may be acquired and read_unlock_device_remove() must be
> called.  [This mechanism is implemented by means of an additional
> rwsem in drivers/base/core.c.]
> 
> Make the affected sysfs attributes in the driver core and ACPI core
> use read_lock_device_remove() and read_unlock_device_remove() as
> described above.
> 
> Signed-off-by: Rafael J. Wysocki 
> Reported-by: Gu Zheng 

I'm sorry, I forgot to mention that the original reporter is
Yasuaki Ishimatsu. I continued the investigation and found more issues.

We tested this patch on kernel 3.11-rc6, but it seems that the
issue is still there. Detailed information follows.

Thanks,
Gu

==
[ INFO: possible circular locking dependency detected ]
3.11.0-rc6-lockdebug-refea+ #162 Tainted: GF
---
kworker/0:2/754 is trying to acquire lock:
 (s_active#73){.+}, at: [] sysfs_addrm_finish+0x3b/0x70

but task is already holding lock:
 (mem_sysfs_mutex){+.+.+.}, at: [] remove_memory_block+0x1d/0xa0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (mem_sysfs_mutex){+.+.+.}:

[v4][PATCH 3/8] book3e/kexec/kdump: enable kexec for kernel

2013-08-25 Thread Tiejun Chen

We need to activate KEXEC for book3e and bypass or convert the non-book3e
stuff in the kexec code.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/Kconfig   |2 +-
 arch/powerpc/kernel/machine_kexec_64.c |  148 ++--
 arch/powerpc/kernel/misc_64.S  |6 ++
 3 files changed, 89 insertions(+), 67 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 7205989..3c91ad0 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -378,7 +378,7 @@ config ARCH_ENABLE_MEMORY_HOTREMOVE
 
 config KEXEC
bool "kexec system call"
-   depends on (PPC_BOOK3S || FSL_BOOKE || (44x && !SMP))
+   depends on (PPC_BOOK3S || FSL_BOOKE || (44x && !SMP)) || PPC_BOOK3E
help
  kexec is a system call that implements the ability to shutdown your
  current kernel, and to start another kernel.  It is like a reboot
diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
index 611acdf..ee153a8 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -30,72 +30,6 @@
 #include 
 #include 
 
-int default_machine_kexec_prepare(struct kimage *image)
-{
-   int i;
-   unsigned long begin, end;   /* limits of segment */
-   unsigned long low, high;/* limits of blocked memory range */
-   struct device_node *node;
-   const unsigned long *basep;
-   const unsigned int *sizep;
-
-   if (!ppc_md.hpte_clear_all)
-   return -ENOENT;
-
-   /*
-* Since we use the kernel fault handlers and paging code to
-* handle the virtual mode, we must make sure no destination
-* overlaps kernel static data or bss.
-*/
-   for (i = 0; i < image->nr_segments; i++)
-   if (image->segment[i].mem < __pa(_end))
-   return -ETXTBSY;
-
-   /*
-* For non-LPAR, we absolutely can not overwrite the mmu hash
-* table, since we are still using the bolted entries in it to
-* do the copy.  Check that here.
-*
-* It is safe if the end is below the start of the blocked
-* region (end <= low), or if the beginning is after the
-* end of the blocked region (begin >= high).  Use the
-* boolean identity !(a || b)  === (!a && !b).
-*/
-   if (htab_address) {
-   low = __pa(htab_address);
-   high = low + htab_size_bytes;
-
-   for (i = 0; i < image->nr_segments; i++) {
-   begin = image->segment[i].mem;
-   end = begin + image->segment[i].memsz;
-
-   if ((begin < high) && (end > low))
-   return -ETXTBSY;
-   }
-   }
-
-   /* We also should not overwrite the tce tables */
-   for_each_node_by_type(node, "pci") {
-   basep = of_get_property(node, "linux,tce-base", NULL);
-   sizep = of_get_property(node, "linux,tce-size", NULL);
-   if (basep == NULL || sizep == NULL)
-   continue;
-
-   low = *basep;
-   high = low + (*sizep);
-
-   for (i = 0; i < image->nr_segments; i++) {
-   begin = image->segment[i].mem;
-   end = begin + image->segment[i].memsz;
-
-   if ((begin < high) && (end > low))
-   return -ETXTBSY;
-   }
-   }
-
-   return 0;
-}
-
 #define IND_FLAGS (IND_DESTINATION | IND_INDIRECTION | IND_DONE | IND_SOURCE)
 
 static void copy_segments(unsigned long ind)
@@ -367,6 +301,87 @@ void default_machine_kexec(struct kimage *image)
/* NOTREACHED */
 }
 
+#ifdef CONFIG_PPC_BOOK3E
+int default_machine_kexec_prepare(struct kimage *image)
+{
+   int i;
+   /*
+* Since we use the kernel fault handlers and paging code to
+* handle the virtual mode, we must make sure no destination
+* overlaps kernel static data or bss.
+*/
+   for (i = 0; i < image->nr_segments; i++)
+   if (image->segment[i].mem < __pa(_end))
+   return -ETXTBSY;
+   return 0;
+}
+#else /* CONFIG_PPC_BOOK3E */
+int default_machine_kexec_prepare(struct kimage *image)
+{
+   int i;
+   unsigned long begin, end;   /* limits of segment */
+   unsigned long low, high;/* limits of blocked memory range */
+   struct device_node *node;
+   const unsigned long *basep;
+   const unsigned int *sizep;
+
+   if (!ppc_md.hpte_clear_all)
+   return -ENOENT;
+
+   /*
+* Since we use the kernel fault handlers and paging code to
+* handle the virtual mode, we must make sure no destination
+* overlaps kernel static data or bss.
+*/
+   for (i = 0; i < image->nr_segments; i++)
+   if (image->segment[i].mem <

[v4][PATCH 2/8] powerpc/book3e: support CONFIG_RELOCATABLE

2013-08-25 Thread Tiejun Chen

book3e differs from book3s in that book3s includes the exception
vector code in head_64.S, as it relies on absolute addressing,
which is only possible within this compilation unit. So we have
to get that label address via the GOT.

And when booting a relocated kernel, we should reset IVPR properly again
after .relocate.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/include/asm/exception-64e.h |   11 +++
 arch/powerpc/kernel/exceptions-64e.S |   18 +-
 arch/powerpc/kernel/head_64.S|   25 +
 3 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h
index 51fa43e..371a77f 100644
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -214,10 +214,21 @@ exc_##label##_book3e:
 #define TLB_MISS_STATS_SAVE_INFO_BOLTED
 #endif
 
+#ifndef CONFIG_RELOCATABLE
 #define SET_IVOR(vector_number, vector_offset) \
li  r3,vector_offset@l; \
ori r3,r3,interrupt_base_book3e@l;  \
mtspr   SPRN_IVOR##vector_number,r3;
+#else /* !CONFIG_RELOCATABLE */
+/* In relocatable case the value of the constant expression 'expr' is only
+ * offset. So instead, we should loads the address of label 'name'.
+ */
+#define SET_IVOR(vector_number, vector_offset) \
+   LOAD_REG_ADDR(r3,interrupt_base_book3e);\
+   rlwinm  r3,r3,0,15,0;   \
+   ori r3,r3,vector_offset@l;  \
+   mtspr   SPRN_IVOR##vector_number,r3;
+#endif /* CONFIG_RELOCATABLE */
 
 #endif /* _ASM_POWERPC_EXCEPTION_64E_H */
 
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 99cb68e..e71511c 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1097,7 +1097,15 @@ skpinv:  addi    r6,r6,1 /* Increment */
  * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping
  */
/* Now we branch the new virtual address mapped by this entry */
+#ifdef CONFIG_RELOCATABLE
+   /* We have to find out address from lr. */
+   bl  1f  /* Find our address */
+1: mflr    r6
+   addi    r6,r6,(2f - 1b)
+   tovirt(r6,r6)
+#else
LOAD_REG_IMMEDIATE(r6,2f)
+#endif
lis r7,MSR_KERNEL@h
ori r7,r7,MSR_KERNEL@l
mtspr   SPRN_SRR0,r6
@@ -1348,9 +1356,17 @@ _GLOBAL(book3e_secondary_thread_init)
mflr    r28
b   3b
 
-_STATIC(init_core_book3e)
+_GLOBAL(init_core_book3e)
/* Establish the interrupt vector base */
+#ifdef CONFIG_RELOCATABLE
+/* In relocatable case the value of the constant expression 'expr' is only
+ * offset. So instead, we should loads the address of label 'name'.
+ */
+   tovirt(r2,r2)
+   LOAD_REG_ADDR(r3, interrupt_base_book3e)
+#else
LOAD_REG_IMMEDIATE(r3, interrupt_base_book3e)
+#endif
mtspr   SPRN_IVPR,r3
sync
blr
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 3d11d80..27cfbcd 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -414,12 +414,25 @@ _STATIC(__after_prom_start)
/* process relocations for the final address of the kernel */
lis r25,PAGE_OFFSET@highest /* compute virtual base of kernel */
sldi    r25,r25,32
+#if defined(CONFIG_PPC_BOOK3E)
+   tovirt(r26,r26) /* on booke, we already run at PAGE_OFFSET */
+#endif
lwz r7,__run_at_load-_stext(r26)
+#if defined(CONFIG_PPC_BOOK3E)
+   tophys(r26,r26) /* Restore for the remains. */
+#endif
cmplwi  cr0,r7,1/* flagged to stay where we are ? */
bne 1f
add r25,r25,r26
 1: mr  r3,r25
bl  .relocate
+#if defined(CONFIG_PPC_BOOK3E)
+   /* In relocatable case we always have to load the address of label 'name'
+* to set IVPR. So after .relocate we have to update IVPR with current
+* address of label.
+*/
+   bl  .init_core_book3e
+#endif
 #endif
 
 /*
@@ -447,12 +460,24 @@ _STATIC(__after_prom_start)
  * variable __run_at_load, if it is set the kernel is treated as relocatable
  * kernel, otherwise it will be moved to PHYSICAL_START
  */
+#if defined(CONFIG_PPC_BOOK3E)
+   tovirt(r26,r26) /* on booke, we already run at PAGE_OFFSET */
+#endif
lwz r7,__run_at_load-_stext(r26)
+#if defined(CONFIG_PPC_BOOK3E)
+   tophys(r26,r26) /* Restore for the remains. */
+#endif
cmplwi  cr0,r7,1
bne 3f
 
+#ifdef CONFIG_PPC_BOOK3E
+   LOAD_REG_ADDR(r5, __end_interrupts)
+   LOAD_REG_ADDR(r11, _stext)
+   sub r5,r5,r11
+#else
/* just copy interrupts */
LOAD_REG_IMMEDIATE(r5, __end_interrupts - _stext)
+#endif
b   5f
 3:
 #endif
-- 
1.7.9.5


[v4][PATCH 1/8] powerpc/book3e: rename interrupt_end_book3e with __end_interrupts

2013-08-25 Thread Tiejun Chen

Rename 'interrupt_end_book3e' to '__end_interrupts' so that book3s and
book3e can share this one label and use it conveniently.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/kernel/exceptions-64e.S |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 2d06704..99cb68e 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -309,8 +309,8 @@ interrupt_base_book3e:  /* fake trap */
EXCEPTION_STUB(0x300, hypercall)
EXCEPTION_STUB(0x320, ehpriv)
 
-   .globl interrupt_end_book3e
-interrupt_end_book3e:
+   .globl __end_interrupts
+__end_interrupts:
 
 /* Critical Input Interrupt */
START_EXCEPTION(critical_input);
@@ -493,7 +493,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
beq+    1f
 
LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
-   LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
+   LOAD_REG_IMMEDIATE(r15,__end_interrupts)
cmpld   cr0,r10,r14
cmpld   cr1,r10,r15
blt+    cr0,1f
@@ -559,7 +559,7 @@ kernel_dbg_exc:
beq+    1f
 
LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
-   LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
+   LOAD_REG_IMMEDIATE(r15,__end_interrupts)
cmpld   cr0,r10,r14
cmpld   cr1,r10,r15
blt+    cr0,1f
-- 
1.7.9.5


[v4][PATCH 8/8] book3e/kexec/kdump: recover "r4 = 0" to create the initial TLB

2013-08-25 Thread Tiejun Chen

Commit 96f013f, "powerpc/kexec: Add kexec "hold" support for Book3e
processors", requires that GPR4 survive the "hold" process, for IBM Blue
Gene/Q with some very strange firmware. But for FSL Book3E, r4 = 1
indicates that the initial TLB entry for this core already exists, so
we should still set r4 to 0 to create that initial TLB.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/kernel/head_64.S |4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index fa74d20..001b112 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -127,6 +127,10 @@ __secondary_hold:
/* Grab our physical cpu number */
mr  r24,r3
/* stash r4 for book3e */
+#ifdef CONFIG_PPC_FSL_BOOK3E
+   /* we need to setup initial TLB entry. */
+   li  r4,0
+#endif
mr  r25,r4
 
/* Tell the master cpu we're here */
-- 
1.7.9.5


[v4][PATCH 6/8] book3e/kexec/kdump: implement ppc64 kexec specfic

2013-08-25 Thread Tiejun Chen

The ppc64 kexec mechanism is implemented differently from the ppc32 one.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/platforms/85xx/smp.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/platforms/85xx/smp.c 
b/arch/powerpc/platforms/85xx/smp.c
index 549948a..137ad10 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -277,6 +277,7 @@ struct smp_ops_t smp_85xx_ops = {
 };
 
 #ifdef CONFIG_KEXEC
+#ifdef CONFIG_PPC32
 atomic_t kexec_down_cpus = ATOMIC_INIT(0);
 
 void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
@@ -295,6 +296,14 @@ static void mpc85xx_smp_kexec_down(void *arg)
if (ppc_md.kexec_cpu_down)
ppc_md.kexec_cpu_down(0,1);
 }
+#else
+void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
+{
+   local_irq_disable();
+   hard_irq_disable();
+   mpic_teardown_this_cpu(secondary);
+}
+#endif
 
 static void map_and_flush(unsigned long paddr)
 {
@@ -346,11 +355,14 @@ static void mpc85xx_smp_flush_dcache_kexec(struct kimage 
*image)
 
 static void mpc85xx_smp_machine_kexec(struct kimage *image)
 {
+#ifdef CONFIG_PPC32
int timeout = INT_MAX;
int i, num_cpus = num_present_cpus();
+#endif
 
mpc85xx_smp_flush_dcache_kexec(image);
 
+#ifdef CONFIG_PPC32
if (image->type == KEXEC_TYPE_DEFAULT)
smp_call_function(mpc85xx_smp_kexec_down, NULL, 0);
 
@@ -368,6 +380,7 @@ static void mpc85xx_smp_machine_kexec(struct kimage *image)
if ( i == smp_processor_id() ) continue;
mpic_reset_core(i);
}
+#endif
 
default_machine_kexec(image);
 }
-- 
1.7.9.5


[v4][PATCH 7/8] book3e/kexec/kdump: redefine VIRT_PHYS_OFFSET

2013-08-25 Thread Tiejun Chen

Book3e always creates its TLB mapping aligned to 1GB, so we should
use (KERNELBASE - MEMORY_START) as VIRT_PHYS_OFFSET to
get correct __pa/__va results when booting a kdump kernel.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/include/asm/page.h |2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 988c812..5b00081 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -112,6 +112,8 @@ extern long long virt_phys_offset;
 /* See Description below for VIRT_PHYS_OFFSET */
 #ifdef CONFIG_RELOCATABLE_PPC32
 #define VIRT_PHYS_OFFSET virt_phys_offset
+#elif defined(CONFIG_PPC_BOOK3E_64)
+#define VIRT_PHYS_OFFSET (KERNELBASE - MEMORY_START)
 #else
 #define VIRT_PHYS_OFFSET (KERNELBASE - PHYSICAL_START)
 #endif
-- 
1.7.9.5
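
For readers following the VIRT_PHYS_OFFSET change in this patch, here is a
minimal user-space sketch of the arithmetic. The address values and the
helper names va_from_pa()/pa_from_va() are hypothetical and chosen only for
illustration; the sketch assumes the TLB entry maps KERNELBASE onto
MEMORY_START, as the commit message describes.

```c
#include <stdint.h>

/* Hypothetical example values, not taken from the patch. */
#define KERNELBASE      0xc000000000000000ULL  /* linked virtual base */
#define MEMORY_START    0x0000000000000000ULL  /* first byte of RAM */
#define PHYSICAL_START  0x0000000002000000ULL  /* kdump kernel loaded at 32 MB */

/* Offset as redefined for CONFIG_PPC_BOOK3E_64 in the patch. */
#define VIRT_PHYS_OFFSET_BOOK3E   (KERNELBASE - MEMORY_START)
/* Offset from the generic #else branch. */
#define VIRT_PHYS_OFFSET_GENERIC  (KERNELBASE - PHYSICAL_START)

/* Simplified stand-ins for __va() and __pa(). */
static inline uint64_t va_from_pa(uint64_t pa, uint64_t offset)
{
	return pa + offset;
}

static inline uint64_t pa_from_va(uint64_t va, uint64_t offset)
{
	return va - offset;
}
```

With these assumed values, a physical address of 32 MB translates to
KERNELBASE + 32 MB under the Book3E offset, whereas the generic
(KERNELBASE - PHYSICAL_START) offset would place it at KERNELBASE itself.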


[v4][PATCH 5/8] book3e/kexec/kdump: introduce a kexec kernel flag

2013-08-25 Thread Tiejun Chen

We need to introduce a flag to indicate that we are already running
a kexec kernel, so that we can take the proper path. For example, we
shouldn't access the bootloader's spin_table to bring up any secondary
cpu for a kexec kernel, since the kexec kernel already knows how to
jump to generic_secondary_smp_init.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/include/asm/smp.h|1 +
 arch/powerpc/kernel/head_64.S |   10 ++
 arch/powerpc/kernel/misc_64.S |6 ++
 arch/powerpc/platforms/85xx/smp.c |   20 +++-
 4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 98da78e..92f7e61 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -207,6 +207,7 @@ extern void generic_secondary_thread_init(void);
 extern unsigned long __secondary_hold_spinloop;
 extern unsigned long __secondary_hold_acknowledge;
 extern char __secondary_hold;
+extern unsigned long __run_at_kexec;
 
 extern void __early_start(void);
 #endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index fd42f8a..fa74d20 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -89,6 +89,10 @@ __secondary_hold_spinloop:
 __secondary_hold_acknowledge:
.llong  0x0
 
+   .globl  __run_at_kexec
+__run_at_kexec:
+   .llong  0x0 /* Flag for the secondary kernel from kexec. */
+
 #ifdef CONFIG_RELOCATABLE
/* This flag is set to 1 by a loader if the kernel should run
 * at the loaded address instead of the linked address.  This
@@ -426,6 +430,7 @@ _STATIC(__after_prom_start)
add r25,r25,r26
 1: mr  r3,r25
bl  .relocate
+
 #if defined(CONFIG_PPC_BOOK3E)
/* In relocatable case we always have to load the address of label 
'name'
 * to set IVPR. So after .relocate we have to update IVPR with current
@@ -463,6 +468,11 @@ _STATIC(__after_prom_start)
 #if defined(CONFIG_PPC_BOOK3E)
tovirt(r26,r26) /* on booke, we already run at 
PAGE_OFFSET */
 #endif
+#if defined(CONFIG_KEXEC) || defined(CONFIG_CRASH_DUMP)
+   /* If relocated we need to restore this flag on that relocated address. 
*/
+   ld  r7,__run_at_kexec-_stext(r3)
+   std r7,__run_at_kexec-_stext(r26)
+#endif
lwz r7,__run_at_load-_stext(r26)
 #if defined(CONFIG_PPC_BOOK3E)
tophys(r26,r26) /* Restore for the remains. */
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 049be29..adf60b6 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -639,6 +639,12 @@ _GLOBAL(kexec_sequence)
bl  .copy_and_flush /* (dest, src, copy limit, start offset) */
 1: /* assume normal blr return */
 
+   /* notify we're going into kexec kernel for SMP. */
+   LOAD_REG_ADDR(r3,__run_at_kexec)
+   li  r4,1
+   std r4,0(r3)
+   sync
+
/* release other cpus to the new kernel secondary start at 0x60 */
mflr    r5
li  r6,1
diff --git a/arch/powerpc/platforms/85xx/smp.c 
b/arch/powerpc/platforms/85xx/smp.c
index ea9c626..549948a 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -150,6 +150,9 @@ static int smp_85xx_kick_cpu(int nr)
int hw_cpu = get_hard_smp_processor_id(nr);
int ioremappable;
int ret = 0;
+#ifdef CONFIG_PPC64
+   unsigned long *ptr = NULL;
+#endif
 
WARN_ON(nr < 0 || nr >= NR_CPUS);
WARN_ON(hw_cpu < 0 || hw_cpu >= NR_CPUS);
@@ -238,11 +241,18 @@ out:
 #else
smp_generic_kick_cpu(nr);
 
-   flush_spin_table(spin_table);
-   out_be32(&spin_table->pir, hw_cpu);
-   out_be64((u64 *)(&spin_table->addr_h),
- __pa((u64)*((unsigned long long *)generic_secondary_smp_init)));
-   flush_spin_table(spin_table);
+   ptr  = (unsigned long *)((unsigned long)&__run_at_kexec);
+   /* We shouldn't access spin_table from the bootloader to up any
+* secondary cpu for kexec kernel, and kexec kernel already
+* know how to jump to generic_secondary_smp_init.
+*/
+   if (!*ptr) {
+   flush_spin_table(spin_table);
+   out_be32(&spin_table->pir, hw_cpu);
+   out_be64((u64 *)(&spin_table->addr_h),
+__pa((u64)*((unsigned long long *)generic_secondary_smp_init)));
+   flush_spin_table(spin_table);
+   }
 #endif
 
local_irq_restore(flags);
-- 
1.7.9.5
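
The control flow this patch adds to smp_85xx_kick_cpu() can be sketched in
user space as below. This is only an illustration under stated assumptions:
struct spin_table_sim, run_at_kexec_sim and kick_cpu_sim() are hypothetical
stand-ins for the bootloader spin table, the __run_at_kexec flag and the
kick routine, and no real hardware access or cache flushing is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bootloader's spin table entry. */
struct spin_table_sim {
	uint64_t addr;     /* entry point the secondary cpu should jump to */
	uint32_t pir;      /* hardware cpu id */
	bool     written;  /* set when the table was touched */
};

/* Stand-in for __run_at_kexec: 0 on a normal boot, 1 after kexec set it. */
static unsigned long run_at_kexec_sim;

/* Only a normally booted kernel writes the spin table; a kexec kernel
 * skips it because the secondary cpus are no longer parked there. */
static void kick_cpu_sim(struct spin_table_sim *st, uint32_t hw_cpu,
			 uint64_t entry)
{
	if (!run_at_kexec_sim) {
		st->pir = hw_cpu;
		st->addr = entry;
		st->written = true;
	}
}
```

Keeping the flag in a single global that the old kernel sets just before
handing over mirrors what the patch does in kexec_sequence.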


[v4][PATCH 0/8] powerpc/book3e: support kexec and kdump

2013-08-25 Thread Tiejun Chen

Ben,

I haven't seen any further comments, so could you kindly merge this?

This patchset is used to support kexec and kdump on book3e.

Tested on fsl-p5040 DS.

v4:

* rebase on next branch

v3:

* add one patch to rename interrupt_end_book3e to __end_interrupts
  so that we can have a unique label for book3e and book3s.
* add some comments for "book3e/kexec/kdump: enable kexec for kernel"
* clean "book3e/kexec/kdump: introduce a kexec kernel flag"

v2:
* rebase on merge branch

v1:
* improve some patch head
* rebase on next branch with patch 7


Tiejun Chen (8):
  powerpc/book3e: rename interrupt_end_book3e with __end_interrupts
  powerpc/book3e: support CONFIG_RELOCATABLE
  book3e/kexec/kdump: enable kexec for kernel
  book3e/kexec/kdump: create a 1:1 TLB mapping
  book3e/kexec/kdump: introduce a kexec kernel flag
  book3e/kexec/kdump: implement ppc64 kexec specfic
  book3e/kexec/kdump: redefine VIRT_PHYS_OFFSET
  book3e/kexec/kdump: recover "r4 = 0" to create the initial TLB

 arch/powerpc/Kconfig |2 +-
 arch/powerpc/include/asm/exception-64e.h |   11 +++
 arch/powerpc/include/asm/page.h  |2 +
 arch/powerpc/include/asm/smp.h   |1 +
 arch/powerpc/kernel/exceptions-64e.S |   26 +-
 arch/powerpc/kernel/head_64.S|   48 +-
 arch/powerpc/kernel/machine_kexec_64.c   |  148 +-
 arch/powerpc/kernel/misc_64.S|   67 +-
 arch/powerpc/platforms/85xx/smp.c|   33 ++-
 9 files changed, 257 insertions(+), 81 deletions(-)

Tiejun

[v4][PATCH 4/8] book3e/kexec/kdump: create a 1:1 TLB mapping

2013-08-25 Thread Tiejun Chen

Book3e has no real MMU mode, so we have to create a 1:1 TLB
mapping to make sure we can access the real physical address.
Also correct a few things to support this pseudo real mode on book3e.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/kernel/head_64.S |9 ---
 arch/powerpc/kernel/misc_64.S |   55 -
 2 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 27cfbcd..fd42f8a 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -447,12 +447,12 @@ _STATIC(__after_prom_start)
tovirt(r3,r3)   /* on booke, we already run at 
PAGE_OFFSET */
 #endif
mr. r4,r26  /* In some cases the loader may  */
+#if defined(CONFIG_PPC_BOOK3E)
+   tovirt(r4,r4)
+#endif
beq 9f  /* have already put us at zero */
li  r6,0x100/* Start offset, the first 0x100 */
/* bytes were copied earlier.*/
-#ifdef CONFIG_PPC_BOOK3E
-   tovirt(r6,r6)   /* on booke, we already run at 
PAGE_OFFSET */
-#endif
 
 #ifdef CONFIG_RELOCATABLE
 /*
@@ -495,6 +495,9 @@ _STATIC(__after_prom_start)
 p_end: .llong  _end - _stext
 
 4: /* Now copy the rest of the kernel up to _end */
+#if defined(CONFIG_PPC_BOOK3E)
+   tovirt(r26,r26)
+#endif
addis   r5,r26,(p_end - _stext)@ha
ld  r5,(p_end - _stext)@l(r5)   /* get _end */
 5: bl  .copy_and_flush /* copy the rest */
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 2d3fd07..049be29 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -480,6 +480,49 @@ kexec_flag:
 
 
 #ifdef CONFIG_KEXEC
+#ifdef CONFIG_PPC_BOOK3E
+/* Book3E has no real MMU mode, so we have to set up the initial TLB
+ * for a core to map v:0 to p:0 as 1:1. This current implementation
+ * assumes that 1G is enough for kexec.
+ */
+#include 
+kexec_create_tlb:
+   /* Invalidate all TLBs to avoid any TLB conflict. */
+   PPC_TLBILX_ALL(0,R0)
+   sync
+   isync
+
+   mfspr   r10,SPRN_TLB1CFG
+   andi.   r10,r10,TLBnCFG_N_ENTRY /* Extract # entries */
+   subi    r10,r10,1   /* It is usually safe to use the last entry */
+   lis r9,MAS0_TLBSEL(1)@h
+   rlwimi  r9,r10,16,4,15  /* Setup MAS0 = TLBSEL | ESEL(r10) */
+
+/* Setup a temp mapping v:0 to p:0 as 1:1 and return to it.
+ */
+#ifdef CONFIG_SMP
+#define M_IF_SMP   MAS2_M
+#else
+#define M_IF_SMP   0
+#endif
+   mtspr   SPRN_MAS0,r9
+
+   lis r9,(MAS1_VALID|MAS1_IPROT)@h
+   ori r9,r9,(MAS1_TSIZE(BOOK3E_PAGESZ_1GB))@l
+   mtspr   SPRN_MAS1,r9
+
+   LOAD_REG_IMMEDIATE(r9, 0x0 | M_IF_SMP)
+   mtspr   SPRN_MAS2,r9
+
+   LOAD_REG_IMMEDIATE(r9, 0x0 | MAS3_SR | MAS3_SW | MAS3_SX)
+   mtspr   SPRN_MAS3,r9
+   li  r9,0
+   mtspr   SPRN_MAS7,r9
+
+   tlbwe
+   isync
+   blr
+#endif
 
 /* kexec_smp_wait(void)
  *
@@ -493,6 +536,10 @@ kexec_flag:
  */
 _GLOBAL(kexec_smp_wait)
lhz r3,PACAHWCPUID(r13)
+#ifdef CONFIG_PPC_BOOK3E
+   /* Create a 1:1 mapping. */
+   bl  kexec_create_tlb
+#endif
bl  real_mode
 
li  r4,KEXEC_STATE_REAL_MODE
@@ -509,6 +556,7 @@ _GLOBAL(kexec_smp_wait)
  * don't overwrite r3 here, it is live for kexec_wait above.
  */
 real_mode: /* assume normal blr return */
+#ifndef CONFIG_PPC_BOOK3E
 1: li  r9,MSR_RI
li  r10,MSR_DR|MSR_IR
mflr    r11 /* return address to SRR0 */
@@ -520,7 +568,10 @@ real_mode: /* assume normal blr return */
mtspr   SPRN_SRR1,r10
mtspr   SPRN_SRR0,r11
rfid
-
+#else
+   /* the real mode is nothing for book3e. */
+   blr
+#endif
 
 /*
  * kexec_sequence(newstack, start, image, control, clear_all())
@@ -569,6 +620,8 @@ _GLOBAL(kexec_sequence)
mtmsrd  r3,1
 #else
wrteei  0
+   /* Create a 1:1 mapping. */
+   bl  kexec_create_tlb
 #endif
 
/* copy dest pages, flush whole dest image */
-- 
1.7.9.5
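
As a rough illustration of how kexec_create_tlb in this patch builds MAS0
(select TLB1, then pick its last entry), here is the same computation in C.
The mask and shift values are written from memory of the kernel's Book3E
MMU header and should be treated as assumptions, and
mas0_for_last_tlb1_entry() is a hypothetical helper name.

```c
#include <stdint.h>

/* Field layouts assumed to match the kernel's Book3E definitions. */
#define TLBnCFG_N_ENTRY  0x00000fffu
#define MAS0_TLBSEL(x)   (((uint32_t)(x) << 28) & 0x30000000u)
#define MAS0_ESEL(x)     (((uint32_t)(x) << 16) & 0x0fff0000u)

/* Compute MAS0 for the last entry of TLB1, given the TLB1CFG value. */
static uint32_t mas0_for_last_tlb1_entry(uint32_t tlb1cfg)
{
	uint32_t nentries = tlb1cfg & TLBnCFG_N_ENTRY; /* andi. r10,r10,TLBnCFG_N_ENTRY */
	uint32_t last = nentries - 1;                  /* subi  r10,r10,1 */

	/* lis r9,MAS0_TLBSEL(1)@h followed by rlwimi r9,r10,16,4,15 */
	return MAS0_TLBSEL(1) | MAS0_ESEL(last);
}
```

For a TLB1 with 64 entries this selects entry 63, giving 0x103f0000 under
the assumed field layout.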


[PATCH] staging: rtl8192e: Remove pt_regs * irq handler parameter

2013-08-25 Thread navin patidar

The struct pt_regs pointer is no longer passed as an irq handler
argument, so drop it from rtl8192_interrupt(). Also remove the
unnecessary macros.

Signed-off-by: navin patidar 
---
 drivers/staging/rtl8192e/rtl8192e/rtl_core.c |5 +++--
 drivers/staging/rtl8192e/rtl8192e/rtl_core.h |6 --
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c 
b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
index 2b6c61c..d952a34 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
@@ -94,6 +94,7 @@ MODULE_DEVICE_TABLE(pci, rtl8192_pci_id_tbl);
 static int rtl8192_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *id);
 static void rtl8192_pci_disconnect(struct pci_dev *pdev);
+static irqreturn_t rtl8192_interrupt(int irq, void *netdev);
 
 static struct pci_driver rtl8192_pci_driver = {
.name = DRV_NAME,   /* Driver name   */
@@ -1324,7 +1325,7 @@ static short rtl8192_init(struct net_device *dev)
(unsigned long)dev);
 
rtl8192_irq_disable(dev);
-   if (request_irq(dev->irq, (void *)rtl8192_interrupt_rsl, IRQF_SHARED,
+   if (request_irq(dev->irq, rtl8192_interrupt, IRQF_SHARED,
dev->name, dev)) {
printk(KERN_ERR "Error allocating IRQ %d", dev->irq);
return -1;
@@ -2704,7 +2705,7 @@ out:
 }
 
 
-irqreturn_type rtl8192_interrupt(int irq, void *netdev, struct pt_regs *regs)
+irqreturn_t rtl8192_interrupt(int irq, void *netdev)
 {
struct net_device *dev = (struct net_device *) netdev;
struct r8192_priv *priv = (struct r8192_priv *)rtllib_priv(dev);
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_core.h 
b/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
index 87d4d34..9d7cb0e 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.h
@@ -88,10 +88,6 @@
.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID , \
.driver_data = (kernel_ulong_t)&(cfg)
 
-#define irqreturn_type irqreturn_t
-
-#define rtl8192_interrupt(x, y, z) rtl8192_interrupt_rsl(x, y)
-
 #define RTL_MAX_SCAN_SIZE 128
 
 #define RTL_RATE_MAX   30
@@ -1044,8 +1040,6 @@ void rtl8192_set_chan(struct net_device *dev, short ch);
 void check_rfctrl_gpio_timer(unsigned long data);
 
 void rtl8192_hw_wakeup_wq(void *data);
-irqreturn_type rtl8192_interrupt(int irq, void *netdev, struct pt_regs *regs);
-
 short rtl8192_pci_initdescring(struct net_device *dev);
 
 void rtl8192_cancel_deferred_work(struct r8192_priv *priv);
-- 
1.7.10.4
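
To show why the (void *) cast in the old request_irq() call could go away,
here is a small user-space mock of the two-argument handler prototype. The
types below only imitate the kernel's irqreturn_t and irq_handler_t;
struct fake_netdev, demo_interrupt() and fire_irq() are hypothetical names
used for this sketch and are not part of the driver.

```c
#include <stddef.h>

/* User-space imitations of the kernel types (assumed shapes). */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

struct fake_netdev {
	int irq;
	unsigned int hits;
};

/* Handler with the two-argument signature; no struct pt_regs parameter. */
static irqreturn_t demo_interrupt(int irq, void *netdev)
{
	struct fake_netdev *dev = netdev;

	if (dev == NULL || dev->irq != irq)
		return IRQ_NONE;   /* not our interrupt (shared line) */
	dev->hits++;
	return IRQ_HANDLED;
}

/* Mock dispatcher: the handler is passed without any cast because its
 * prototype matches irq_handler_t exactly. */
static irqreturn_t fire_irq(irq_handler_t handler, int irq, void *dev_id)
{
	return handler(irq, dev_id);
}
```

Because the prototype matches the function pointer type, the compiler can
check the handler at the call site, which a (void *) cast would hide.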


Re: [PATCH 03/10] sched: Clean-up struct sd_lb_stat

2013-08-25 Thread Lei Wen

On Tue, Aug 20, 2013 at 12:01 AM, Peter Zijlstra  wrote:
> From: Joonsoo Kim 
>
> There is no reason to maintain separate variables for this_group
> and busiest_group in sd_lb_stat, except saving some space.
> But this structure is always allocated in stack, so this saving
> isn't really benificial [peterz: reducing stack space is good; in this
> case readability increases enough that I think its still beneficial]
>
> This patch unify these variables, so IMO, readability may be improved.
>
> Signed-off-by: Joonsoo Kim 
> [peterz: lots of style edits, a few fixes and a rename]
> Signed-off-by: Peter Zijlstra 
> Link: 
> http://lkml.kernel.org/r/1375778203-31343-4-git-send-email-iamjoonsoo@lge.com
> ---
>  kernel/sched/fair.c |  225 
> +---
>  1 file changed, 112 insertions(+), 113 deletions(-)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4277,36 +4277,6 @@ static unsigned long task_h_load(struct
>
[snip]...
> -   env->imbalance = DIV_ROUND_CLOSEST(
> -   sds->max_load * sds->busiest->sgp->power, SCHED_POWER_SCALE);
> +   env->imbalance = DIV_ROUND_CLOSEST(sds->busiest_stat.avg_load *
> +   sds->busiest->sgp->power, SCHED_POWER_SCALE);
>

I am wondering whether it would be more appropriate to change this line as
below, since it would avoid the division here:
   env->imbalance = (sds->busiest_stat.avg_load * sds->busiest->sgp->power)
  >> SCHED_POWER_SHIFT;

I am not sure whether the compiler would be smart enough to convert the
division into a >> operation when it sees that SCHED_POWER_SCALE is 1024
here.

Thanks,
Lei
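
One point worth checking for the suggestion above is that the two forms
round differently. The sketch below uses a simplified, unsigned-only
stand-in for DIV_ROUND_CLOSEST (the real kernel macro is more general), and
the helper names div_round_closest_ul() and scale_by_shift() are
hypothetical.

```c
#define SCHED_POWER_SHIFT 10
#define SCHED_POWER_SCALE (1UL << SCHED_POWER_SHIFT)

/* Simplified unsigned version: add half the divisor, then divide. */
static inline unsigned long div_round_closest_ul(unsigned long x,
						 unsigned long d)
{
	return (x + d / 2) / d;
}

/* The proposed replacement: a plain right shift, which truncates. */
static inline unsigned long scale_by_shift(unsigned long x)
{
	return x >> SCHED_POWER_SHIFT;
}
```

For a product of 1536 the rounding form yields 2 while the shift yields 1.
Optimizing compilers normally turn an unsigned division by a constant power
of two into a shift on their own, so the remaining difference between the
two expressions is the rounding step.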

RE: [PATCH v2] pinctrl: Pass all configs to driver on pin_config_set()

2013-08-25 Thread Sherman Yin

Hi Linus,

>Didn't you get review from Stephen Warren?

Yes, just wasn't sure when those tags should be added.  They have been 
added to v3 now.

>Please try to put all the maintainers for the above files on the To: line
>so they get a chance to review/ack the patch.

Ok.  I've added the emails from get_maintainer.pl for each of the files.
v3 has been rebased today and I also applied the API changes to 
pinctrl-palmas.c

Regards,
Sherman


[PATCH 1/1] ipv6:remove ipv6 global address after the interface is down

2013-08-25 Thread zhuyj


2.6.34.x kernels require a logic change similar to the one that commit
73a8bd74 [ipv6:Revert 'administrative down' address handling changes]
introduces for newer kernels.

In 2.6.34.x kernels, when an interface with an ipv6 global address is
restarted, the ipv6 route item disappears, but the ipv6 global address
still remains. In kernel versions 3.4+, by contrast, the ipv6 global
address and the route item both disappear. To be consistent with
kernel versions 3.4+, we remove the ipv6 address from an interface that
is shut down.

Signed-off-by: yanjun.zhu
---
 net/ipv6/addrconf.c |   68 +-
 1 files changed, 13 insertions(+), 55 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 018f431..bcb9d24 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2652,10 +2652,9 @@ static void addrconf_bonding_change(struct net_device 
*dev, unsigned long event)
 static int addrconf_ifdown(struct net_device *dev, int how)
 {
struct inet6_dev *idev;
-   struct inet6_ifaddr *ifa, *keep_list, **bifa;
+   struct inet6_ifaddr *ifa, **bifa;
struct net *net = dev_net(dev);
-   int state;
-   int i;
+   int state, i;
 
 	ASSERT_RTNL();
 
@@ -2686,9 +2685,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 
 		write_lock_bh(&addrconf_hash_lock);
while ((ifa = *bifa) != NULL) {
-   if (ifa->idev == idev &&
-   (how || !(ifa->flags&IFA_F_PERMANENT) ||
-ipv6_addr_type(&ifa->addr) & IPV6_ADDR_LINKLOCAL)) {
+   if (ifa->idev == idev){
*bifa = ifa->lst_next;
ifa->lst_next = NULL;
__in6_ifa_put(ifa);
@@ -2726,65 +2723,27 @@ static int addrconf_ifdown(struct net_device *dev, int 
how)
write_lock_bh(&idev->lock);
}
 #endif
-   keep_list = NULL;
-   bifa = &keep_list;
while ((ifa = idev->addr_list) != NULL) {
idev->addr_list = ifa->if_next;
ifa->if_next = NULL;
-
addrconf_del_timer(ifa);
 
-   /* If just doing link down, and address is permanent
-  and not link-local, then retain it. */
-   if (how == 0 &&
-   (ifa->flags&IFA_F_PERMANENT) &&
-   !(ipv6_addr_type(&ifa->addr) & IPV6_ADDR_LINKLOCAL)) {
-
-   /* Move to holding list */
-   *bifa = ifa;
-   bifa = &ifa->if_next;
-
-   /* If not doing DAD on this address, just keep it. */
-   if ((dev->flags&(IFF_NOARP|IFF_LOOPBACK)) ||
-   idev->cnf.accept_dad <= 0 ||
-   (ifa->flags & IFA_F_NODAD))
-   continue;
-
-   /* If it was tentative already, no need to notify */
-   if (ifa->flags & IFA_F_TENTATIVE)
-   continue;
-
-   /* Flag it for later restoration when link comes up */
-   ifa->flags |= IFA_F_TENTATIVE;
-   ifa->state = INET6_IFADDR_STATE_DAD;
-
-   write_unlock_bh(&idev->lock);
-
-   in6_ifa_hold(ifa);
-   } else {
-   write_unlock_bh(&idev->lock);
-   spin_lock_bh(&ifa->state_lock);
-   state = ifa->state;
-   ifa->state = INET6_IFADDR_STATE_DEAD;
-   spin_unlock_bh(&ifa->state_lock);
-
-   if (state == INET6_IFADDR_STATE_DEAD)
-   goto put_ifa;
+   write_unlock_bh(&idev->lock);
+   spin_lock_bh(&ifa->state_lock);
+   state = ifa->state;
+   ifa->state = INET6_IFADDR_STATE_DEAD;
+   spin_unlock_bh(&ifa->state_lock);
+
+   if (state != INET6_IFADDR_STATE_DEAD){
+   __ipv6_ifa_notify(RTM_DELADDR, ifa);
+   atomic_notifier_call_chain(&inet6addr_chain, NETDEV_DOWN, ifa);
}
 
-   __ipv6_ifa_notify(RTM_DELADDR, ifa);
-   if (ifa->state == INET6_IFADDR_STATE_DEAD)
-   atomic_notifier_call_chain(&inet6addr_chain,
-  NETDEV_DOWN, ifa);
-
-put_ifa:
in6_ifa_put(ifa);
 
write_lock_bh(&idev->lock);
}
 
-   idev->addr_list = keep_list;
-
write_unlock_bh(&idev->lock);
 
 	/* Step 5: Discard multicast list */
@@ -4103,8 +4062,7 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
addrconf_leave_solict(ifp->idev, &ifp->addr);
dst_hold(&ifp->rt->u.dst);
 
-   if (ifp->state == INET6_IFADDR_STATE_DEAD &&
-   ip6_del_rt(ifp->rt))
+   if (ip6_del_rt(ifp->rt))
dst_free(&ifp->rt->u.dst);
break;
}


On 08/26/2013 10:28
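
The hash-table loop touched by the patch above walks the list through a
pointer to the link field (bifa) so that entries can be unlinked without
tracking a separate "previous" node. A minimal stand-alone sketch of that
idiom follows; struct node and unlink_owned() are hypothetical names, and
locking and reference counting are deliberately left out.

```c
#include <stddef.h>

struct node {
	int owner;          /* plays the role of ifa->idev */
	struct node *next;  /* plays the role of ifa->lst_next */
};

/* Unlink every node that belongs to 'owner'. Returns how many nodes were
 * removed. The nodes themselves are not freed here. */
static int unlink_owned(struct node **head, int owner)
{
	struct node **link = head;  /* address of the pointer to patch */
	struct node *n;
	int removed = 0;

	while ((n = *link) != NULL) {
		if (n->owner == owner) {
			*link = n->next;  /* splice the node out */
			n->next = NULL;
			removed++;
			continue;         /* *link already names the next node */
		}
		link = &n->next;          /* advance to the next link field */
	}
	return removed;
}
```

With the extra conditions removed by the patch, the kernel loop reduces to
this shape: every address that belongs to the interface is unlinked.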

Re: [PATCH v3 2/2] media: i2c: adv7343: add OF support

2013-08-25 Thread Prabhakar Lad

Hi Sylwester,

On Fri, Aug 23, 2013 at 11:33 PM, Sylwester Nawrocki
 wrote:
> Cc: DT binding maintainers
>
> On 07/20/2013 08:21 AM, Lad, Prabhakar wrote:
>> From: "Lad, Prabhakar" 
>>
>> add OF support for the adv7343 driver.
>>
>> Signed-off-by: Lad, Prabhakar 
>> ---
> [...]
>>  .../devicetree/bindings/media/i2c/adv7343.txt  |   48 
>> 
>>  drivers/media/i2c/adv7343.c|   46 
>> ++-
>>  2 files changed, 93 insertions(+), 1 deletion(-)
>>  create mode 100644 Documentation/devicetree/bindings/media/i2c/adv7343.txt
>>
>> diff --git a/Documentation/devicetree/bindings/media/i2c/adv7343.txt 
>> b/Documentation/devicetree/bindings/media/i2c/adv7343.txt
>> new file mode 100644
>> index 000..5653bc2
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/media/i2c/adv7343.txt
>> @@ -0,0 +1,48 @@
>> +* Analog Devices adv7343 video encoder
>> +
>> +The ADV7343 are high speed, digital-to-analog video encoders in a 64-lead 
>> LQFP
>> +package. Six high speed, 3.3 V, 11-bit video DACs provide support for 
>> composite
>> +(CVBS), S-Video (Y-C), and component (YPrPb/RGB) analog outputs in standard
>> +definition (SD), enhanced definition (ED), or high definition (HD) video
>> +formats.
>> +
>> +Required Properties :
>> +- compatible: Must be "adi,adv7343"
>> +
>> +Optional Properties :
>> +- adi,power-mode-sleep-mode: on enable the current consumption is reduced to
>> +   micro ampere level. All DACs and the internal PLL
>> +   circuit are disabled.
>
> Sorry for getting back so late to this. I realize this is already queued in
> the media tree. But this binding doesn't look good enough to me. I think it
> will need to be corrected during upcoming -rc period.
>
Thanks for the catch :-)

> It might be hard to figure out only from the chip's datasheet what
> adi,power-mode-sleep-mode really refers to. AFAICS it is for assigning some
> value to a specific register. If we really need to specify register values
> in the device tree then it would probably make sense to describe to which
> register this apply. Now the name looks like derived from some structure
> member name in the Linux driver of the device.
>
The properties are derived from the datasheet itself [1], for example:
'adi,power-mode-sleep-mode' --> Register 0x0 power mode bit 0
'adi,power-mode-pll-ctrl'   --> Register 0x0 power mode bit 1
'adi,dac-enable'            --> Register 0x0 power mode bits 2-7
'adi,sd-dac-enable'         --> Register 0x82 SD mode register bits 1-2

[1] http://www.analog.com/static/imported-files/data_sheets/ADV7342_7343.pdf

>> +- adi,power-mode-pll-ctrl: PLL and oversampling control. This control allows
>> +internal PLL 1 circuit to be powered down and the
>> +oversampling to be switched off.
>
> Similar comments applies to this property.
>
>> +- ad,adv7343-power-mode-dac: array configuring the power on/off DAC's 1..6,
>> +   0 = OFF and 1 = ON, Default value when this
>> +   property is not specified is <0 0 0 0 0 0>.
>
> Name of the property is incorrect here. It has changed to "adi,dac-enable".
>
OK

>> +- ad,adv7343-sd-config-dac-out: array configure SD DAC Output's 1 and 2, 0 
>> = OFF
>> +  and 1 = ON, Default value when this property 
>> is
>> +  not specified is <0 0>.
>
> Similarly, "adi,sd-dac-enable.
>
OK

Regards,
--Prabhakar Lad
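
The property-to-register mapping listed in this reply can be sketched as
plain bit packing. This is an illustration only: the macro and function
names are hypothetical, and the order in which DAC 1..6 land in bits 2-7
(and SD DAC 1..2 in bits 1-2) is an assumption that would need to be
checked against the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

#define POWER_MODE_REG  0x00  /* register address given in the reply */
#define SD_MODE_REG     0x82  /* register address given in the reply */

/* Pack the three power-mode properties into one register value. */
static uint8_t pack_power_mode(bool sleep_mode, bool pll_ctrl,
			       const uint8_t dac[6])
{
	uint8_t val = 0;

	if (sleep_mode)
		val |= (uint8_t)(1u << 0);   /* adi,power-mode-sleep-mode */
	if (pll_ctrl)
		val |= (uint8_t)(1u << 1);   /* adi,power-mode-pll-ctrl */
	for (int i = 0; i < 6; i++)          /* adi,dac-enable, assumed order */
		if (dac[i])
			val |= (uint8_t)(1u << (2 + i));
	return val;
}

/* Pack adi,sd-dac-enable into bits 1-2 of the SD mode register value. */
static uint8_t pack_sd_dac(const uint8_t sd_dac[2])
{
	uint8_t val = 0;

	for (int i = 0; i < 2; i++)
		if (sd_dac[i])
			val |= (uint8_t)(1u << (1 + i));
	return val;
}
```

A driver would write the first value to POWER_MODE_REG and merge the second
into SD_MODE_REG, leaving the other bits of that register untouched.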

Re: [Patch v2 3/3] s390/kprobes: add support for pc-relative long displacement instructions

2013-08-25 Thread Masami Hiramatsu

(2013/08/23 20:04), Heiko Carstens wrote:
> With the general-instruction extension facility (z10) a couple of
> instructions with a pc-relative long displacement were introduced.
> The kprobes support for these instructions however was never implemented.
> 
> In result, if anybody ever put a probe on any of these instructions the
> result would have been random behaviour after the instruction got executed
> within the insn slot.
> 
> So lets add the missing handling for these instructions. Since all of the
> new instructions have 32 bit signed displacement the easiest solution is
> to allocate an insn slot that is within the same 2GB area like the original
> instruction and patch the displacement field.
> 

At least the generic kprobes flow looks good to me.

Reviewed-by: Masami Hiramatsu 
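
For reference, the displacement patching that the quoted commit message
describes (copy the instruction into a slot within the same 2GB area and
adjust its pc-relative field) comes down to the arithmetic sketched below.
It assumes the usual RIL encoding where the signed 32-bit displacement
counts halfwords; rebase_ril_disp() is a hypothetical helper name, not a
function from the patch.

```c
#include <stdint.h>

/* Return the displacement that makes a copy of the instruction at
 * slot_addr reference the same target as the original at orig_addr. */
static int32_t rebase_ril_disp(uint64_t orig_addr, int32_t disp,
			       uint64_t slot_addr)
{
	/* Absolute target in bytes: displacement is in halfwords. */
	int64_t target = (int64_t)orig_addr + (int64_t)disp * 2;

	/* Both addresses are halfword aligned, so the division is exact.
	 * The result only fits in 32 bits if the slot is close enough,
	 * which is why the slot must lie in the same 2GB area. */
	return (int32_t)((target - (int64_t)slot_addr) / 2);
}
```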

> Signed-off-by: Heiko Carstens 
> ---
>  arch/s390/include/asm/kprobes.h |4 +-
>  arch/s390/kernel/kprobes.c  |  144 
> +--
>  2 files changed, 140 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/s390/include/asm/kprobes.h b/arch/s390/include/asm/kprobes.h
> index dcf6948..4176dfe 100644
> --- a/arch/s390/include/asm/kprobes.h
> +++ b/arch/s390/include/asm/kprobes.h
> @@ -31,6 +31,8 @@
>  #include 
>  #include 
>  
> +#define __ARCH_WANT_KPROBES_INSN_SLOT
> +
>  struct pt_regs;
>  struct kprobe;
>  
> @@ -57,7 +59,7 @@ typedef u16 kprobe_opcode_t;
>  /* Architecture specific copy of original instruction */
>  struct arch_specific_insn {
>   /* copy of original instruction */
> - kprobe_opcode_t insn[MAX_INSN_SIZE];
> + kprobe_opcode_t *insn;
>  };
>  
>  struct prev_kprobe {
> diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c
> index 3388b2b..cb7ac9e 100644
> --- a/arch/s390/kernel/kprobes.c
> +++ b/arch/s390/kernel/kprobes.c
> @@ -37,6 +37,26 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
>  
>  struct kretprobe_blackpoint kretprobe_blacklist[] = { };
>  
> +DEFINE_INSN_CACHE_OPS(dmainsn);
> +
> +static void *alloc_dmainsn_page(void)
> +{
> + return (void *)__get_free_page(GFP_KERNEL | GFP_DMA);
> +}
> +
> +static void free_dmainsn_page(void *page)
> +{
> + free_page((unsigned long)page);
> +}
> +
> +struct kprobe_insn_cache kprobe_dmainsn_slots = {
> + .mutex = __MUTEX_INITIALIZER(kprobe_dmainsn_slots.mutex),
> + .alloc = alloc_dmainsn_page,
> + .free = free_dmainsn_page,
> + .pages = LIST_HEAD_INIT(kprobe_dmainsn_slots.pages),
> + .insn_size = MAX_INSN_SIZE,
> +};
> +
>  static int __kprobes is_prohibited_opcode(kprobe_opcode_t *insn)
>  {
>   switch (insn[0] >> 8) {
> @@ -100,9 +120,8 @@ static int __kprobes get_fixup_type(kprobe_opcode_t *insn)
>   fixup |= FIXUP_RETURN_REGISTER;
>   break;
>   case 0xc0:
> - if ((insn[0] & 0x0f) == 0x00 || /* larl  */
> - (insn[0] & 0x0f) == 0x05)   /* brasl */
> - fixup |= FIXUP_RETURN_REGISTER;
> + if ((insn[0] & 0x0f) == 0x05)   /* brasl */
> + fixup |= FIXUP_RETURN_REGISTER;
>   break;
>   case 0xeb:
>   if ((insn[2] & 0xff) == 0x44 || /* bxhg  */
> @@ -117,18 +136,128 @@ static int __kprobes get_fixup_type(kprobe_opcode_t 
> *insn)
>   return fixup;
>  }
>  
> +static int __kprobes is_insn_relative_long(kprobe_opcode_t *insn)
> +{
> + /* Check if we have a RIL-b or RIL-c format instruction which
> +  * we need to modify in order to avoid instruction emulation. */
> + switch (insn[0] >> 8) {
> + case 0xc0:
> + if ((insn[0] & 0x0f) == 0x00) /* larl */
> + return true;
> + break;
> + case 0xc4:
> + switch (insn[0] & 0x0f) {
> + case 0x02: /* llhrl  */
> + case 0x04: /* lghrl  */
> + case 0x05: /* lhrl   */
> + case 0x06: /* llghrl */
> + case 0x07: /* sthrl  */
> + case 0x08: /* lgrl   */
> + case 0x0b: /* stgrl  */
> + case 0x0c: /* lgfrl  */
> + case 0x0d: /* lrl*/
> + case 0x0e: /* llgfrl */
> + case 0x0f: /* strl   */
> + return true;
> + }
> + break;
> + case 0xc6:
> + switch (insn[0] & 0x0f) {
> + case 0x00: /* exrl   */
> + case 0x02: /* pfdrl  */
> + case 0x04: /* cghrl  */
> + case 0x05: /* chrl   */
> + case 0x06: /* clghrl */
> + case 0x07: /* clhrl  */
> + case 0x08: /* cgrl   */
> + case 0x0a: /* clgrl  */
> + case 0x0c: /* cgfrl  */
> + case 0x0d: /* crl*/
> + case 0x0e: /* clgfrl */
> + case 0x0f: /* clrl   */
> + return true;
> + }
> + break;
> + }
> + return false;
> +}
> +
> +static void __kprobes copy_instruction(struct kprobe *p)
> +{
> +

Re: [PATCH 5/8] rcu: eliminate deadlock for rcu read site

2013-08-25 Thread Lai Jiangshan

On 08/26/2013 01:43 AM, Paul E. McKenney wrote:
> On Sun, Aug 25, 2013 at 11:19:37PM +0800, Lai Jiangshan wrote:
>> Hi, Steven
>>
>> Any comments about this patch?
> 
> For whatever it is worth, it ran without incident for two hours worth
> of rcutorture on my P5 test (boosting but no CPU hotplug).
> 
> Lai, do you have a specific test for this patch?  

Also rcutorture.
(A special module is added to ensure all paths of my code are covered.)

> Your deadlock
> scenario looks plausible, but is apparently not occurring in the
> mainline kernel.

Yes, you can leave this possible bug until the real problem happens,
or just disallow overlapping.
I can write some debug code for it that allows us to find
the problems earlier.

I guess this is a useful usage pattern of rcu:

again:
	rcu_read_lock();
	obj = rcu_dereference(ptr);
	spin_lock_XX(obj->lock);
	if (obj is invalid) {
		spin_unlock_XX(obj->lock);
		rcu_read_unlock();
		goto again;
	}
	rcu_read_unlock();
	/* use obj */
	spin_unlock_XX(obj->lock);

If we encourage this pattern, we should fix all the related problems.

Thanks,
Lai

> 
>   Thanx, Paul
> 
>> Thanks,
>> Lai
>>
>>
>> On Fri, Aug 23, 2013 at 2:26 PM, Lai Jiangshan  wrote:
>>
>>> [PATCH] rcu/rt_mutex: eliminate a kind of deadlock for rcu read site
>>>
>>> The current rtmutex lock->wait_lock disables neither softirq nor irq, so it
>>> can deadlock an rcu read site when the rcu read-side critical section
>>> overlaps with any softirq-context/irq-context lock.
>>>
>>> @L is a spinlock of softirq or irq context.
>>>
>>> CPU1                                   cpu2 (rcu boost)
>>> rcu_read_lock()                        rt_mutex_lock()
>>>                                          raw_spin_lock(lock->wait_lock)
>>> spin_lock_XX(L)                        <... irq>
>>> rcu_read_unlock()                      do_softirq()
>>>   rcu_read_unlock_special()
>>>     rt_mutex_unlock()
>>>       raw_spin_lock(lock->wait_lock)   spin_lock_XX(L)  **DEADLOCK**
>>>
>>> This patch fixes this kind of deadlock by removing rt_mutex_unlock() from
>>> rcu_read_unlock(); the new rt_mutex_rcu_deboost_unlock() is called instead.
>>> Thus the rtmutex's lock->wait_lock is no longer taken from rcu_read_unlock().
>>>
>>> This patch does not eliminate every kind of rcu-read-site deadlock: if @L
>>> is a scheduler lock, it will still deadlock, and we should apply Paul's rule
>>> in that case (avoid overlapping, or use preempt_disable()).
>>>
>>> rt_mutex_rcu_deboost_unlock() requires that the @waiter is queued, so we
>>> can't directly call rt_mutex_lock() in the rcu_boost thread;
>>> we split rt_mutex_lock() into two steps, just like pi-futex.
>>> This results in an internal state in the rcu_boost thread and makes
>>> the rcu_boost thread a bit more complicated.
>>>
>>> Thanks
>>> Lai
>>>
>>> diff --git a/include/linux/init_task.h b/include/linux/init_task.h
>>> index 5cd0f09..8830874 100644
>>> --- a/include/linux/init_task.h
>>> +++ b/include/linux/init_task.h
>>> @@ -102,7 +102,7 @@ extern struct group_info init_groups;
>>>
>>>  #ifdef CONFIG_RCU_BOOST
>>>  #define INIT_TASK_RCU_BOOST()  \
>>> -   .rcu_boost_mutex = NULL,
>>> +   .rcu_boost_waiter = NULL,
>>>  #else
>>>  #define INIT_TASK_RCU_BOOST()
>>>  #endif
>>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>>> index e9995eb..1eca99f 100644
>>> --- a/include/linux/sched.h
>>> +++ b/include/linux/sched.h
>>> @@ -1078,7 +1078,7 @@ struct task_struct {
>>> struct rcu_node *rcu_blocked_node;
>>>  #endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
>>>  #ifdef CONFIG_RCU_BOOST
>>> -   struct rt_mutex *rcu_boost_mutex;
>>> +   struct rt_mutex_waiter *rcu_boost_waiter;
>>>  #endif /* #ifdef CONFIG_RCU_BOOST */
>>>
>>>  #if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
>>> @@ -1723,7 +1723,7 @@ static inline void rcu_copy_process(struct
>>> task_struct *p)
>>> p->rcu_blocked_node = NULL;
>>>  #endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
>>>  #ifdef CONFIG_RCU_BOOST
>>> -   p->rcu_boost_mutex = NULL;
>>> +   p->rcu_boost_waiter = NULL;
>>>  #endif /* #ifdef CONFIG_RCU_BOOST */
>>> INIT_LIST_HEAD(&p->rcu_node_entry);
>>>  }
>>> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
>>> index 769e12e..d207ddd 100644
>>> --- a/kernel/rcutree_plugin.h
>>> +++ b/kernel/rcutree_plugin.h
>>> @@ -33,6 +33,7 @@
>>>  #define RCU_KTHREAD_PRIO 1
>>>
>>>  #ifdef CONFIG_RCU_BOOST
>>> +#include "rtmutex_common.h"
>>>  #define RCU_BOOST_PRIO CONFIG_RCU_BOOST_PRIO
>>>  #else
>>>  #define RCU_BOOST_PRIO RCU_KTHREAD_PRIO
>>> @@ -340,7 +341,7 @@ void rcu_read_unlock_special(struct task_struct *t)
>>> unsigned long flags;
>>> struct list_head *np;
>>>  #ifdef CONFIG_RCU_BOOST
>>> -   struct rt_mutex *rbmp = NULL;
>>> +   struct rt_mutex_waiter *waiter = NULL;
>>>  #endif /* #ifdef CONFIG_RCU_BOOST */
>>>

Re: Re: [PATCH] USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support

2013-08-25 Thread liujunliang_ljl

Dear Joe:

I'm sorry to ask, but do you need me to merge the
patch and re-send it?

Also, in which kernel version will this driver be released?

Thanks a lot, and apologies for the trouble.

Thanks again.


2013-08-26 



liujunliang_ljl 



From: Joe Perches
Sent: 2013-08-25  02:15:19
To: liujunliang_ljl
Cc: davem; horms; romieu; gregkh; netdev; linux-usb; linux-kernel; sunhecheng
Subject: Re: [PATCH] USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver
Support
 
Some whitespace and neatening fixups.
Some conversions from 4 indent tabs to normal tabs
Signed-off-by: Joe Perches 
---
Just doing this instead of commenting about spacing
again.
 drivers/net/usb/sr9700.c | 127 +--
 1 file changed, 67 insertions(+), 60 deletions(-)
diff --git a/drivers/net/usb/sr9700.c b/drivers/net/usb/sr9700.c
index 27c86ec..4262b9d 100644
--- a/drivers/net/usb/sr9700.c
+++ b/drivers/net/usb/sr9700.c
@@ -29,7 +29,7 @@ static int sr_read(struct usbnet *dev, u8 reg, u16 length, 
void *data)
  int err;

  err = usbnet_read_cmd(dev, SR_RD_REGS, SR_REQ_RD_REG,
- 0, reg, data, length);
+   0, reg, data, length);
  if ((err != length) && (err >= 0))
  err = -EINVAL;
  return err;
@@ -40,7 +40,7 @@ static int sr_write(struct usbnet *dev, u8 reg, u16 length, 
void *data)
  int err;

  err = usbnet_write_cmd(dev, SR_WR_REGS, SR_REQ_WR_REG,
- 0, reg, data, length);
+0, reg, data, length);
  if ((err >= 0) && (err < length))
  err = -EINVAL;
  return err;
@@ -54,19 +54,19 @@ static int sr_read_reg(struct usbnet *dev, u8 reg, u8 
*value)
 static int sr_write_reg(struct usbnet *dev, u8 reg, u8 value)
 {
  return usbnet_write_cmd(dev, SR_WR_REGS, SR_REQ_WR_REG,
- value, reg, NULL, 0);
+ value, reg, NULL, 0);
 }

 static void sr_write_async(struct usbnet *dev, u8 reg, u16 length, void *data)
 {
  usbnet_write_cmd_async(dev, SR_WR_REGS, SR_REQ_WR_REG,
- 0, reg, data, length);
+0, reg, data, length);
 }

 static void sr_write_reg_async(struct usbnet *dev, u8 reg, u8 value)
 {
  usbnet_write_cmd_async(dev, SR_WR_REGS, SR_REQ_WR_REG,
- value, reg, NULL, 0);
+value, reg, NULL, 0);
 }

 static int wait_phy_eeprom_ready(struct usbnet *dev, int phy)
@@ -89,7 +89,7 @@ static int wait_phy_eeprom_ready(struct usbnet *dev, int phy)

  if (i >= SR_SHARE_TIMEOUT) {
  netdev_err(dev->net, "%s write timed out!\n",
- phy ? "phy" : "eeprom");
+phy ? "phy" : "eeprom");
  ret = -EIO;
  goto out;
  }
@@ -98,7 +98,8 @@ out:
  return ret;
 }

-static int sr_share_read_word(struct usbnet *dev, int phy, u8 reg, __le16 
*value)
+static int sr_share_read_word(struct usbnet *dev, int phy, u8 reg,
+   __le16 *value)
 {
  int ret;

@@ -115,14 +116,15 @@ static int sr_share_read_word(struct usbnet *dev, int 
phy, u8 reg, __le16 *value
  ret = sr_read(dev, EPDR, 2, value);

  netdev_dbg(dev->net, "read shared %d 0x%02x returned 0x%04x, %d\n",
- phy, reg, *value, ret);
+phy, reg, *value, ret);

 out:
  mutex_unlock(&dev->phy_mutex);
  return ret;
 }

-static int sr_share_write_word(struct usbnet *dev, int phy, u8 reg, __le16 
value)
+static int sr_share_write_word(struct usbnet *dev, int phy, u8 reg,
+__le16 value)
 {
  int ret;

@@ -156,7 +158,8 @@ static int sr9700_get_eeprom_len(struct net_device *dev)
  return SR_EEPROM_LEN;
 }

-static int sr9700_get_eeprom(struct net_device *net, struct ethtool_eeprom 
*eeprom, u8 *data)
+static int sr9700_get_eeprom(struct net_device *net,
+  struct ethtool_eeprom *eeprom, u8 *data)
 {
  struct usbnet *dev = netdev_priv(net);
  __le16 *ebuf = (__le16 *)data;
@@ -168,7 +171,8 @@ static int sr9700_get_eeprom(struct net_device *net, struct 
ethtool_eeprom *eepr
  return -EINVAL;

  for (i = 0; i < eeprom->len / 2; i++)
- ret = sr_read_eeprom_word(dev, eeprom->offset / 2 + i, &ebuf[i]);
+ ret = sr_read_eeprom_word(dev, eeprom->offset / 2 + i,
+   &ebuf[i]);

  return ret;
 }
@@ -199,12 +203,13 @@ static int sr_mdio_read(struct net_device *netdev, int 
phy_id, int loc)
  res = le16_to_cpu(res) & ~BMSR_LSTATUS;

  netdev_dbg(dev->net, "sr_mdio_read() phy_id=0x%02x, loc=0x%02x, 
returns=0x%04x\n",
- phy_id, loc, res);
+phy_id, loc, res);

  return res;
 }

-static void sr_mdio_write(struct net_device *netdev, int phy_id, int loc, int 
val)
+static void sr_mdio_write(struct net_device *netdev, int phy_id, int loc,
+   int val)
 {
  struct usbnet *dev = netdev_priv(netdev);
  __le16 res = cpu_to_le16(val);
@@ -215,7 +220,7 @@ static void sr_mdio_write(struct net_device *netdev, int 
phy_id, int loc, int va
  }

  netdev_dbg(dev->net, "sr_mdio_write() phy_id=0x%02x, loc=0x%02x, 
val=0x%04x\n",
- phy_id, loc, val);
+phy_id, loc, val);

  sr_share_write_word(dev, 1, loc, res);
 }
@@ -242,15 +247,15 @@ static int sr9700_ioctl(struct net_device *net, struct 
ifreq *rq, int cmd)
 }

 static const struct ethtool_ops

Re: Re: Re: [PATCH] USB2NET : SR9700 : One chip USB 1.1 USB2NETSR9700Device Driver Support

2013-08-25 Thread liujunliang_ljl

Dear all:

Thanks a lot.


2013-08-26 



liujunliang_ljl 



From: Joe Perches
Sent: 2013-08-26  10:19:35
To: liujunliang_ljl
Cc: davem; horms; romieu; gregkh; netdev; linux-usb; linux-kernel; sunhecheng
Subject: Re: Re: [PATCH] USB2NET : SR9700 : One chip USB 1.1 USB2NETSR9700Device
Driver Support
 
On Mon, 2013-08-26 at 10:14 +0800, liujunliang_ljl wrote:
> do you need me to merge the patch and re-send it again.
I do not.
> And which version of kernel will release this driver.
No idea.
That's up to David Miller or Greg KH to pick
up the driver and maybe take the follow-on
patch I sent.
I just sent the patch because you seemed to
have a bit of difficulty integrating the
suggestions others were giving you.
> Thanks a lot and apologizing for making you trouble.
Oh, no worries.  It was a trifle.
cheers, Joe
.

ipv6 global address remains while route item disappears after this interface is restarted in 2.6.34.x

2013-08-25 Thread zhuyj


Setup: two directly connected targets, both running kernel 2.6.34.x.

TargetA ------------ TargetB
3000::1/64           3000::2/64

TargetA
   - bring the interface down by doing an "ifconfig eth1 down"
   - bring the interface back up by doing an "ifconfig eth1 up"

TargetB
   - ping6 3000::1
     ping6 succeeds the first time
   - after bringing the interface on TargetA down and then back up,
     ping6 to the interface fails.


The root cause is:
The IPv6 address 3000::1/64 remains while the IPv6 route on eth1 disappears,
so running "ping6 3000::1" on TargetB cannot succeed.
By comparison, in 3.4.x both the IPv6 address and the IPv6 route item are
removed when an interface is restarted.


Best Regards!

zhuyj

Re: [PATCH] kernel/rcutree.c: deem to be lazy if there are no callbacks.

2013-08-25 Thread Chen Gang F T


Firstly, thank you for your reply with these details. 

On 08/26/2013 03:18 AM, Paul E. McKenney wrote:
> On Thu, Aug 22, 2013 at 11:01:53AM +0800, Chen Gang wrote:
>> On 08/21/2013 10:23 PM, Paul E. McKenney wrote:
>>> On Wed, Aug 21, 2013 at 01:59:29PM +0800, Chen Gang wrote:
> 
> [ . . . ]
> 
>>> Don't get me wrong, I do welcome appropriate patches.  In fact, if
>>> you look at RCU's git history, you will see that I frequently accept
>>> patches from a fair number of people.  And if you were willing to
>>> invest some time and thought, you might eventually be able to generate
>>> an appropriate (albeit low priority) patch to this function.  However,
>>> you seem to be motivated to submit small patches with a minimum of
>>> thought and preparation, perhaps because you need to meet some external
>>> or self-imposed quota of accepted patches.  And if you are in fact driven
>>> by a quota that prevents you from taking the time required to carefully
>>> think things through, you are wasting your time with RCU.
>>
>> Hmm... at least, some contents you said above is correct to me.
>>
>> At least, I should provide 10 patches per month, it is a necessary
>> basic requirement to me.
> 
> OK, that does help explain the otherwise inexplicable approach you have
> been taking.  Let's see how you have been doing, based on committer date
> in Linus's tree:
> 
>   1 2012-11
>  15 2013-01
>   7 2013-02
>  20 2013-03
>  21 2013-04
>  12 2013-05
>  17 2013-06
>  10 2013-07
> 
> The last few months might be understated a bit due to patches
> still being in maintainer trees.  This is a nice contrast from my
> first impression of you from https://lkml.org/lkml/2013/6/9/64 and
> https://lkml.org/lkml/2013/8/19/650, neither of which gave me any
> reason to trust your work, to put it mildly.  And if I cannot trust
> your work, I obviously cannot accept your patches.
> 

Hmm... it is better to review patches independently of personal feelings
(whether or not you trust someone).

;-)


> You do seem to select for localized bug fixes, which require less work
> than the performance-motivated patches you were putting forward earlier
> in this thread.  With a localized bug, you demonstrate the bug, show the
> fix, and that is that.  From what I can see, part of the problem with
> your patches in this email thread is that you are trying to move from
> localized bug fixes to performance issues without doing the additional
> work required.  Please see below for a rough outline of this additional
> work.
> 

Hmm... it seems I need to describe my workflow for fixing bugs in detail.

  1. Is it a bug?
     If so, I can be marked as Reported-by and continue to step 2.
     Otherwise, it is a wasted mail.

  2. Try to fix it in a simple way (to save the maintainers' time).
     If the fix is accepted by the maintainers, that is OK (I can be Signed-off-by).
     Otherwise, continue to step 3.

     Exception: if I cannot find a simple way to fix it, I will send a
     [Suggestion] mail.

  3. Do the maintainers know how to fix it?
     If yes, fix it together with the maintainers (I may be marked only as Reported-by).
     Otherwise, continue to the last step.

  Last: I should analyze it and fix it (it is my duty to fix it).


How do you feel about this workflow? Any suggestions or additions are
welcome.

Thanks.

>> And what my focus is efficiency: let appliers and maintainers together
>> to provide contributes to outside with efficiency.
> 
> Sounds great, but there are many possible definitions of "efficiency".
> Given your quota, I would expect your definition to involve number of
> patches accepted.  In contrast, my definition for RCU instead involves
> maintainability, robustness, scalability, and, for a few critical
> code paths, performance.  I therefore need you to have thought through
> and carefully tested your patch.
> 

Hmm... it seems I need to describe in more detail the 'efficiency' I am
referring to.

As long as there is no negative effect on quality, we should try to use
fewer resources (e.g. time) to provide more contributions (e.g. fixed
issues).


>> If you already know about it, why need I continue ?  but if you don't
>> know either, I should try.
> 
> What I need you to do in future RCU performance patch submissions is:
> 
> 1. Think through your patch and the code that it is modifying.
>    If you submit a patch to me, you should be able to answer the
>    sorts of questions that I was asking in this thread.
> 
> 2. Tell me what situations your patch helps and not.
> 
> 3. Tell me how much your patch improves performance in the
>    situations where it helps.
> 
> 4. Test the code.  If it makes a measurable difference, present
>    the performance results.  (It would be very surprising if your
>    early-loop exit patch made a significant difference, especially
>    on a CONFIG_PREEMPT=n kernel.)
> 
> 5. Rather than randomly dropping into the code, use actual measurements
>    to determine where to

Re: [Patch v2 2/3] kprobes: allow to specify custom allocator for insn caches

2013-08-25 Thread Masami Hiramatsu

(2013/08/23 20:04), Heiko Carstens wrote:
> The current two insn slot caches both use module_alloc/module_free
> to allocate and free insn slot cache pages.
> For s390 this is not sufficient since there is the need to allocate
> insn slots that are either within the vmalloc module area or within
> dma memory.
> Therefore add a mechanism which allows to specify an own allocator
> for an own insn slot cache.
> 

Acked-by: Masami Hiramatsu 

Thank you!

> Signed-off-by: Heiko Carstens 
> ---
>  include/linux/kprobes.h |2 ++
>  kernel/kprobes.c|   20 ++--
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> index 077f653..925eaf2 100644
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -268,6 +268,8 @@ extern void kprobes_inc_nmissed_count(struct kprobe *p);
>  
>  struct kprobe_insn_cache {
>   struct mutex mutex;
> + void *(*alloc)(void);   /* allocate insn page */
> + void (*free)(void *);   /* free insn page */
>   struct list_head pages; /* list of kprobe_insn_page */
>   size_t insn_size;   /* size of instruction slot */
>   int nr_garbage;
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 9e4912d..a0d367a 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -112,6 +112,7 @@ static struct kprobe_blackpoint kprobe_blacklist[] = {
>  struct kprobe_insn_page {
>   struct list_head list;
>   kprobe_opcode_t *insns; /* Page of instruction slots */
> + struct kprobe_insn_cache *cache;
>   int nused;
>   int ngarbage;
>   char slot_used[];
> @@ -132,8 +133,20 @@ enum kprobe_slot_state {
>   SLOT_USED = 2,
>  };
>  
> +static void *alloc_insn_page(void)
> +{
> + return module_alloc(PAGE_SIZE);
> +}
> +
> +static void free_insn_page(void *page)
> +{
> + module_free(NULL, page);
> +}
> +
>  struct kprobe_insn_cache kprobe_insn_slots = {
>   .mutex = __MUTEX_INITIALIZER(kprobe_insn_slots.mutex),
> + .alloc = alloc_insn_page,
> + .free = free_insn_page,
>   .pages = LIST_HEAD_INIT(kprobe_insn_slots.pages),
>   .insn_size = MAX_INSN_SIZE,
>   .nr_garbage = 0,
> @@ -182,7 +195,7 @@ kprobe_opcode_t __kprobes *__get_insn_slot(struct 
> kprobe_insn_cache *c)
>* kernel image and loaded module images reside. This is required
>* so x86_64 can correctly handle the %rip-relative fixups.
>*/
> - kip->insns = module_alloc(PAGE_SIZE);
> + kip->insns = c->alloc();
>   if (!kip->insns) {
>   kfree(kip);
>   goto out;
> @@ -192,6 +205,7 @@ kprobe_opcode_t __kprobes *__get_insn_slot(struct 
> kprobe_insn_cache *c)
>   kip->slot_used[0] = SLOT_USED;
>   kip->nused = 1;
>   kip->ngarbage = 0;
> + kip->cache = c;
>   list_add(&kip->list, &c->pages);
>   slot = kip->insns;
>  out:
> @@ -213,7 +227,7 @@ static int __kprobes collect_one_slot(struct 
> kprobe_insn_page *kip, int idx)
>*/
>   if (!list_is_singular(&kip->list)) {
>   list_del(&kip->list);
> - module_free(NULL, kip->insns);
> + kip->cache->free(kip->insns);
>   kfree(kip);
>   }
>   return 1;
> @@ -274,6 +288,8 @@ out:
>  /* For optimized_kprobe buffer */
>  struct kprobe_insn_cache kprobe_optinsn_slots = {
>   .mutex = __MUTEX_INITIALIZER(kprobe_optinsn_slots.mutex),
> + .alloc = alloc_insn_page,
> + .free = free_insn_page,
>   .pages = LIST_HEAD_INIT(kprobe_optinsn_slots.pages),
>   /* .insn_size is initialized later */
>   .nr_garbage = 0,
> 
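
For readers following the patch above, here is a minimal userspace sketch of the same idea: a cache carries its own alloc/free hooks, and the generic code calls only those hooks. It is not taken from the patch. struct insn_cache, the special_* hooks, cache_get_page() and cache_put_page() are hypothetical names, and malloc()/free() stand in for module_alloc()/module_free().

```c
#include <stddef.h>
#include <stdlib.h>

#define SLOT_PAGE_SIZE 4096	/* stand-in for PAGE_SIZE */

/* Simplified stand-in for struct kprobe_insn_cache: only the hooks. */
struct insn_cache {
	void *(*alloc)(void);	/* allocate one insn page */
	void (*free)(void *);	/* free one insn page */
	int nr_pages;		/* pages currently handed out */
};

/* Default hooks; the kernel patch uses module_alloc()/module_free() here. */
static void *default_alloc_page(void)
{
	return malloc(SLOT_PAGE_SIZE);
}

static void default_free_page(void *page)
{
	free(page);
}

/* Hypothetical arch-specific hooks backed by a single static page. */
static char special_page[SLOT_PAGE_SIZE];
static int special_page_busy;

static void *special_alloc_page(void)
{
	if (special_page_busy)
		return NULL;
	special_page_busy = 1;
	return special_page;
}

static void special_free_page(void *page)
{
	if (page == special_page)
		special_page_busy = 0;
}

static struct insn_cache default_cache = {
	.alloc = default_alloc_page,
	.free = default_free_page,
};

static struct insn_cache special_cache = {
	.alloc = special_alloc_page,
	.free = special_free_page,
};

/* Generic code only ever goes through the hooks of the cache it is given. */
static void *cache_get_page(struct insn_cache *c)
{
	void *page = c->alloc();

	if (page)
		c->nr_pages++;
	return page;
}

static void cache_put_page(struct insn_cache *c, void *page)
{
	c->free(page);
	c->nr_pages--;
}
```

With these definitions, cache_get_page(&special_cache) hands out the static page once and returns NULL until it is put back, while default_cache keeps using the heap. The generic helpers never need to know which backing store is in use, which is the property the s390 use case relies on.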


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: Re: [PATCH] USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support

2013-08-25 Thread Joe Perches

On Mon, 2013-08-26 at 10:14 +0800, liujunliang_ljl wrote:
> do you need me to merge the patch and re-send it again.

I do not.

> And which version of kernel will release this driver.

No idea.

That's up to David Miller or Greg KH to pick
up the driver and maybe take the follow-on
patch I sent.

I just sent the patch because you seemed to
have a bit of difficulty integrating the
suggestions others were giving you.

> Thanks a lot and apologizing for making you trouble.

Oh, no worries.  It was a trifle.

cheers, Joe


[PATCH v3] xHCI: Fixing xhci_readl definition and function call

2013-08-25 Thread Kumar Gaurav

This patch redefines the function xhci_readl. The xhci_readl function doesn't
use its xhci_hcd argument, hence there is no need to keep it in the function
arguments.

Redefining this function breaks the other functions that call it.
This patch also corrects those calls in the xhci driver.

Signed-off-by: Kumar Gaurav 
---
 drivers/usb/host/xhci-dbg.c  |   36 +++
 drivers/usb/host/xhci-hub.c  |   72 +++---
 drivers/usb/host/xhci-mem.c  |   20 -
 drivers/usb/host/xhci-ring.c |   12 ++---
 drivers/usb/host/xhci.c  |  100 +-
 drivers/usb/host/xhci.h  |3 +-
 6 files changed, 121 insertions(+), 122 deletions(-)

diff --git a/drivers/usb/host/xhci-dbg.c b/drivers/usb/host/xhci-dbg.c
index 5d5e58f..66dc1c0 100644
--- a/drivers/usb/host/xhci-dbg.c
+++ b/drivers/usb/host/xhci-dbg.c
@@ -32,7 +32,7 @@ void xhci_dbg_regs(struct xhci_hcd *xhci)
 
xhci_dbg(xhci, "// xHCI capability registers at %p:\n",
xhci->cap_regs);
-   temp = xhci_readl(xhci, &xhci->cap_regs->hc_capbase);
+   temp = xhci_readl(&xhci->cap_regs->hc_capbase);
xhci_dbg(xhci, "// @%p = 0x%x (CAPLENGTH AND HCIVERSION)\n",
&xhci->cap_regs->hc_capbase, temp);
xhci_dbg(xhci, "//   CAPLENGTH: 0x%x\n",
@@ -44,13 +44,13 @@ void xhci_dbg_regs(struct xhci_hcd *xhci)
 
xhci_dbg(xhci, "// xHCI operational registers at %p:\n", xhci->op_regs);
 
-   temp = xhci_readl(xhci, &xhci->cap_regs->run_regs_off);
+   temp = xhci_readl(&xhci->cap_regs->run_regs_off);
xhci_dbg(xhci, "// @%p = 0x%x RTSOFF\n",
&xhci->cap_regs->run_regs_off,
(unsigned int) temp & RTSOFF_MASK);
xhci_dbg(xhci, "// xHCI runtime registers at %p:\n", xhci->run_regs);
 
-   temp = xhci_readl(xhci, &xhci->cap_regs->db_off);
+   temp = xhci_readl(&xhci->cap_regs->db_off);
xhci_dbg(xhci, "// @%p = 0x%x DBOFF\n", &xhci->cap_regs->db_off, temp);
xhci_dbg(xhci, "// Doorbell array at %p:\n", xhci->dba);
 }
@@ -61,7 +61,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
 
xhci_dbg(xhci, "xHCI capability registers at %p:\n", xhci->cap_regs);
 
-   temp = xhci_readl(xhci, &xhci->cap_regs->hc_capbase);
+   temp = xhci_readl(&xhci->cap_regs->hc_capbase);
xhci_dbg(xhci, "CAPLENGTH AND HCIVERSION 0x%x:\n",
(unsigned int) temp);
xhci_dbg(xhci, "CAPLENGTH: 0x%x\n",
@@ -69,7 +69,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
xhci_dbg(xhci, "HCIVERSION: 0x%x\n",
(unsigned int) HC_VERSION(temp));
 
-   temp = xhci_readl(xhci, &xhci->cap_regs->hcs_params1);
+   temp = xhci_readl(&xhci->cap_regs->hcs_params1);
xhci_dbg(xhci, "HCSPARAMS 1: 0x%x\n",
(unsigned int) temp);
xhci_dbg(xhci, "  Max device slots: %u\n",
@@ -79,7 +79,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
xhci_dbg(xhci, "  Max ports: %u\n",
(unsigned int) HCS_MAX_PORTS(temp));
 
-   temp = xhci_readl(xhci, &xhci->cap_regs->hcs_params2);
+   temp = xhci_readl(&xhci->cap_regs->hcs_params2);
xhci_dbg(xhci, "HCSPARAMS 2: 0x%x\n",
(unsigned int) temp);
xhci_dbg(xhci, "  Isoc scheduling threshold: %u\n",
@@ -87,7 +87,7 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
xhci_dbg(xhci, "  Maximum allowed segments in event ring: %u\n",
(unsigned int) HCS_ERST_MAX(temp));
 
-   temp = xhci_readl(xhci, &xhci->cap_regs->hcs_params3);
+   temp = xhci_readl(&xhci->cap_regs->hcs_params3);
xhci_dbg(xhci, "HCSPARAMS 3 0x%x:\n",
(unsigned int) temp);
xhci_dbg(xhci, "  Worst case U1 device exit latency: %u\n",
@@ -95,14 +95,14 @@ static void xhci_print_cap_regs(struct xhci_hcd *xhci)
xhci_dbg(xhci, "  Worst case U2 device exit latency: %u\n",
(unsigned int) HCS_U2_LATENCY(temp));
 
-   temp = xhci_readl(xhci, &xhci->cap_regs->hcc_params);
+   temp = xhci_readl(&xhci->cap_regs->hcc_params);
xhci_dbg(xhci, "HCC PARAMS 0x%x:\n", (unsigned int) temp);
xhci_dbg(xhci, "  HC generates %s bit addresses\n",
HCC_64BIT_ADDR(temp) ? "64" : "32");
/* FIXME */
xhci_dbg(xhci, "  FIXME: more HCCPARAMS debugging\n");
 
-   temp = xhci_readl(xhci, &xhci->cap_regs->run_regs_off);
+   temp = xhci_readl(&xhci->cap_regs->run_regs_off);
xhci_dbg(xhci, "RTSOFF 0x%x:\n", temp & RTSOFF_MASK);
 }
 
@@ -110,7 +110,7 @@ static void xhci_print_command_reg(struct xhci_hcd *xhci)
 {
u32 temp;
 
-   temp = xhci_readl(xhci, &xhci->op_regs->command);
+   temp = xhci_readl(&xhci->op_regs->command);
xhci_dbg(xhci, "USBCMD 0x%x:\n", temp);
xhci_dbg(xhci, "  HC is %s\n",
(temp & CMD_RUN) ? "running" : "being stopped");
@@ -128,7 +128,7 @@ static void xhci_print_status(struct xhci_hcd

Re: [PATCH 2/3 v2] Refactor msi/msix restore code Part2

2013-08-25 Thread Zhenzhong Duan



On 2013-08-24 01:15, Konrad Rzeszutek Wilk wrote:
> On Thu, Aug 22, 2013 at 03:14:34PM -0600, Bjorn Helgaas wrote:
>> On Mon, Aug 5, 2013 at 1:21 AM, Zhenzhong Duan wrote:
>>> xen_initdom_restore_msi_irqs trigger a hypercall to restore addr/data/mask
>>> in dom0. It's better to do the same in default_restore_msi_irqs for baremetal.
>>>
>>> Move restore of mask in default_restore_msi_irqs, this could avoid mask
>>> restored twice in dom0, and the logic for baremetal keep same.
>>>
>>> First mask restore is in xen_initdom_restore_msi_irqs->PHYSDEVOP_restore_msi,
>>> Second restore is __pci_restore_msix_state->msix_mask_irq.
>>>
>>> Mask bits are under full control of xen, and the entry->masked in dom0 kernel
>>> is invalid. restore an invalid value to mask register could mask the msix
>>> interrupt.
>>>
>>> Without fix, qlcnic driver calling pci_reset_function will lost interrupt
>>> in dom0.
>>
>> Konrad, this changelog still doesn't make any sense to me, but if you
>> ack this, I guess I can apply it.
>
> Hey Bjorn,
>
> Zhenzhong is patiently working to rewrite up the commit message based on
> my naive questions and emails back and forth. Once it is good shape he
> will post it. The code will look the same but the commit message will
> be a bit more verbose and clear.
>
> Is there an ETA when you would like these? I recall the merge window
> is just around the corner - so when is your comfortable cut-off-day
> so that you can make a go/no-go decision?
>
>> I guess there are also:
>>
>>    Jul 24  [PATCH 1/3] Refactor msi/msix restore code Part1
>>    Jul 30  [PATCH 3/3 v2] Update x86_msi.restore_msi_irqs API param
>>
>> and all three should be applied as a series?
>
> I think the
> Jul 30  [PATCH 3/3 v2] Update x86_msi.restore_msi_irqs API param
>
> can go in anytime. That is mostly a cosmetic fixup in the API.
> Zhenzhong - right?

Yes, the 3rd patch doesn't depend on the first two.
One effect is to make the code look consistent, but my main purpose is to
optimize the msix restore code in the initial domain.
Before the patch, dom0 calls the PHYSDEVOP_restore_msi hypercall for every
entry in dev->msi_list; now it calls the PHYSDEVOP_restore_msi hypercall once.

Re: [PATCH 1/2] HID: apple: Add another device ID for the mid-2013 Macbook Air

2013-08-25 Thread Ian Munsie

> Brad, Linus, does the above patch work for you as well as for Ian?

That would work fine for me, but I guess we need confirmation from
someone with the ISO or JIS layouts.

Cheers,
-Ian

-- 
http://sites.google.com/site/DarkStarJunkSpace
--
http://darkstarshout.blogspot.com/
--
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

Re: Re: [RFC PATCH 08/11] trace-cmd: Apply the trace-msg protocol for communication between a server and clients

2013-08-25 Thread Yoshihiro YUNOMAE


(2013/08/21 2:56), Steven Rostedt wrote:
> On Mon, 19 Aug 2013 18:46:39 +0900
> Yoshihiro YUNOMAE wrote:
>
>> This message protocol is incompatible with the previous unstructured message
>> protocol. So, if an old(new)-version client tries to connect to an
>> new(old)-version server, the operation should be stopped.
>
> I'm a stickler for backward compatibility. I'm all for extensions.

No problem :)
I also worried about backward compatibility.

> I know this will just complicate things, but I don't mind that. What
> should happen is, it should try to connect with the new protocol, if it
> fails due to an older server, then it needs to fall back to the older
> method, without the added features. We can freeze the older method if
> need be. But I will not let a newer trace-cmd become incompatible with
> an older version. I worked hard to keep it that way. There's only a few
> exceptions to that.
>
> Note, an older client needs to also work as is with a newer server.
>
> Anyway, the old way only needs to stay the same, it does not need added
> features. For that, a switch to the new way is needed.


OK, I'll add code to switch to the new way so that backward compatibility
is kept. Could you give me your comments on the following?


0. old server and old client
Old servers send "tracecmd" as the first message.
Old clients compare the first 8 bytes of the first message with "tracecmd".

1. new server
- Send "tracecmd-v2" as the first message.
- Check whether the reply message is "tracecmd-v2" or a cpus value.
  If "tracecmd-v2", the server uses the new protocol and waits for the
  message MSG_TINIT.
  If a cpus value, the server uses the old protocol.

2. new client
- Receive the first message.
- Check whether the message is "tracecmd-v2" or not.
  If "tracecmd-v2", the client sends "tracecmd-v2" to the server. Then,
  the client sends the message MSG_TINIT.
  If "tracecmd", the client sends the cpus value as in the old protocol.
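
The handshake outlined above could be sketched roughly as follows. This is only an illustration of the proposal, not trace-cmd code: the function and enum names are made up for the example, and it assumes each greeting or reply arrives as a NUL-terminated string.

```c
#include <string.h>

#define MAGIC_V1 "tracecmd"	/* greeting sent by old servers */
#define MAGIC_V2 "tracecmd-v2"	/* greeting proposed for new servers */

enum proto { PROTO_UNKNOWN, PROTO_V1, PROTO_V2 };

/* New client: classify the server's first message. */
static enum proto client_pick_proto(const char *greeting)
{
	if (strcmp(greeting, MAGIC_V2) == 0)
		return PROTO_V2;	/* reply "tracecmd-v2", then send MSG_TINIT */
	if (strcmp(greeting, MAGIC_V1) == 0)
		return PROTO_V1;	/* reply with the cpus value, as before */
	return PROTO_UNKNOWN;
}

/* New server: classify the client's reply to the "tracecmd-v2" greeting. */
static enum proto server_pick_proto(const char *reply)
{
	if (strcmp(reply, MAGIC_V2) == 0)
		return PROTO_V2;	/* wait for MSG_TINIT */
	return PROTO_V1;		/* anything else is taken as the cpus value */
}

/* Old client: only the first 8 bytes are compared with "tracecmd". */
static int old_client_accepts(const char *greeting)
{
	return strncmp(greeting, MAGIC_V1, 8) == 0;
}
```

Note that old_client_accepts("tracecmd-v2") is true, because an old client only looks at the first 8 bytes. That is what would let a new server greet old and new clients with the same first message and then decide the protocol from the reply.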


>> This is new feature, so trace-cmd does not need to keep backward
>> compatibility.
>
> If you need help in accomplishing this, I'll work with you on that.


Thank you for your kindness!

Yoshihiro YUNOMAE

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com



Re: [Patch v2 1/3] kprobes: unify insn caches

2013-08-25 Thread Masami Hiramatsu

(2013/08/23 20:04), Heiko Carstens wrote:
> The two insn caches (insn, and optinsn) each have an own mutex and
> alloc/free functions (get_[opt]insn_slot() / free_[opt]insn_slot()).
> 
> Since there is the need for yet another insn cache which satisfies
> dma allocations on s390, unify and simplify the current implementation:
> 
> - Move the per insn cache mutex into struct kprobe_insn_cache.
> - Move the alloc/free functions to kprobe.h so they are simply
>   wrappers for the generic __get_insn_slot/__free_insn_slot functions.
>   The implementation is done with a DEFINE_INSN_CACHE_OPS() macro
>   which provides the alloc/free functions for each cache if needed.
> - move the struct kprobe_insn_cache to kprobe.h which allows to generate
>   architecture specific insn slot caches outside of the core kprobes
>   code.
> 

Looks good to me :)

Acked-by: Masami Hiramatsu 

Thank you!

> Signed-off-by: Heiko Carstens 
> ---
>  include/linux/kprobes.h |   32 +---
>  kernel/kprobes.c|   75 
> +--
>  2 files changed, 49 insertions(+), 58 deletions(-)
> 
> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> index ca1d27a..077f653 100644
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -264,10 +264,34 @@ extern void arch_arm_kprobe(struct kprobe *p);
>  extern void arch_disarm_kprobe(struct kprobe *p);
>  extern int arch_init_kprobes(void);
>  extern void show_registers(struct pt_regs *regs);
> -extern kprobe_opcode_t *get_insn_slot(void);
> -extern void free_insn_slot(kprobe_opcode_t *slot, int dirty);
>  extern void kprobes_inc_nmissed_count(struct kprobe *p);
>  
> +struct kprobe_insn_cache {
> + struct mutex mutex;
> + struct list_head pages; /* list of kprobe_insn_page */
> + size_t insn_size;   /* size of instruction slot */
> + int nr_garbage;
> +};
> +
> +extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
> +extern void __free_insn_slot(struct kprobe_insn_cache *c,
> +  kprobe_opcode_t *slot, int dirty);
> +
> +#define DEFINE_INSN_CACHE_OPS(__name)					\
> +extern struct kprobe_insn_cache kprobe_##__name##_slots;		\
> +									\
> +static inline kprobe_opcode_t *get_##__name##_slot(void)		\
> +{									\
> +	return __get_insn_slot(&kprobe_##__name##_slots);		\
> +}									\
> +									\
> +static inline void free_##__name##_slot(kprobe_opcode_t *slot, int dirty)\
> +{									\
> +	__free_insn_slot(&kprobe_##__name##_slots, slot, dirty);	\
> +}									\
> +
> +DEFINE_INSN_CACHE_OPS(insn);
> +
>  #ifdef CONFIG_OPTPROBES
>  /*
>   * Internal structure for direct jump optimized probe
> @@ -287,13 +311,13 @@ extern void arch_optimize_kprobes(struct list_head 
> *oplist);
>  extern void arch_unoptimize_kprobes(struct list_head *oplist,
>   struct list_head *done_list);
>  extern void arch_unoptimize_kprobe(struct optimized_kprobe *op);
> -extern kprobe_opcode_t *get_optinsn_slot(void);
> -extern void free_optinsn_slot(kprobe_opcode_t *slot, int dirty);
>  extern int arch_within_optimized_kprobe(struct optimized_kprobe *op,
>   unsigned long addr);
>  
>  extern void opt_pre_handler(struct kprobe *p, struct pt_regs *regs);
>  
> +DEFINE_INSN_CACHE_OPS(optinsn);
> +
>  #ifdef CONFIG_SYSCTL
>  extern int sysctl_kprobes_optimization;
>  extern int proc_kprobes_optimization_handler(struct ctl_table *table,
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 6e33498..9e4912d 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -121,12 +121,6 @@ struct kprobe_insn_page {
>   (offsetof(struct kprobe_insn_page, slot_used) + \
>(sizeof(char) * (slots)))
>  
> -struct kprobe_insn_cache {
> - struct list_head pages; /* list of kprobe_insn_page */
> - size_t insn_size;   /* size of instruction slot */
> - int nr_garbage;
> -};
> -
>  static int slots_per_page(struct kprobe_insn_cache *c)
>  {
>   return PAGE_SIZE/(c->insn_size * sizeof(kprobe_opcode_t));
> @@ -138,8 +132,8 @@ enum kprobe_slot_state {
>   SLOT_USED = 2,
>  };
>  
> -static DEFINE_MUTEX(kprobe_insn_mutex);  /* Protects kprobe_insn_slots */
> -static struct kprobe_insn_cache kprobe_insn_slots = {
> +struct kprobe_insn_cache kprobe_insn_slots = {
> + .mutex = __MUTEX_INITIALIZER(kprobe_insn_slots.mutex),
>   .pages = LIST_HEAD_INIT(kprobe_insn_slots.pages),
>   .insn_size = MAX_INSN_SIZE,
>   .nr_garbage = 0,
> @@ -150,10 +144,12 @@

Re: Re: [RFC PATCH 07/11] [CLEANUP] trace-cmd: Split out binding a port and fork reader from open_udp()

2013-08-25 Thread Yoshihiro YUNOMAE


(2013/08/21 2:49), Steven Rostedt wrote:

On Mon, 19 Aug 2013 18:46:37 +0900
Yoshihiro YUNOMAE  wrote:


Split out binding a port and forking the reader from open_udp() to avoid
duplicating code between listen mode and virt-server mode.

Signed-off-by: Yoshihiro YUNOMAE 
---
  trace-listen.c |   34 ++
  1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/trace-listen.c b/trace-listen.c
index f29dd35..bf9ef9d 100644
--- a/trace-listen.c
+++ b/trace-listen.c
@@ -228,13 +228,12 @@ static void process_udp_child(int sfd, const char *host, 
const char *port,
  #define START_PORT_SEARCH 1500
  #define MAX_PORT_SEARCH 6000

-static int open_udp(const char *node, const char *port, int *pid,
-   int cpu, int pagesize, int start_port)
+static int udp_bind_a_port(int start_port, int *sfd)
  {
struct addrinfo hints;
struct addrinfo *result, *rp;
-   int sfd, s;
char buf[BUFSIZ];
+   int s;
int num_port = start_port;

   again:
@@ -250,15 +249,15 @@ static int open_udp(const char *node, const char *port, 
int *pid,
pdie("getaddrinfo: error opening udp socket");

for (rp = result; rp != NULL; rp = rp->ai_next) {
-   sfd = socket(rp->ai_family, rp->ai_socktype,
-rp->ai_protocol);
-   if (sfd < 0)
+   *sfd = socket(rp->ai_family, rp->ai_socktype,
+ rp->ai_protocol);
+   if (*sfd < 0)
continue;

-   if (bind(sfd, rp->ai_addr, rp->ai_addrlen) == 0)
+   if (bind(*sfd, rp->ai_addr, rp->ai_addrlen) == 0)
break;

-   close(sfd);
+   close(*sfd);
}

if (rp == NULL) {
@@ -270,6 +269,12 @@ static int open_udp(const char *node, const char *port, 
int *pid,

freeaddrinfo(result);

+   return num_port;
+}
+
+static void fork_udp_reader(int sfd, const char *node, const char *port,
+   int *pid, int cpu, int pagesize)
+{
*pid = fork();

if (*pid < 0)
@@ -279,6 +284,19 @@ static int open_udp(const char *node, const char *port, 
int *pid,
process_udp_child(sfd, node, port, cpu, pagesize);

close(sfd);
+}
+
+static int open_udp(const char *node, const char *port, int *pid,
+   int cpu, int pagesize, int start_port)
+{
+   int sfd;
+   int num_port;
+
+   num_port = udp_bind_a_port(start_port, &sfd);
+   if (num_port < 0)
+   return num_port;


I don't see how num_port could be less than zero.


I think so too, but trace-cmd checks whether udp_port is less than zero
in create_all_readers().

May I submit a removal patch?

Thanks,
Yoshihiro YUNOMAE

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com



Re: Re: [RFC PATCH 00/11] trace-cmd: Support the feature recording trace data of guests on the host

2013-08-25 Thread Yoshihiro YUNOMAE


Hi Steven,

Thank you for reviewing my patches.
Sorry for the late reply.

(2013/08/21 1:00), Steven Rostedt wrote:

On Mon, 19 Aug 2013 18:46:20 +0900
Yoshihiro YUNOMAE  wrote:



d) A feature to merge trace data of multiple guests and a host in chronological
order
  Current trace-cmd cannot merge trace data of multiple guests and a host in
chronological order. A user who wants to analyze an I/O delay problem of a
guest will want to check the trace data of all guests and the host in one
file. However, because trace-cmd does not support a merge feature yet, the user
must write a merging script. So trace-cmd should support a merge feature for
multiple files for virtualization.



This is incorrect. trace-cmd already has a merge feature for multiple
machines.

If you have two boxes that are in sync by ntp, you can do the following:

On box1:

  trace-cmd record --date -o trace-box1.dat -e all ping box2

On box2:

  trace-cmd record --date -o trace-box2.dat -e all ping box1


And then copy over trace-box2.dat to box1 and run

  trace-cmd report -i trace-box1.dat -i trace-box2.dat

And you will see a merge. I just did this on two of my boxes called ixf
and bxtest and here's a partial output:

trace-bxtest.dat:trace-cmd-1348  [003] 1377013771.682807: sys_enter:
NR 2 (22e2b00, 241, 1a4, 1a, 4361ac, 3)
trace-bxtest.dat:trace-cmd-1348  [003] 1377013771.682808: 
sys_enter_open:   filename: 0x022e2b00, flags: 0x0241, mode: 0x01a4
trace-ixf.dat:   -0 [002] 1377013771.682808: 
hrtimer_cancel:   hrtimer=0x880002110820
trace-ixf.dat:   -0 [002] 1377013771.682808: 
hrtimer_expire_entry: hrtimer=0x880002110820 now=673528250850 
function=tick_sched_timer/0x0
trace-bxtest.dat:trace-cmd-1348  [003] 1377013771.682809:
kmem_cache_alloc: (getname_flags+0x37) call_site=8117b797 
ptr=0x8800d38c bytes_req=4096 bytes_alloc=4096 
gfp_flags=GFP_KERNELGFP_NOTRACK


The --date option is used because the two machines are not in sync with
the trace time stamp. What the date option does, is to sync the
timestamp up with the gettimeofday and the output reports that. This
allows the two boxes to report information that is relatively close to
how the two interacted.
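
As a rough illustration of the idea (not trace-cmd's actual implementation),
merging two traces amounts to shifting each file's timestamps onto a common
clock and then doing an ordinary two-way merge. The struct, the function name
and the per-file offsets below are hypothetical; timestamps are assumed to be
in microseconds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record: timestamp in microseconds plus a source id. */
struct event {
	long long ts;
	int src;
};

/*
 * Merge two timestamp-sorted arrays into out[] (capacity na + nb) after
 * shifting each input by its own offset onto a common clock.
 * Returns the number of events written.
 */
size_t merge_events(const struct event *a, size_t na, long long off_a,
		    const struct event *b, size_t nb, long long off_b,
		    struct event *out)
{
	size_t i = 0, j = 0, n = 0;

	assert(out != NULL);
	while (i < na || j < nb) {
		int take_a;

		if (j >= nb)
			take_a = 1;
		else if (i >= na)
			take_a = 0;
		else
			take_a = (a[i].ts + off_a <= b[j].ts + off_b);

		if (take_a) {
			out[n] = a[i++];
			out[n].ts += off_a;
		} else {
			out[n] = b[j++];
			out[n].ts += off_b;
		}
		n++;
	}
	return n;
}
```

The quality of the merged order depends entirely on how accurate the two
offsets are, which is why the clock source of each machine matters here.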


Oh, I didn't know about the --date option.
As you mentioned, we can merge trace data in chronological order by
using the --date option if the clocks of those machines are synchronized by
NTP.


If the guest and the host have the same clock, then the --date option
is not needed and the two should be able to be merged normally.


No, we cannot assume that the guest and the host have the same clock
even if they are running on the same physical machine, because the two kernels
do not share a clock and there is some difference between them. So we still
need to synchronize guest and host time by NTP and use the --date option.

However, there are cases where the clocks of those machines cannot be
synchronized. For example, although multiple users can run guests in
virtualization environments (e.g. multi-tenant cloud hosting), there
is no guarantee that they use the same NTP server. Moreover, even if
the clocks are synchronized, trace data cannot be merged exactly because
the NTP-synchronized time granularity may not be fine enough for
sorting guest-host switching events.


Also, I haven't released it yet (will soon), but trace-cmd handles
multiple buffers too. That is, with the multiple buffers that ftrace
has, it will create and read from them as well as report them.


Is it commit ID d56f30679f9811a91ed471c8e081cc7ffbed1e62?
We can download the feature from your git repository.


I'll finish my testing on all the latest features of trace-cmd I have
and push it out later today.

I'll also take a look at the rest of your patches.


Thank you!

Yoshihiro YUNOMAE

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com



Re: [PATCH v6 2/3] mmc: dw_mmc: Honor requests to set the clock to 0 (turn off clock)

2013-08-25 Thread Jaehoon Chung

Hi Doug,

On 08/24/2013 05:40 AM, Doug Anderson wrote:
> Jaehoon,
> 
> On Fri, Aug 23, 2013 at 6:21 AM, Jaehoon Chung  wrote:
>> Hi Doug,
>>
>> If clock gating is enabled, the kernel message for Bus_speed may be
>> printed continuously.
> 
> Can you explain?  I don't think dw_mmc has support for clock gating
> right now.  ...or are there some patches that I'm not aware of?  I
> could believe that if you've got some non-upstream clock gating
> patches that these would need to be modified to handle it...  ...but
> unless those are slated to land upstream it seems like I can't take
> them into account, can I?
If I enable CONFIG_MMC_CLK_GATE, I see the message below whenever some
operation is run.
I will test more with your patch.

[6.335000] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.345000] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.355000] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.365000] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.375000] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.48] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.49] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.50] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.53] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.535000] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)
[6.545000] mmc_host mmc1: Bus speed (slot 0) = 1Hz (slot req 
5200Hz, actual 5000HZ div = 1)

Best Regards,
Jaehoon Chung
> 
> Thanks!
> 
> -Doug

Re: [PATCH v2] tile: support KVM for tilegx

2013-08-25 Thread Chris Metcalf

On 8/25/2013 7:39 AM, Gleb Natapov wrote:
> On Mon, Aug 12, 2013 at 04:24:11PM -0400, Chris Metcalf wrote:
>> This change provides the initial framework support for KVM on tilegx.
>> Basic virtual disk and networking is supported.
>>
> This needs to be broken down to more reviewable patches.

I already broke out one pre-requisite patch that wasn't strictly KVM-related:

https://lkml.org/lkml/2013/8/12/339

In addition, we've separately arranged to support booting our kernels in a way 
that is compatible with the Tilera booter running at the highest privilege 
level, which enables multiple kernel privilege levels:

https://lkml.org/lkml/2013/5/2/468

How would you recommend further breaking down this patch?  It's pretty much 
just the basic support for minimal KVM.  I suppose I could break out all the 
I/O related stuff into a separate patch, though it wouldn't amount to much; 
perhaps the console could also be broken out separately.  Any other suggestions?

> Also can you
> describe the implementation a little bit? Does the tile arch have a
> virtualization extension that this implementation uses, or is it a
> trap-and-emulate approach? If the latter, does it run unmodified guest
> kernels? What userspace are you using with this implementation?

We could do full virtualization via trap and emulate, but we've elected to do a 
para-virtualized approach.  Userspace runs at PL (privilege level) 0, the guest 
kernel runs at PL1, and the host runs at PL2.  We have available per-PL 
resources for various things, and take advantage of having two on-chip timers 
(for example) to handle timing for the host and guest kernels.  We run the same 
userspace with either the host or the guest.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com


[PATCH v3 1/8] mm/hwpoison: fix lose PG_dirty flag for errors on mlocked pages

2013-08-25 Thread Wanpeng Li

memory_failure() stores the page flags of the error page before unmapping it,
and (only) if the first check against the current page flags decides that the
error page is unknown does it perform a second check against the stored page
flags. This is needed because memory_failure() unmaps the error page before
calling page_action(), and the unmapping changes the page state: in particular,
page_remove_rmap() (called from try_to_unmap_one()) clears PG_mlocked, so
page_action() can't catch mlocked pages after that.

However, memory_failure() can't handle memory errors on dirty mlocked pages
correctly. try_to_unmap_one() moves the dirty bit from the pte to the physical
page, and the second check loses it because it checks the stored page flags.
This patch fixes that by restoring the PG_dirty flag into the stored page flags
if the page is dirty.
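
A small userspace model of the two-pass lookup may make the problem easier to
see. It is only a sketch: the bit positions, the three-entry table and the
function names are made up for illustration and are not the kernel's
error_states[] table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit positions, for illustration only. */
#define F_DIRTY   (1UL << 0)
#define F_MLOCKED (1UL << 1)

struct state {
	unsigned long mask, res;
	const char *name;
};

/* First match wins; the final all-zero entry matches any flags. */
static const struct state states[] = {
	{ F_MLOCKED | F_DIRTY, F_MLOCKED | F_DIRTY, "dirty mlocked" },
	{ F_MLOCKED | F_DIRTY, F_MLOCKED,           "clean mlocked" },
	{ 0,                   0,                   "unknown" },
};

const char *classify(unsigned long flags)
{
	const struct state *ps;

	for (ps = states;; ps++)
		if ((flags & ps->mask) == ps->res)
			return ps->name;
}

/*
 * saved:   flags recorded before the unmap
 * now:     flags after the unmap (mlocked cleared, dirty moved to the page)
 * patched: whether the dirty bit is copied back into the saved flags
 */
const char *classify_after_unmap(unsigned long saved, unsigned long now,
				 int patched)
{
	const char *name = classify(now);

	assert(name != NULL);
	if (patched)
		saved |= now & F_DIRTY;
	if (strcmp(name, "unknown") == 0)
		name = classify(saved);
	return name;
}
```

With saved = F_MLOCKED and now = F_DIRTY, the unpatched lookup reports
"clean mlocked" and the patched one reports "dirty mlocked".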

Testcase:

#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#define PAGES_TO_TEST 2
#define PAGE_SIZE   4096

int main(void)
{
	char *mem;
	int i;

	mem = mmap(NULL, PAGES_TO_TEST * PAGE_SIZE,
		   PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, 0, 0);

	for (i = 0; i < PAGES_TO_TEST; i++)
		mem[i * PAGE_SIZE] = 'a';

	if (madvise(mem, PAGES_TO_TEST * PAGE_SIZE, MADV_HWPOISON) == -1)
		return -1;

	return 0;
}

Before patch:

[  912.839247] Injecting memory failure for page 7dfb8 at 7f6b4e37b000
[  912.839257] MCE 0x7dfb8: clean mlocked LRU page recovery: Recovered
[  912.845550] MCE 0x7dfb8: clean mlocked LRU page still referenced by 1 users
[  912.852586] Injecting memory failure for page 7e6aa at 7f6b4e37c000
[  912.852594] MCE 0x7e6aa: clean mlocked LRU page recovery: Recovered
[  912.858936] MCE 0x7e6aa: clean mlocked LRU page still referenced by 1 users

After patch:

[  163.590225] Injecting memory failure for page 91bc2f at 7f9f5b0e5000
[  163.590264] MCE 0x91bc2f: dirty mlocked LRU page recovery: Recovered
[  163.596680] MCE 0x91bc2f: dirty mlocked LRU page still referenced by 1 users
[  163.603831] Injecting memory failure for page 91cdd3 at 7f9f5b0e6000
[  163.603852] MCE 0x91cdd3: dirty mlocked LRU page recovery: Recovered
[  163.610305] MCE 0x91cdd3: dirty mlocked LRU page still referenced by 1 users

Reviewed-by: Naoya Horiguchi 
Signed-off-by: Wanpeng Li 
---
 mm/memory-failure.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 2c13aa7..d5686d4 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1204,6 +1204,9 @@ int memory_failure(unsigned long pfn, int trapno, int 
flags)
for (ps = error_states;; ps++)
if ((p->flags & ps->mask) == ps->res)
break;
+
+   page_flags |= (p->flags & (1UL << PG_dirty));
+
if (!ps->mask)
for (ps = error_states;; ps++)
if ((page_flags & ps->mask) == ps->res)
-- 
1.8.1.2


[PATCH v3 4/8] mm/hwpoison: replacing atomic_long_sub() with atomic_long_dec()

2013-08-25 Thread Wanpeng Li

Replace atomic_long_sub() with atomic_long_dec() since the page is a
normal page rather than a hugetlbfs page or thp.

Reviewed-by: Naoya Horiguchi 
Signed-off-by: Wanpeng Li 
---
 mm/memory-failure.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a6c4752..297965e 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1363,7 +1363,7 @@ int unpoison_memory(unsigned long pfn)
return 0;
}
if (TestClearPageHWPoison(p))
-   atomic_long_sub(nr_pages, &num_poisoned_pages);
+   atomic_long_dec(&num_poisoned_pages);
pr_info("MCE: Software-unpoisoned free page %#lx\n", pfn);
return 0;
}
-- 
1.8.1.2


[PATCH v3 7/8] mm/hwpoison: add '#' to madvise_hwpoison

2013-08-25 Thread Wanpeng Li

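Add the '#' flag to the %lx format specifiers in madvise_hwpoison() so that
the page frame number and address are printed with a 0x prefix.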

Before patch:

[   95.892866] Injecting memory failure for page 19d0 at b7786000
[   95.893151] MCE 0x19d0: non LRU page recovery: Ignored

After patch:

[   95.892866] Injecting memory failure for page 0x19d0 at 0xb7786000
[   95.893151] MCE 0x19d0: non LRU page recovery: Ignored
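
For reference, the same flag behaves identically in userspace printf. The
helper below is a hypothetical illustration, not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format val as hex into buf. With alternate != 0 the '#' flag is used,
 * which prefixes nonzero values with "0x".
 */
int format_hex(char *buf, size_t size, unsigned long val, int alternate)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, alternate ? "%#lx" : "%lx", val);
}
```

Note that the '#' flag adds the prefix only to nonzero values, so a value of 0
is still printed as plain "0".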

Reviewed-by: Naoya Horiguchi 
Signed-off-by: Wanpeng Li 
---
 mm/madvise.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 95795df..588bb19 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -353,14 +353,14 @@ static int madvise_hwpoison(int bhv, unsigned long start, 
unsigned long end)
if (ret != 1)
return ret;
if (bhv == MADV_SOFT_OFFLINE) {
-   printk(KERN_INFO "Soft offlining page %lx at %lx\n",
+   pr_info("Soft offlining page %#lx at %#lx\n",
page_to_pfn(p), start);
ret = soft_offline_page(p, MF_COUNT_INCREASED);
if (ret)
break;
continue;
}
-   printk(KERN_INFO "Injecting memory failure for page %lx at 
%lx\n",
+   pr_info("Injecting memory failure for page %#lx at %#lx\n",
   page_to_pfn(p), start);
/* Ignore return value for now */
memory_failure(page_to_pfn(p), 0, MF_COUNT_INCREASED);
-- 
1.8.1.2


[PATCH v3 2/8] mm/hwpoison: don't need to hold compound lock for hugetlbfs page

2013-08-25 Thread Wanpeng Li

v1 -> v2:
 * drop compound_trans_order completely

The compound lock was introduced by commit e9da73d67 ("thp: compound_lock."),
and it is used to serialize put_page() against __split_huge_page_refcount().
In addition, transparent hugepages are split in the hwpoison handler
and just one subpage is poisoned. It is unnecessary to hold the
compound lock for a hugetlbfs page. This patch replaces compound_trans_order()
with compound_order() in the places where the page is a hugetlbfs page.

Reviewed-by: Naoya Horiguchi 
Signed-off-by: Wanpeng Li 
---
 include/linux/mm.h  |   14 --
 mm/memory-failure.c |   12 ++--
 2 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f022460..1745a2a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -489,20 +489,6 @@ static inline int compound_order(struct page *page)
return (unsigned long)page[1].lru.prev;
 }
 
-static inline int compound_trans_order(struct page *page)
-{
-   int order;
-   unsigned long flags;
-
-   if (!PageHead(page))
-   return 0;
-
-   flags = compound_lock_irqsave(page);
-   order = compound_order(page);
-   compound_unlock_irqrestore(page, flags);
-   return order;
-}
-
 static inline void set_compound_order(struct page *page, unsigned long order)
 {
page[1].lru.prev = (void *)order;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 2c13aa7..efa6bd7 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -206,7 +206,7 @@ static int kill_proc(struct task_struct *t, unsigned long 
addr, int trapno,
 #ifdef __ARCH_SI_TRAPNO
si.si_trapno = trapno;
 #endif
-   si.si_addr_lsb = compound_trans_order(compound_head(page)) + PAGE_SHIFT;
+   si.si_addr_lsb = compound_order(compound_head(page)) + PAGE_SHIFT;
 
if ((flags & MF_ACTION_REQUIRED) && t == current) {
si.si_code = BUS_MCEERR_AR;
@@ -983,7 +983,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned 
long pfn,
 static void set_page_hwpoison_huge_page(struct page *hpage)
 {
int i;
-   int nr_pages = 1 << compound_trans_order(hpage);
+   int nr_pages = 1 << compound_order(hpage);
for (i = 0; i < nr_pages; i++)
SetPageHWPoison(hpage + i);
 }
@@ -991,7 +991,7 @@ static void set_page_hwpoison_huge_page(struct page *hpage)
 static void clear_page_hwpoison_huge_page(struct page *hpage)
 {
int i;
-   int nr_pages = 1 << compound_trans_order(hpage);
+   int nr_pages = 1 << compound_order(hpage);
for (i = 0; i < nr_pages; i++)
ClearPageHWPoison(hpage + i);
 }
@@ -1336,7 +1336,7 @@ int unpoison_memory(unsigned long pfn)
return 0;
}
 
-   nr_pages = 1 << compound_trans_order(page);
+   nr_pages = 1 << compound_order(page);
 
if (!get_page_unless_zero(page)) {
/*
@@ -1491,7 +1491,7 @@ static int soft_offline_huge_page(struct page *page, int 
flags)
} else {
set_page_hwpoison_huge_page(hpage);
dequeue_hwpoisoned_huge_page(hpage);
-   atomic_long_add(1 << compound_trans_order(hpage),
+   atomic_long_add(1 << compound_order(hpage),
				&num_poisoned_pages);
}
return ret;
@@ -1551,7 +1551,7 @@ int soft_offline_page(struct page *page, int flags)
if (PageHuge(page)) {
set_page_hwpoison_huge_page(hpage);
dequeue_hwpoisoned_huge_page(hpage);
-   atomic_long_add(1 << compound_trans_order(hpage),
+   atomic_long_add(1 << compound_order(hpage),
				&num_poisoned_pages);
} else {
SetPageHWPoison(page);
-- 
1.7.5.4


[PATCH v3 5/8] mm/hwpoison: don't set migration type twice to avoid hold heavy contend zone->lock

2013-08-25 Thread Wanpeng Li

v1 -> v2:
 * add more explanation in patch description.
v2 -> v3:
 * set MIGRATE_ISOLATE only if it's not set. 

Setting the pageblock migration type takes zone->lock, which is heavily
contended in the system, to avoid races. However, soft offline page sets the
pageblock migration type twice during get page if the page is in use, is not a
hugetlbfs page, and is not on the lru list. It is unnecessary to set the
pageblock migration type and take the heavily contended zone->lock again if the
first round of get page has already set the pageblock to the right migration
type.

The trick here is that the migration type is MIGRATE_ISOLATE. There are two
other parts that can change MIGRATE_ISOLATE besides hwpoison. One is memory
hotplug; however, we hold lock_memory_hotplug(), which avoids the race. The
second is CMA, which unmovable page allocation requests can't fall back to. So
it's safe here.
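
The pattern is a plain check before taking the lock. The toy model below only
counts how often the lock would be taken; the types and names are hypothetical
and no real locking is done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pageblock migration types. */
enum migratetype { MT_MOVABLE, MT_ISOLATE };

struct pageblock {
	enum migratetype type;
	unsigned int lock_taken;	/* times the modelled zone lock was taken */
};

/* Models a setter that always takes the contended lock. */
void set_isolate(struct pageblock *pb)
{
	assert(pb != NULL);
	pb->lock_taken++;
	pb->type = MT_ISOLATE;
}

/* Models the patched call site: skip the setter when nothing would change. */
void set_isolate_if_needed(struct pageblock *pb)
{
	assert(pb != NULL);
	if (pb->type != MT_ISOLATE)
		set_isolate(pb);
}
```

The unlocked read is only safe because of the argument above: nothing else is
expected to change the type to or from MIGRATE_ISOLATE concurrently.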

Signed-off-by: Wanpeng Li 
---
 mm/memory-failure.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 297965e..f357c91 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1426,7 +1426,8 @@ static int __get_any_page(struct page *p, unsigned long 
pfn, int flags)
 * was free. This flag should be kept set until the source page
 * is freed and PG_hwpoison on it is set.
 */
-   set_migratetype_isolate(p, true);
+   if (get_pageblock_migratetype(p) != MIGRATE_ISOLATE)
+   set_migratetype_isolate(p, true);
/*
 * When the target page is a free hugepage, just remove it
 * from free hugepage list.
-- 
1.8.1.2


[PATCH v3 8/8] mm/hwpoison: fix memory failure still hold reference count after unpoison empty zero page

2013-08-25 Thread Wanpeng Li

madvise hwpoison injection will poison the read-only empty zero page if there
is no write access before the poison. The empty zero page's reference count is
increased for hwpoison; subsequent poisoning of the zero page returns directly
since the page already has PG_hwpoison set, but the page reference count is
still increased by get_user_pages_fast(). The unpoison process unpoisons the
empty zero page and decreases the reference count successfully the first time;
however, subsequent unpoisoning of the empty zero page returns directly since
the page has already been unpoisoned, without decreasing the page reference
count of the empty zero page.
This patch fixes it by decreasing the page reference count for an empty zero
page which has already been unpoisoned and has page count > 1.
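
The accounting can be modelled in a few lines of userspace C. This is a
deliberately simplified sketch: struct zpage and both functions are
hypothetical, and the counter starts at zero only to keep the numbers small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the zero page: a reference count and a poison flag. */
struct zpage {
	int count;
	bool poisoned;
};

/* Every injection takes a reference; only the first one changes the flag. */
void inject(struct zpage *p)
{
	assert(p != NULL);
	p->count++;
	p->poisoned = true;
}

/* patched selects whether the extra put for an unpoisoned page is done. */
void unpoison(struct zpage *p, bool patched)
{
	assert(p != NULL);
	if (!p->poisoned) {
		if (patched && p->count > 1)
			p->count--;
		return;
	}
	p->poisoned = false;
	p->count--;
}
```

After three injections, three unpatched unpoison calls leave the count at
2, 2, 2, while the patched variant gives 2, 1, 1.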

Testcase:

#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#define PAGES_TO_TEST 3
#define PAGE_SIZE   4096

int main(void)
{
	char *mem;
	int i;

	mem = mmap(NULL, PAGES_TO_TEST * PAGE_SIZE,
		   PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);

	if (madvise(mem, PAGES_TO_TEST * PAGE_SIZE, MADV_HWPOISON) == -1)
		return -1;

	munmap(mem, PAGES_TO_TEST * PAGE_SIZE);

	return 0;
}

Add printk to dump page reference count:

[   93.075959] Injecting memory failure for page 0x19d0 at 0xb77d8000
[   93.076207] MCE 0x19d0: non LRU page recovery: Ignored
[   93.076209] pfn 0x19d0, page count = 1 after memory failure
[   93.076220] Injecting memory failure for page 0x19d0 at 0xb77d9000
[   93.076221] MCE 0x19d0: already hardware poisoned
[   93.076222] pfn 0x19d0, page count = 2 after memory failure
[   93.076224] Injecting memory failure for page 0x19d0 at 0xb77da000
[   93.076224] MCE 0x19d0: already hardware poisoned
[   93.076225] pfn 0x19d0, page count = 3 after memory failure

Before patch:

[  139.197474] MCE: Software-unpoisoned page 0x19d0
[  139.197479] pfn 0x19d0, page count = 2 after unpoison memory
[  150.478130] MCE: Page was already unpoisoned 0x19d0
[  150.478135] pfn 0x19d0, page count = 2 after unpoison memory
[  151.548288] MCE: Page was already unpoisoned 0x19d0
[  151.548292] pfn 0x19d0, page count = 2 after unpoison memory

After patch:

[  116.022122] MCE: Software-unpoisoned page 0x19d0
[  116.022127] pfn 0x19d0, page count = 2 after unpoison memory
[  117.256163] MCE: Page was already unpoisoned 0x19d0
[  117.256167] pfn 0x19d0, page count = 1 after unpoison memory
[  117.917772] MCE: Page was already unpoisoned 0x19d0
[  117.91] pfn 0x19d0, page count = 1 after unpoison memory

Signed-off-by: Wanpeng Li 
---
 mm/memory-failure.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ca714ac..fb687fd 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1335,6 +1335,8 @@ int unpoison_memory(unsigned long pfn)
page = compound_head(p);
 
if (!PageHWPoison(p)) {
+   if (pfn == my_zero_pfn(0) && page_count(p) > 1)
+   put_page(p);
pr_info("MCE: Page was already unpoisoned %#lx\n", pfn);
return 0;
}
-- 
1.8.1.2


[PATCH v3 3/8] mm/hwpoison: fix race against poison thp

2013-08-25 Thread Wanpeng Li

v1 -> v2:
 * unpoison thp fail  

There is a race between hwpoisoning a page and unpoisoning it. memory_failure()
sets the page hwpoison and increases num_poisoned_pages without holding the
page lock, and one page is accounted against the thp in num_poisoned_pages.
However, unpoison can occur before memory_failure() takes the page lock and
splits the transparent hugepage, in which case unpoison decreases
num_poisoned_pages by 1 << compound_order, since memory_failure() has not yet
split the transparent hugepage with the page lock held. That means we account
one page for hwpoison and 1 << compound_order for unpoison. This patch fixes it
by inserting a PageTransHuge check before doing TestClearPageHWPoison, so that
unpoison fails without clearing PageHWPoison or decreasing num_poisoned_pages.


A                                       B
memory_failure
TestSetPageHWPoison(p);
if (PageHuge(p))
        nr_pages = 1 << compound_order(hpage);
else
        nr_pages = 1;
atomic_long_add(nr_pages, &num_poisoned_pages);
                                        unpoison_memory
                                        nr_pages = 1 << compound_trans_order(page);
                                        if (TestClearPageHWPoison(p))
                                                atomic_long_sub(nr_pages, &num_poisoned_pages);
lock page
if (!PageHWPoison(p))
        unlock page and return
hwpoison_user_mappings
if (PageTransHuge(hpage))
        split_huge_page(hpage);

Suggested-by: Naoya Horiguchi 
Signed-off-by: Wanpeng Li 
---
 mm/memory-failure.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5a4f4d6..a6c4752 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1339,6 +1339,16 @@ int unpoison_memory(unsigned long pfn)
return 0;
}
 
+   /*
+* unpoison_memory() can encounter thp only when the thp is being
+* worked by memory_failure() and the page lock is not held yet.
+* In such case, we yield to memory_failure() and make unpoison fail.
+*/
+   if (PageTransHuge(page)) {
+   pr_info("MCE: Memory failure is now running on %#lx\n", pfn);
+   return 0;
+   }
+
nr_pages = 1 << compound_order(page);
 
if (!get_page_unless_zero(page)) {
-- 
1.8.1.2


[PATCH v3 6/8] mm/hwpoison: drop forward reference declarations __soft_offline_page()

2013-08-25 Thread Wanpeng Li

Drop the forward declaration of __soft_offline_page() by moving
soft_offline_page() after its definition.

Reviewed-by: Naoya Horiguchi 
Signed-off-by: Wanpeng Li 
---
 mm/memory-failure.c | 128 ++--
 1 file changed, 63 insertions(+), 65 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index f357c91..ca714ac 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1511,71 +1511,6 @@ static int soft_offline_huge_page(struct page *page, int flags)
return ret;
 }
 
-static int __soft_offline_page(struct page *page, int flags);
-
-/**
- * soft_offline_page - Soft offline a page.
- * @page: page to offline
- * @flags: flags. Same as memory_failure().
- *
- * Returns 0 on success, otherwise negated errno.
- *
- * Soft offline a page, by migration or invalidation,
- * without killing anything. This is for the case when
- * a page is not corrupted yet (so it's still valid to access),
- * but has had a number of corrected errors and is better taken
- * out.
- *
- * The actual policy on when to do that is maintained by
- * user space.
- *
- * This should never impact any application or cause data loss,
- * however it might take some time.
- *
- * This is not a 100% solution for all memory, but tries to be
- * ``good enough'' for the majority of memory.
- */
-int soft_offline_page(struct page *page, int flags)
-{
-   int ret;
-   unsigned long pfn = page_to_pfn(page);
-   struct page *hpage = compound_trans_head(page);
-
-   if (PageHWPoison(page)) {
-   pr_info("soft offline: %#lx page already poisoned\n", pfn);
-   return -EBUSY;
-   }
-   if (!PageHuge(page) && PageTransHuge(hpage)) {
-   if (PageAnon(hpage) && unlikely(split_huge_page(hpage))) {
-   pr_info("soft offline: %#lx: failed to split THP\n",
-   pfn);
-   return -EBUSY;
-   }
-   }
-
-   ret = get_any_page(page, pfn, flags);
-   if (ret < 0)
-   return ret;
-   if (ret) { /* for in-use pages */
-   if (PageHuge(page))
-   ret = soft_offline_huge_page(page, flags);
-   else
-   ret = __soft_offline_page(page, flags);
-   } else { /* for free pages */
-   if (PageHuge(page)) {
-   set_page_hwpoison_huge_page(hpage);
-   dequeue_hwpoisoned_huge_page(hpage);
-   atomic_long_add(1 << compound_order(hpage),
-   &num_poisoned_pages);
-   } else {
-   SetPageHWPoison(page);
-   atomic_long_inc(&num_poisoned_pages);
-   }
-   }
-   unset_migratetype_isolate(page, MIGRATE_MOVABLE);
-   return ret;
-}
-
 static int __soft_offline_page(struct page *page, int flags)
 {
int ret;
@@ -1662,3 +1597,66 @@ static int __soft_offline_page(struct page *page, int flags)
}
return ret;
 }
+
+/**
+ * soft_offline_page - Soft offline a page.
+ * @page: page to offline
+ * @flags: flags. Same as memory_failure().
+ *
+ * Returns 0 on success, otherwise negated errno.
+ *
+ * Soft offline a page, by migration or invalidation,
+ * without killing anything. This is for the case when
+ * a page is not corrupted yet (so it's still valid to access),
+ * but has had a number of corrected errors and is better taken
+ * out.
+ *
+ * The actual policy on when to do that is maintained by
+ * user space.
+ *
+ * This should never impact any application or cause data loss,
+ * however it might take some time.
+ *
+ * This is not a 100% solution for all memory, but tries to be
+ * ``good enough'' for the majority of memory.
+ */
+int soft_offline_page(struct page *page, int flags)
+{
+   int ret;
+   unsigned long pfn = page_to_pfn(page);
+   struct page *hpage = compound_trans_head(page);
+
+   if (PageHWPoison(page)) {
+   pr_info("soft offline: %#lx page already poisoned\n", pfn);
+   return -EBUSY;
+   }
+   if (!PageHuge(page) && PageTransHuge(hpage)) {
+   if (PageAnon(hpage) && unlikely(split_huge_page(hpage))) {
+   pr_info("soft offline: %#lx: failed to split THP\n",
+   pfn);
+   return -EBUSY;
+   }
+   }
+
+   ret = get_any_page(page, pfn, flags);
+   if (ret < 0)
+   return ret;
+   if (ret) { /* for in-use pages */
+   if (PageHuge(page))
+   ret = soft_offline_huge_page(page, flags);
+   else
+   ret = __soft_offline_page(page, flags);
+   } else { /* for free pages */
+   if (PageHuge(page)) {
+   set_page_hwpoison_huge_page(hpage);
+   dequeue_hwpoisoned_huge_page(hpage);
+   atomic_long_add(1 <<

Re: [PATCH v2] kernel/padata.c: share code between CPU_ONLINE and CPU_DOWN_FAILED, same to CPU_DOWN_PREPARE and CPU_UP_CANCELED

2013-08-25 Thread Chen Gang

On 08/23/2013 06:47 PM, Herbert Xu wrote:
> On Fri, Aug 23, 2013 at 12:44:48PM +0200, Steffen Klassert wrote:
>> On Thu, Aug 22, 2013 at 02:43:37PM +0800, Chen Gang wrote:
>>> Share code between CPU_ONLINE and CPU_DOWN_FAILED, same to
>>> CPU_DOWN_PREPARE and CPU_UP_CANCELED.
>>>
>>> It will fix 2 bugs:
>>>
>>>   "not check the return value of __padata_remove_cpu() and 
>>> __padata_add_cpu()".
>>>   "need add 'break' between CPU_UP_CANCELED and CPU_DOWN_FAILED".
>>>
>>>
>>> Signed-off-by: Chen Gang 
>>
>> This looks ok to me, Herbert can you take this one?
> 
> Sure.
> 

Thank you all.

> Thanks,
> 

Thanks.
-- 
Chen Gang

Re: [PATCH v2 2/3] gpio: pcf857x: Remove pdata argument to pcf857x_irq_domain_init()

2013-08-25 Thread Kuninori Morimoto


Hi

> > The argument is not used, remove it. No board registers a pcf857x device
> > with an IRQ without specifying platform data, IRQ domain registration
> > behaviour is thus not affected by this change.
> >
> > Signed-off-by: Laurent Pinchart 
> 
> Patch applied, unless Kuninori has some objections.

Acked-by: Kuninori Morimoto 

[PATCH RESEND] dma: sh: remove unnecessary platform_set_drvdata()

2013-08-25 Thread Jingoo Han

The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to clear it manually in the
driver.

Signed-off-by: Jingoo Han 
Acked-by: Simon Horman 
---
 drivers/dma/sh/shdmac.c |3 ---
 drivers/dma/sh/sudmac.c |2 --
 2 files changed, 5 deletions(-)

diff --git a/drivers/dma/sh/shdmac.c b/drivers/dma/sh/shdmac.c
index c7faded..282e93f 100755
--- a/drivers/dma/sh/shdmac.c
+++ b/drivers/dma/sh/shdmac.c
@@ -882,7 +882,6 @@ rst_err:
pm_runtime_put(&pdev->dev);
pm_runtime_disable(&pdev->dev);
 
-   platform_set_drvdata(pdev, NULL);
shdma_cleanup(&shdev->shdma_dev);
 eshdma:
synchronize_rcu();
@@ -911,8 +910,6 @@ static int sh_dmae_remove(struct platform_device *pdev)
sh_dmae_chan_remove(shdev);
shdma_cleanup(&shdev->shdma_dev);
 
-   platform_set_drvdata(pdev, NULL);
-
synchronize_rcu();
 
return 0;
diff --git a/drivers/dma/sh/sudmac.c b/drivers/dma/sh/sudmac.c
index 3c7d2b8..bf85b8e 100755
--- a/drivers/dma/sh/sudmac.c
+++ b/drivers/dma/sh/sudmac.c
@@ -391,7 +391,6 @@ static int sudmac_probe(struct platform_device *pdev)
 chan_probe_err:
sudmac_chan_remove(su_dev);
 
-   platform_set_drvdata(pdev, NULL);
shdma_cleanup(&su_dev->shdma_dev);
 
return err;
@@ -405,7 +404,6 @@ static int sudmac_remove(struct platform_device *pdev)
dma_async_device_unregister(dma_dev);
sudmac_chan_remove(su_dev);
shdma_cleanup(&su_dev->shdma_dev);
-   platform_set_drvdata(pdev, NULL);
 
return 0;
 }
-- 
1.7.10.4



Re: [PATCH 3/3] thinkpad_acpi: Wire unused micmute LED to capslock

2013-08-25 Thread Jason A. Donenfeld

On Fri, Aug 23, 2013 at 8:18 PM, Henrique de Moraes Holschuh
 wrote:
> NACK.  This we won't do.  It is a LED misuse, and it will get in the way
> when we finally put that LED to its proper use.

Agreed. Please see my response to mjg.

[PATCH 1/2] input: ti_am335x_tsc: Enable shared IRQ for TSC

2013-08-25 Thread Zubair Lutfullah

Enable shared IRQ so that the ADC can share the IRQ line from the
parent MFD core. Only FIFO0 IRQs belong to the TSC and are handled
on the TSC side.

Previously the step mask was only updated from a cached variable.
In rare cases when both TSC and ADC are used, the cached
variable gets mixed up.
The step mask is now written with the required mask every time.
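
As a rough illustration of the split described above, here is a simplified
userspace model in which each handler acknowledges only its own FIFO bits and
reports IRQ_HANDLED only once no flag remains pending. All model_* names and
the bit positions are hypothetical, and write-1-to-clear behaviour of the
status register is an assumption; this is not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real bit definitions live in ti_am335x_tscadc.h. */
#define MODEL_FIFO0_THRES (1u << 0) /* owned by the touchscreen (TSC) */
#define MODEL_FIFO1_THRES (1u << 1) /* owned by the ADC */

enum model_irqreturn { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

static uint32_t model_irqstatus; /* stands in for REG_IRQSTATUS */

/* Acknowledge only the bits in `mine`, then report whether anything is left. */
static enum model_irqreturn model_handle(uint32_t mine)
{
	uint32_t status = model_irqstatus;

	assert(mine != 0);
	if (!(status & mine))
		return MODEL_IRQ_NONE; /* not our interrupt */

	model_irqstatus &= ~(status & mine); /* models an assumed write-1-to-clear */

	/* HANDLED only if no flag is still pending for the other user of the line. */
	return model_irqstatus == 0 ? MODEL_IRQ_HANDLED : MODEL_IRQ_NONE;
}

static enum model_irqreturn model_tsc_irq(void)
{
	return model_handle(MODEL_FIFO0_THRES);
}

static enum model_irqreturn model_adc_irq(void)
{
	return model_handle(MODEL_FIFO1_THRES);
}
```

With both flags pending, model_tsc_irq() clears only the FIFO0 flag and
returns MODEL_IRQ_NONE; a following model_adc_irq() clears the rest and
returns MODEL_IRQ_HANDLED.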

Rachna Patil (TI) laid ground work for shared IRQ.

Signed-off-by: Zubair Lutfullah 
---
 drivers/input/touchscreen/ti_am335x_tsc.c |   24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/input/touchscreen/ti_am335x_tsc.c b/drivers/input/touchscreen/ti_am335x_tsc.c
index e1c5300..4124e580 100644
--- a/drivers/input/touchscreen/ti_am335x_tsc.c
+++ b/drivers/input/touchscreen/ti_am335x_tsc.c
@@ -52,6 +52,7 @@ struct titsc {
u32 config_inp[4];
u32 bit_xp, bit_xn, bit_yp, bit_yn;
u32 inp_xp, inp_xn, inp_yp, inp_yn;
+   u32 step_mask;
 };
 
 static unsigned int titsc_readl(struct titsc *ts, unsigned int reg)
@@ -196,7 +197,8 @@ static void titsc_step_config(struct titsc *ts_dev)
 
/* The steps1 … end and bit 0 for TS_Charge */
stepenable = (1 << (end_step + 2)) - 1;
-   am335x_tsc_se_set(ts_dev->mfd_tscadc, stepenable);
+   ts_dev->step_mask = stepenable;
+   am335x_tsc_se_set(ts_dev->mfd_tscadc, ts_dev->step_mask);
 }
 
 static void titsc_read_coordinates(struct titsc *ts_dev,
@@ -260,6 +262,10 @@ static irqreturn_t titsc_irq(int irq, void *dev)
unsigned int fsm;
 
status = titsc_readl(ts_dev, REG_IRQSTATUS);
+   /*
+* ADC and touchscreen share the IRQ line.
+* FIFO1 interrupts are used by ADC. Handle FIFO0 IRQs here only
+*/
if (status & IRQENB_FIFO0THRES) {
 
titsc_read_coordinates(ts_dev, , , , );
@@ -315,11 +321,17 @@ static irqreturn_t titsc_irq(int irq, void *dev)
}
 
if (irqclr) {
-   titsc_writel(ts_dev, REG_IRQSTATUS, irqclr);
-   am335x_tsc_se_update(ts_dev->mfd_tscadc);
-   return IRQ_HANDLED;
+   titsc_writel(ts_dev, REG_IRQSTATUS, (status | irqclr));
+   am335x_tsc_se_set(ts_dev->mfd_tscadc, ts_dev->step_mask);
}
-   return IRQ_NONE;
+
+   /* If any IRQ flags left, return none. So ADC can handle its IRQs */
+   status = titsc_readl(ts_dev, REG_IRQSTATUS);
+   if (status == false)
+   return IRQ_HANDLED;
+   else
+   return IRQ_NONE;
+
 }
 
 static int titsc_parse_dt(struct platform_device *pdev,
@@ -389,7 +401,7 @@ static int titsc_probe(struct platform_device *pdev)
}
 
err = request_irq(ts_dev->irq, titsc_irq,
- 0, pdev->dev.driver->name, ts_dev);
+ IRQF_SHARED, pdev->dev.driver->name, ts_dev);
if (err) {
dev_err(&pdev->dev, "failed to allocate irq.\n");
goto err_free_mem;
-- 
1.7.9.5


[PATCH 2/2] iio: ti_am335x_adc: Add continuous sampling support

2013-08-25 Thread Zubair Lutfullah

Previously the driver had only one-shot reading functionality.
This patch adds triggered buffer support to the driver.

Continuous sampling starts when the buffer is enabled.
Samples are pushed to userspace by the trigger, which
fires automatically at every hardware interrupt raised
when FIFO1 fills with samples up to the threshold value.

It is userspace's responsibility to stop sampling by writing zero
to the buffer enable file.
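
To make the data flow concrete, here is a small userspace sketch of draining
a FIFO snapshot into per-scan sample groups. It is not the handler from this
patch: model_drain_fifo() and MODEL_DATA_MASK are hypothetical names, the
12-bit mask is an assumption, and this variant deliberately pushes only
complete scans.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_DATA_MASK 0xfffu /* assumed 12-bit sample field */

/*
 * Copy `count` FIFO words into `out`, `scan_len` samples per scan.
 * Only complete scans are copied; returns the number of scans pushed.
 * `out` must have room for (count / scan_len) * scan_len samples.
 */
static size_t model_drain_fifo(const uint32_t *fifo, size_t count,
			       size_t scan_len, uint16_t *out)
{
	size_t scans = 0;

	assert(scan_len > 0);
	for (size_t k = 0; k + scan_len <= count; k += scan_len) {
		for (size_t i = 0; i < scan_len; i++)
			out[scans * scan_len + i] =
				(uint16_t)(fifo[k + i] & MODEL_DATA_MASK);
		scans++;
	}
	return scans;
}
```

With five words in the FIFO and two channels per scan, two scans are pushed
and the fifth word is left for the next interrupt.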

Patil Rachna (TI) laid the ground work for ADC HW register access.
Russ Dill (TI) fixed bugs in the driver relevant to FIFOs and IRQs.

I fixed channel scanning so multiple ADC channels can be read
simultaneously and pushed to userspace.
Restructured the driver to fit IIO ABI.
And added trigger support.

Signed-off-by: Zubair Lutfullah 
Acked-by: Greg Kroah-Hartman 
Signed-off-by: Russ Dill 
---
 drivers/iio/adc/ti_am335x_adc.c  |  254 +++---
 include/linux/mfd/ti_am335x_tscadc.h |   13 ++
 2 files changed, 246 insertions(+), 21 deletions(-)

diff --git a/drivers/iio/adc/ti_am335x_adc.c b/drivers/iio/adc/ti_am335x_adc.c
index a952538..ae2202b 100644
--- a/drivers/iio/adc/ti_am335x_adc.c
+++ b/drivers/iio/adc/ti_am335x_adc.c
@@ -28,12 +28,20 @@
 #include 
 
 #include 
+#include 
+#include 
+#include 
+#include 
 
 struct tiadc_device {
struct ti_tscadc_dev *mfd_tscadc;
int channels;
u8 channel_line[8];
u8 channel_step[8];
+   int irq;
+   int buffer_en_ch_steps;
+   struct iio_trigger *trig;
+   u32 *data;
 };
 
 static unsigned int tiadc_readl(struct tiadc_device *adc, unsigned int reg)
@@ -56,10 +64,11 @@ static u32 get_adc_step_mask(struct tiadc_device *adc_dev)
return step_en;
 }
 
-static void tiadc_step_config(struct tiadc_device *adc_dev)
+static void tiadc_step_config(struct iio_dev *indio_dev)
 {
+   struct tiadc_device *adc_dev = iio_priv(indio_dev);
unsigned int stepconfig;
-   int i, steps;
+   int i, steps, chan;
 
/*
 * There are 16 configurable steps and 8 analog input
@@ -72,11 +81,13 @@ static void tiadc_step_config(struct tiadc_device *adc_dev)
 */
 
steps = TOTAL_STEPS - adc_dev->channels;
-   stepconfig = STEPCONFIG_AVG_16 | STEPCONFIG_FIFO1;
+   if (iio_buffer_enabled(indio_dev))
+   stepconfig = STEPCONFIG_AVG_16 | STEPCONFIG_FIFO1
+   | STEPCONFIG_MODE_SWCNT;
+   else
+   stepconfig = STEPCONFIG_AVG_16 | STEPCONFIG_FIFO1;
 
for (i = 0; i < adc_dev->channels; i++) {
-   int chan;
-
chan = adc_dev->channel_line[i];
tiadc_writel(adc_dev, REG_STEPCONFIG(steps),
stepconfig | STEPCONFIG_INP(chan));
@@ -85,7 +96,175 @@ static void tiadc_step_config(struct tiadc_device *adc_dev)
adc_dev->channel_step[i] = steps;
steps++;
}
+}
+
+static irqreturn_t tiadc_irq(int irq, void *private)
+{
+   struct iio_dev *indio_dev = private;
+   struct tiadc_device *adc_dev = iio_priv(indio_dev);
+   unsigned int status, config;
+   status = tiadc_readl(adc_dev, REG_IRQSTATUS);
+
+   /*
+* ADC and touchscreen share the IRQ line.
+* FIFO0 interrupts are used by TSC. Handle FIFO1 IRQs here only
+*/
+   if (status & IRQENB_FIFO1OVRRUN) {
+   /* FIFO Overrun. Clear flag. Disable/Enable ADC to recover */
+   config = tiadc_readl(adc_dev, REG_CTRL);
+   config &= ~(CNTRLREG_TSCSSENB);
+   tiadc_writel(adc_dev, REG_CTRL, config);
+   tiadc_writel(adc_dev, REG_IRQSTATUS, IRQENB_FIFO1OVRRUN
+   | IRQENB_FIFO1UNDRFLW | IRQENB_FIFO1THRES);
+   tiadc_writel(adc_dev, REG_CTRL, (config | CNTRLREG_TSCSSENB));
+   } else if (status & IRQENB_FIFO1THRES) {
+   /* Trigger to push FIFO data to iio buffer */
+   tiadc_writel(adc_dev, REG_IRQCLR, IRQENB_FIFO1THRES);
+   iio_trigger_poll(indio_dev->trig, iio_get_time_ns());
+   } else
+   return IRQ_NONE;
+
+   /* If any IRQ flags left, return none. So TSC can handle its IRQs */
+   status = tiadc_readl(adc_dev, REG_IRQSTATUS);
+   if (status == false)
+   return IRQ_HANDLED;
+   else
+   return IRQ_NONE;
+}
+
+static irqreturn_t tiadc_trigger_h(int irq, void *p)
+{
+   struct iio_poll_func *pf = p;
+   struct iio_dev *indio_dev = pf->indio_dev;
+   struct tiadc_device *adc_dev = iio_priv(indio_dev);
+   int i, k, fifo1count, read;
+   u32 *data = adc_dev->data;
+
+   fifo1count = tiadc_readl(adc_dev, REG_FIFO1CNT);
+   for (k = 0; k < fifo1count; k = k + i) {
+   for (i = 0; i < (indio_dev->scan_bytes)/4; i++) {
+   read = tiadc_readl(adc_dev, REG_FIFO1);
+   data[i] = read & FIFOREAD_DATA_MASK;
+   }
+

[PATCH V6 0/2] iio: input: ti_am335x_adc: Add continuous sampling support

2013-08-25 Thread Zubair Lutfullah

This applies to togreg branch in iio.

Round 6 updates
Fixed trigger the way iio list wanted. Driver has its own trigger.
Triggers at every FIFO Threshold IRQ and pushes samples to iio buffer.

Went through the driver and cleaned it up quite a bit.

Squashed patches together instead of having multiple tiny patches.

Zubair Lutfullah (2):
  input: ti_am335x_tsc: Enable shared IRQ for TSC
  iio: ti_am335x_adc: Add continuous sampling support

 drivers/iio/adc/ti_am335x_adc.c   |  254 ++---
 drivers/input/touchscreen/ti_am335x_tsc.c |   24 ++-
 include/linux/mfd/ti_am335x_tscadc.h  |   13 ++
 3 files changed, 264 insertions(+), 27 deletions(-)

-- 
1.7.9.5


Re: Asus F5RL laptop unable to resume from S3 because of radeon module

2013-08-25 Thread Ondrej Zary

On Sunday 25 August 2013 19:12:32 Ondrej Zary wrote:
> On Sunday 25 August 2013 16:51:06 Deucher, Alexander wrote:
> > > -Original Message-
> > > From: Ondrej Zary [mailto:li...@rainbow-software.org]
> > > Sent: Friday, August 23, 2013 1:55 PM
> > > To: Deucher, Alexander
> > > Cc: Kernel development list
> > > Subject: Re: Asus F5RL laptop unable to resume from S3 because of
> > > radeon module
> > >
> > > On Friday 23 August 2013 00:08:33 Ondrej Zary wrote:
> > > > On Thursday 22 August 2013 22:56:03 Ondrej Zary wrote:
> > > > > On Thursday 22 August 2013 22:24:17 Deucher, Alexander wrote:
> > > > > > > -Original Message-
> > > > > > > From: Ondrej Zary [mailto:li...@rainbow-software.org]
> > > > > > > Sent: Thursday, August 22, 2013 4:00 PM
> > > > > > > To: Deucher, Alexander
> > > > > > > Cc: Kernel development list
> > > > > > > Subject: Re: Asus F5RL laptop unable to resume from S3 because
> > > > > > > of radeon module
> > > > > > >
> > > > > > > On Thursday 22 August 2013 21:49:41 Deucher, Alexander wrote:
> > > > > > > > > -Original Message-
> > > > > > > > > From: Ondrej Zary [mailto:li...@rainbow-software.org]
> > > > > > > > > Sent: Thursday, August 22, 2013 2:18 PM
> > > > > > > > > To: Kernel development list
> > > > > > > > > Cc: Deucher, Alexander
> > > > > > > > > Subject: Asus F5RL laptop unable to resume from S3 because
> > > > > > > > > of radeon module
> > > > > > > > >
> > > > > > > > > Hello,
> > > > > > > > > resume from suspend-to-RAM (S3) on Asus F5RL laptop does
> > > > > > > > > not work. According to many reports found by Google, it has
> > > > > > > > > always been like that and there is no fix or workaround.
> > > > > > > >
> > > > > > > > Make sure your kernel has this patch:
> > > > > > > > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=cef1d00cd56f600121ad121875655ad410a001b8
> > > > > > >
> > > > > > > Just tried latest git (3.11-rc6+) and the problem persists.
> > > > > >
> > > > > > You might try adding a quirk for your system in
> > > > > > radeon_combios_asic_init() in radeon_combios.c.  You can try
> > > > > > something like this for testing:
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/radeon/radeon_combios.c b/drivers/gpu/drm/radeon/radeon_combios.c
> > > > > > index 68ce360..0419a2c 100644
> > > > > > --- a/drivers/gpu/drm/radeon/radeon_combios.c
> > > > > > +++ b/drivers/gpu/drm/radeon/radeon_combios.c
> > > > > > @@ -3398,6 +3398,8 @@ void radeon_combios_asic_init(struct drm_device *dev)
> > > > > >  rdev->pdev->subsystem_device == 0x30ae)
> > > > > >  return;
> > > > > >
> > > > > > +   return;
> > > > > > +
> > > > > >  /* DYN CLK 1 */
> > > > > >  table = combios_get_table_offset(dev, COMBIOS_DYN_CLK_1_TABLE);
> > > > > >  if (table)
> > > > > >
> > > > > > If that doesn't work, you'll probably have to track down where
> > > > > > it's hanging during resume, or compare registers before and after
> > > > > > resume to see if it's some particular state that's causing a problem.
> > > > >
> > > > > No change.
> > > > >
> > > > > Inserted "return -1;" before radeon_device_init() to
> > > > > radeon_driver_load_kms() - driver fails to load and resume works.
> > > > > Moved it (and changed to "r = -1; goto out;") a bit down before
> > > > > radeon_modeset_init() - driver fails to load and resume stopped
> > > > > working.
> > > >
> > > > Going deeper... it works before rs400_startup() and does not work
> > > > after that. Will continue later.
> > >
> > > Tracked the problem down to rs400_gart_enable(). When this "Disable AGP
> > > mode" code is commented out, the machine resumes fine:
> > >
> > >         if ((rdev->family == CHIP_RS690) || (rdev->family == CHIP_RS740)) {
> > >                 WREG32_MC(RS480_MC_MISC_CNTL,
> > >                           (RS480_GART_INDEX_REG_EN | RS690_BLOCK_GFX_D3_EN));
> > >         } else {
> > >                 WREG32_MC(RS480_MC_MISC_CNTL, RS480_GART_INDEX_REG_EN);
> > >         }
> >
> > Does the driver work properly after resume with that part commented out
> > or does it just avoid the hang?
>
> Console seems to work fine, haven't tested X, 3D or video.

Testing it right now - everything seems to work. X, accelerated video 
(mplayer), 3D (glxgears). Both before and after suspend.
That code does something with GART so maybe I should test something like 
OpenArena (have to download it first).

> > Alex
> >
> > > > > > Alex
> > > > > >
> > > > > > > > Alex
> > > > > > > >
> > > > > > > > > Did some tests:
> > > > > > > > >
> > > > > > > > > radeon module loaded (usual state):
> > > > > > > > > After "echo mem>/sys/power/state", the laptop suspends
> > >
> > > correctly
> > >
> > > > > > > (power
> > > > > > >
> > > > > > > > > LED
> > > > > > > > > blinks). When power button is pressed, power LED goes on
> > > > > >

Re: [PATCH 07/15] Input: gameport: convert bus code to use drv_groups

2013-08-25 Thread Greg Kroah-Hartman

On Sat, Aug 24, 2013 at 04:33:18PM -0700, Dmitry Torokhov wrote:
> On Fri, Aug 23, 2013 at 02:24:33PM -0700, Greg Kroah-Hartman wrote:
> > The drv_attrs field of struct bus_type is going away soon, drv_groups
> > should be used instead.  This converts the gameport bus code to use the
> > correct field.
> > 
> > Cc: Dmitry Torokhov 
> > Signed-off-by: Greg Kroah-Hartman 
> 
> Acked-by: Dmitry Torokhov 
> 
> I assume you'll be taking the entire series through your tree?

Yes, I can easily do that, thanks.

greg k-h

Re: [PATCH] driver core / ACPI: Avoid device removal locking problems

2013-08-25 Thread Greg Kroah-Hartman

On Sun, Aug 25, 2013 at 10:09:47PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> There are two mutexes, device_hotplug_lock and acpi_scan_lock, held
> around the acpi_bus_trim() call in acpi_scan_hot_remove() which
> generally removes devices (it removes ACPI device objects at least,
> but it may also remove "physical" device objects through .detach()
> callbacks of ACPI scan handlers).  Thus, potentially, device sysfs
> attributes are removed under these locks and to remove those
> attributes it is necessary to hold the s_active references of their
> directory entries for writing.
> 
> On the other hand, the execution of a .show() or .store() callback
> from a sysfs attribute is carried out with that attribute's s_active
> reference held for reading.  Consequently, if any device sysfs
> attribute that may be removed from within acpi_scan_hot_remove()
> through acpi_bus_trim() has a .store() or .show() callback which
> acquires either acpi_scan_lock or device_hotplug_lock, the execution
> of that callback may deadlock with the removal of the attribute.
> [Unfortunately, the "online" device attribute of CPUs and memory
> blocks and the "eject" attribute of ACPI device objects are affected
> by this issue.]
> 
> To avoid those deadlocks introduce a new protection mechanism that
> can be used by the device sysfs attributes in question.  Namely,
> if a device sysfs attribute's .store() or .show() callback routine
> is about to acquire device_hotplug_lock or acpi_scan_lock, it can
> first execute read_lock_device_remove() and return an error code if
> that function returns false.  If true is returned, the lock in
> question may be acquired and read_unlock_device_remove() must be
> called.  [This mechanism is implemented by means of an additional
> rwsem in drivers/base/core.c.]
> 
> Make the affected sysfs attributes in the driver core and ACPI core
> use read_lock_device_remove() and read_unlock_device_remove() as
> described above.
> 
> Signed-off-by: Rafael J. Wysocki 
> Reported-by: Gu Zheng 

Acked-by: Greg Kroah-Hartman 

Re: [PATCHv3 02/10] clk: sunxi: fix initialization of basic clocks

2013-08-25 Thread Mike Turquette

Quoting Maxime Ripard (2013-08-04 02:47:29)
> From: Emilio López 
> 
> With the recent move towards CLK_OF_DECLARE(...), the driver stopped
> initializing osc32k, which is compatible "fixed-clock". This is because
> we never called of_clk_init(NULL). Fix this by moving the only other
> simple clock (osc24M) to use CLK_OF_DECLARE(...) and call of_clk_init(NULL)
> to initialize both of them.
> 
> Signed-off-by: Emilio López 
> Signed-off-by: Maxime Ripard 
> Cc: Mike Turquette 

Taken into clk-next.

Regards,
Mike

> ---
>  drivers/clk/sunxi/clk-sunxi.c | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/clk/sunxi/clk-sunxi.c b/drivers/clk/sunxi/clk-sunxi.c
> index db1c45b..567963f 100644
> --- a/drivers/clk/sunxi/clk-sunxi.c
> +++ b/drivers/clk/sunxi/clk-sunxi.c
> @@ -69,6 +69,7 @@ static void __init sunxi_osc_clk_setup(struct device_node 
> *node)
> clk_register_clkdev(clk, clk_name, NULL);
> }
>  }
> +CLK_OF_DECLARE(sunxi_osc, "allwinner,sun4i-osc-clk", sunxi_osc_clk_setup);
>  
>  
>  
> @@ -422,12 +423,6 @@ static void __init sunxi_gates_clk_setup(struct 
> device_node *node,
> of_clk_add_provider(node, of_clk_src_onecell_get, clk_data);
>  }
>  
> -/* Matches for of_clk_init */
> -static const __initconst struct of_device_id clk_match[] = {
> -   {.compatible = "allwinner,sun4i-osc-clk", .data = 
> sunxi_osc_clk_setup,},
> -   {}
> -};
> -
>  /* Matches for factors clocks */
>  static const __initconst struct of_device_id clk_factors_match[] = {
> {.compatible = "allwinner,sun4i-pll1-clk", .data = _data,},
> @@ -482,8 +477,8 @@ static void __init of_sunxi_table_clock_setup(const 
> struct of_device_id *clk_mat
>  
>  void __init sunxi_init_clocks(void)
>  {
> -   /* Register all the simple sunxi clocks on DT */
> -   of_clk_init(clk_match);
> +   /* Register all the simple and basic clocks on DT */
> +   of_clk_init(NULL);
>  
> /* Register factor clocks */
> of_sunxi_table_clock_setup(clk_factors_match, 
> sunxi_factors_clk_setup);
> -- 
> 1.8.3.4

Re: [PATCH V2] PCI: exynos: add support for MSI

2013-08-25 Thread Arnd Bergmann

On Friday 23 August 2013, Thierry Reding wrote:
> > > +   if (IS_ENABLED(CONFIG_PCI_MSI)) {
> > > +   if (of_property_read_u32(np, "msi-base", 
> > > >msi_irq_start)) {
> > > +   dev_err(pp->dev, "Failed to parse the number of 
> > > lanes\n");
> > > +   return -EINVAL;
> > > +   }
> > > +   }
> > > +
> > 
> > What if an implementor want to use irq_domain method for msi_irq_start
> > allocation? Is it fine to return error if msi-base is not passed from
> > dt?
> 
> I agree. This should be using an IRQ domain to represent the MSI
> controller. Both Tegra and Marvell drivers do that already and if Exynos
> can follow that same path it will increase the chances of refactoring
> common bits.
> 
> Also the error message doesn't quite match up with what the code is
> doing. =)

Agreed. Besides, encoding "base" irq values in the device tree is always
wrong, since the numbers are not at all a hardware property but an
implementation detail of how Linux currently uses interrupt numbers. When
using DT probing, you *have* to use IRQ domains, and the rest of Exynos
does that.

Arnd

Re: [PATCH v2] vfs: Tighten up linkat(..., AT_EMPTY_PATH)

2013-08-25 Thread Linus Torvalds

On Sun, Aug 25, 2013 at 1:06 PM, Al Viro  wrote:
>
> Timestamp updates, chmod/chown, xattr mess...

Ok, so that's just too many details.

So I'll just go back to square one, and wonder if we could/should just
make the rule be that in order to be in that LAST_BIND case, you
really have to have f_cred match your own credentials. Or have
CAP_SEARCH.

And just not have any new LOOKUP_xyz flags at all. No special cases,
no different semantics for different ops, just check f_cred. Because
if you had the permissions to do the original open (ie f_cred matches
your current credentials), then that shows that you originally had all
the pathname permissions, and you are still the same person.

And yeah, you may have opened it for reading (so the file is
technically read-only), but obviously we're re-doing all the inode
permission checks anyway, so the only thing /proc//fd/ really
gave you was the path traversal. So we shouldn't bother with "the file
descriptor is only readable", because that is simply *irrelevant*.

So it means that the case Andi brought up (truncating or
open-for-write a fd that we only had open for reading) would continue
to be allowed, because while it "sounds odd", there is no actual
problem.

And CAP_SEARCH is very much about that path lookup again. So it's
consistent with the notion that "ok, you may do odd things to file
descriptors through /proc, but we check that you cannot avoid the
pathname lookup rules".

And then we do exactly the same to flink(). So then we're all
consistent again. Not the consistency Andy worried about, but that's
the consistency that the security worries with flink are all
about. Because the issues with the "use the file descriptor, not the
path to the file descriptor" really are *not* about the endpoint
itself (since we will re-do the permission check for that particular
inode anyway), but about the path leading up to that end-point.
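
A toy userspace model of that rule, purely for illustration: struct
model_cred, struct model_file and model_may_follow_fd_link() are hypothetical
names, "matching" credentials are reduced to equal uids, and the cap_search
flag simply mirrors the "CAP_SEARCH" wording above. Real credentials carry
far more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified credentials; real struct cred has many more fields. */
struct model_cred {
	unsigned int uid;
	bool cap_search; /* named after the "CAP_SEARCH" wording above */
};

struct model_file {
	const struct model_cred *f_cred; /* credentials captured at open() time */
};

/* Simplification: two credentials "match" when the uids are equal. */
static bool model_cred_match(const struct model_cred *a,
			     const struct model_cred *b)
{
	return a->uid == b->uid;
}

/* May `cur` end a path lookup on this open file (the LAST_BIND case)? */
static bool model_may_follow_fd_link(const struct model_file *file,
				     const struct model_cred *cur)
{
	assert(file != NULL && file->f_cred != NULL && cur != NULL);
	return model_cred_match(file->f_cred, cur) || cur->cap_search;
}
```

In this model the opener passes, a different uid without the capability is
refused, and a different uid with it passes; the inode permission checks
would still be redone afterwards in every case.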

Linus

Re: [PATCH v2] vfs: Tighten up linkat(..., AT_EMPTY_PATH)

2013-08-25 Thread Al Viro

On Sun, Aug 25, 2013 at 12:57:25PM -0700, Linus Torvalds wrote:

> Yes. I think we should do this, but I think we should also look at
> what _other_ LOOKUP_xyz we should do for the /proc case.
> 
> For the read-only fd case, we should have a LOOKUP_WRITE flag, and
> return -EPERM if an operation is a write, and we terminate in that
> LAST_BIND case.
> 
> That would catch the truncate() case, but also the "open a read-only
> fd for write or O_TRUNC" case.
> 
> Anything else? What other path operations matter that follow links
> than truncate(), link() and open()?

Timestamp updates, chmod/chown, xattr mess...

Re: /proc/pid/fd && anon_inode_fops

2013-08-25 Thread Linus Torvalds

On Sun, Aug 25, 2013 at 12:48 PM, Oleg Nesterov  wrote:
>
> Or pid_revalidate(), but my concern is task_dumpable() logic.
>
> pid_revalidate() does inode->i_*id = GLOBAL_ROOT_*ID if task_dumpable()
> fails, but it can fail simply because ->mm = NULL.
>
> This means that almost everything in /proc/zombie-pid/ becomes root.
> Doesn't really hurt, but for what? Looks a bit strange imho.

The zombie case shouldn't be relevant, because a zombie will have
closed all the file descriptors anyway, so they no longer exist.

That said, task_dumpable isn't wonderful, and I suspect we could drop
that logic entirely in the tid-fd case if we just use f_cred.

The reason we have task_dumpable is exactly because we use the task
credentials, and they may not really be relevant to the file
credentials. IOW, it's there to protect against execve'ing a suid
program that opens some protected file and then setuid()'s back to the
original user after having done the critical stuff.  But
file->f_cred is exactly about the credentials at the time of the open,
so it should make things like that irrelevant.

Linus

[PATCH] driver core / ACPI: Avoid device removal locking problems

2013-08-25 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

There are two mutexes, device_hotplug_lock and acpi_scan_lock, held
around the acpi_bus_trim() call in acpi_scan_hot_remove() which
generally removes devices (it removes ACPI device objects at least,
but it may also remove "physical" device objects through .detach()
callbacks of ACPI scan handlers).  Thus, potentially, device sysfs
attributes are removed under these locks and to remove those
attributes it is necessary to hold the s_active references of their
directory entries for writing.

On the other hand, the execution of a .show() or .store() callback
from a sysfs attribute is carried out with that attribute's s_active
reference held for reading.  Consequently, if any device sysfs
attribute that may be removed from within acpi_scan_hot_remove()
through acpi_bus_trim() has a .store() or .show() callback which
acquires either acpi_scan_lock or device_hotplug_lock, the execution
of that callback may deadlock with the removal of the attribute.
[Unfortunately, the "online" device attribute of CPUs and memory
blocks and the "eject" attribute of ACPI device objects are affected
by this issue.]

To avoid those deadlocks introduce a new protection mechanism that
can be used by the device sysfs attributes in question.  Namely,
if a device sysfs attribute's .store() or .show() callback routine
is about to acquire device_hotplug_lock or acpi_scan_lock, it can
first execute read_lock_device_remove() and return an error code if
that function returns false.  If true is returned, the lock in
question may be acquired and read_unlock_device_remove() must be
called.  [This mechanism is implemented by means of an additional
rwsem in drivers/base/core.c.]

Make the affected sysfs attributes in the driver core and ACPI core
use read_lock_device_remove() and read_unlock_device_remove() as
described above.

Signed-off-by: Rafael J. Wysocki 
Reported-by: Gu Zheng 
---

For 3.12, applies on top of linux-pm.git/linux-next.

Thanks,
Rafael

---
 drivers/acpi/scan.c|8 
 drivers/base/core.c|   29 +
 include/linux/device.h |4 
 3 files changed, 41 insertions(+)

Index: linux-pm/drivers/base/core.c
===
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -408,7 +408,11 @@ static ssize_t show_online(struct device
 {
bool val;
 
+   if (!read_lock_device_remove())
+   return -EBUSY;
+
lock_device_hotplug();
+   read_unlock_device_remove();
val = !dev->offline;
unlock_device_hotplug();
return sprintf(buf, "%u\n", val);
@@ -424,7 +428,11 @@ static ssize_t store_online(struct devic
if (ret < 0)
return ret;
 
+   if (!read_lock_device_remove())
+   return -EBUSY;
+
lock_device_hotplug();
+   read_unlock_device_remove();
ret = val ? device_online(dev) : device_offline(dev);
unlock_device_hotplug();
return ret < 0 ? ret : count;
@@ -1479,8 +1487,29 @@ EXPORT_SYMBOL_GPL(put_device);
 EXPORT_SYMBOL_GPL(device_create_file);
 EXPORT_SYMBOL_GPL(device_remove_file);
 
+static DECLARE_RWSEM(device_remove_rwsem);
 static DEFINE_MUTEX(device_hotplug_lock);
 
+bool __must_check read_lock_device_remove(void)
+{
+   return down_read_trylock(&device_remove_rwsem);
+}
+
+void read_unlock_device_remove(void)
+{
+   up_read(&device_remove_rwsem);
+}
+
+void device_remove_begin(void)
+{
+   down_write(&device_remove_rwsem);
+}
+
+void device_remove_end(void)
+{
+   up_write(&device_remove_rwsem);
+}
+
 void lock_device_hotplug(void)
 {
mutex_lock(&device_hotplug_lock);
Index: linux-pm/include/linux/device.h
===
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -893,6 +893,10 @@ static inline bool device_supports_offli
return dev->bus && dev->bus->offline && dev->bus->online;
 }
 
+extern bool __must_check read_lock_device_remove(void);
+extern void read_unlock_device_remove(void);
+extern void device_remove_begin(void);
+extern void device_remove_end(void);
 extern void lock_device_hotplug(void);
 extern void unlock_device_hotplug(void);
 extern int device_offline(struct device *dev);
Index: linux-pm/drivers/acpi/scan.c
===
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -288,6 +288,7 @@ static void acpi_bus_device_eject(void *
struct acpi_scan_handler *handler;
u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE;
 
+   device_remove_begin();
mutex_lock(&acpi_scan_lock);
 
acpi_bus_get_device(handle, );
@@ -315,6 +316,7 @@ static void acpi_bus_device_eject(void *
 
  out:
mutex_unlock(&acpi_scan_lock);
+   device_remove_end();
return;
 
  err_out:
@@ -449,6 +451,7 @@ void acpi_bus_hot_remove_device(void *co
acpi_handle handle =

Re: [PATCH v2] vfs: Tighten up linkat(..., AT_EMPTY_PATH)

2013-08-25 Thread Linus Torvalds

On Sat, Aug 24, 2013 at 8:37 PM, Al Viro  wrote:
>
> FWIW, I'm tempted to try the following trick:
> * introduce FMODE_FLINK in file->f_mode; O_TMPFILE would set it,
> unless O_EXCL is present.
> * introduce LOOKUP_LINK, to be passed by sys_linkat() when
> resolving the target.

[ .. snipped .. ]

Yes. I think we should do this, but I think we should also look at
what _other_ LOOKUP_xyz we should do for the /proc case.

For the read-only fd case, we should have a LOOKUP_WRITE flag, and
return -EPERM if an operation is a write, and we terminate in that
LAST_BIND case.

That would catch the truncate() case, but also the "open a read-only
fd for write or O_TRUNC" case.

Anything else? What other path operations matter that follow links
than truncate(), link() and open()?

Linus

Re: /proc/pid/fd && anon_inode_fops

2013-08-25 Thread Oleg Nesterov

Cough. I am going off-topic again, but I can't resist...

On 08/25, Linus Torvalds wrote:
>
> Look at the code that creates the fd stat information, for example.
> It's in tid_fd_revalidate(), and it really doesn't make much sense to
> use the task credentials for it.

Or pid_revalidate(), but my concern is task_dumpable() logic.

pid_revalidate() does inode->i_*id = GLOBAL_ROOT_*ID if task_dumpable()
fails, but it can fail simply because ->mm = NULL.

This means that almost everything in /proc/zombie-pid/ becomes root.
Doesn't really hurt, but for what? Looks a bit strange imho.

Oleg.


Re: Unusually high system CPU usage with recent kernels

2013-08-25 Thread Tibor Billes

From: Paul E. McKenney Sent: 08/24/13 11:03 PM
> On Sat, Aug 24, 2013 at 09:59:45PM +0200, Tibor Billes wrote:
> > From: Paul E. McKenney Sent: 08/22/13 12:09 AM
> > > On Wed, Aug 21, 2013 at 11:05:51PM +0200, Tibor Billes wrote:
> > > > > From: Paul E. McKenney Sent: 08/21/13 09:12 PM
> > > > > On Wed, Aug 21, 2013 at 08:14:46PM +0200, Tibor Billes wrote:
> > > > > > > From: Paul E. McKenney Sent: 08/20/13 11:43 PM
> > > > > > > On Tue, Aug 20, 2013 at 10:52:26PM +0200, Tibor Billes wrote:
> > > > > > > > > From: Paul E. McKenney Sent: 08/20/13 04:53 PM
> > > > > > > > > On Tue, Aug 20, 2013 at 08:01:28AM +0200, Tibor Billes wrote:
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > I was using the 3.9.7 stable release and tried to upgrade 
> > > > > > > > > > to the 3.10.x series.
> > > > > > > > > > The 3.10.x series was showing unusually high (>75%) system 
> > > > > > > > > > CPU usage in some
> > > > > > > > > > situations, making things really slow. The latest stable I 
> > > > > > > > > > tried is 3.10.7.
> > > > > > > > > > I also tried 3.11-rc5, they both show this behaviour. This 
> > > > > > > > > > behaviour doesn't
> > > > > > > > > > show up when the system is idling, only when doing some CPU 
> > > > > > > > > > intensive work,
> > > > > > > > > > like compiling with multiple threads. Compiling with only 
> > > > > > > > > > one thread seems not
> > > > > > > > > > to trigger this behaviour.
> > > > > > > > > > 
> > > > > > > > > > To be more precise I did a `perf record -a` while compiling 
> > > > > > > > > > a large C++ program
> > > > > > > > > > with scons using 4 threads, the result is appended at the 
> > > > > > > > > > end of this email.
> > > > > > > > > 
> > > > > > > > > New one on me! You are running a mainstream system (x86_64), 
> > > > > > > > > so I am
> > > > > > > > > surprised no one else noticed.
> > > > > > > > > 
> > > > > > > > > Could you please send along your .config file?
> > > > > > > > 
> > > > > > > > Here it is
> > > > > > > 
> > > > > > > Interesting. I don't see RCU stuff all that high on the list, but
> > > > > > > the items I do see lead me to suspect RCU_FAST_NO_HZ, which has 
> > > > > > > some
> > > > > > > relevance to the otherwise inexplicable group of commits you 
> > > > > > > located
> > > > > > > with your bisection. Could you please rerun with 
> > > > > > > CONFIG_RCU_FAST_NO_HZ=n?
> > > > > > > 
> > > > > > > If that helps, there are some things I could try.
> > > > > > 
> > > > > > It did help. I didn't notice anything unusual when running with 
> > > > > > CONFIG_RCU_FAST_NO_HZ=n.
> > > > > 
> > > > > Interesting. Thank you for trying this -- and we at least have a
> > > > > short-term workaround for this problem. I will put a patch together
> > > > > for further investigation.
> > > > 
> > > > I don't specifically need this config option so I'm fine without it in
> > > > the long term, but I guess it's not supposed to behave like that.
> > > 
> > > OK, good, we have a long-term workaround for your specific case,
> > > even better. ;-)
> > > 
> > > But yes, there are situations where RCU_FAST_NO_HZ needs to work
> > > a bit better. I hope you will bear with me with a bit more
> > > testing...
> > >
> > > > > In the meantime, could you please tell me how you were measuring
> > > > > performance for your kernel builds? Wall-clock time required to 
> > > > > complete
> > > > > one build? Number of builds completed per unit time? Something else?
> > > > 
> > > > Actually, I wasn't all this sophisticated. I have a system monitor
> > > > applet on my top panel (using MATE, Linux Mint), four little graphs,
> > > > one of which shows CPU usage. Different colors indicate different kind
> > > > of CPU usage. Blue shows user space usage, red shows system usage, and
> > > > two more for nice and iowait. During a normal compile it's almost
> > > > completely filled with blue user space CPU usage, only the top few
> > > > pixels show some iowait and system usage. With CONFIG_RCU_FAST_NO_HZ
> > > > set, about 3/4 of the graph was red system CPU usage, the rest was
> > > > blue. So I just looked for a pile of red on my graphs when I tested
> > > > different kernel builds. But also compile speed was horrible I couldn't
> > > > wait for the build to finish. Even the UI got unresponsive.
> > > 
> > > We have been having problems with CPU accounting, but this one looks
> > > quite real.
> > > 
> > > > Now I did some measuring. In the normal case a compile finished in 36
> > > > seconds, compiled 315 object files. Here are some output lines from
> > > > dstat -tasm --vm during compile:
> > > > system total-cpu-usage -dsk/total- -net/total- ---paging-- ---system-- swap--- --memory-usage- -virtual-memory
> > > >     time     |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw | used  free| used  buff  cach  free|majpf minpf alloc  free
> > > > 21-08 21:48:05|

[PATCH V4] gpio: New driver for LSI ZEVIO SoCs

2013-08-25 Thread Fabian Vogt

This driver supports the GPIO controller found in LSI ZEVIO SoCs.
It has been successfully tested on a TI nspire CX calculator.
---
 .../devicetree/bindings/gpio/gpio-zevio.txt|  18 ++
 drivers/gpio/Kconfig   |   6 +
 drivers/gpio/Makefile  |   1 +
 drivers/gpio/gpio-zevio.c  | 214 +
 4 files changed, 239 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gpio/gpio-zevio.txt
 create mode 100644 drivers/gpio/gpio-zevio.c

diff --git a/Documentation/devicetree/bindings/gpio/gpio-zevio.txt 
b/Documentation/devicetree/bindings/gpio/gpio-zevio.txt
new file mode 100644
index 000..892f953
--- /dev/null
+++ b/Documentation/devicetree/bindings/gpio/gpio-zevio.txt
@@ -0,0 +1,18 @@
+Zevio GPIO controller
+
+Required properties:
+- compatible = "lsi,zevio-gpio"
+- reg = 
+- #gpio-cells = <2>
+- gpio-controller;
+
+Optional:
+- #ngpios = <32>: Number of GPIOs. Defaults to 32 if absent
+
+Example:
+   gpio: gpio@9000 {
+   compatible = "lsi,zevio-gpio";
+   reg = <0x9000 0x1000>;
+   gpio-controller;
+   #gpio-cells = <2>;
+   };
diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index b2450ba..49f8c62 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -138,6 +138,12 @@ config GPIO_EP93XX
depends on ARCH_EP93XX
select GPIO_GENERIC
 
+config GPIO_ZEVIO
+   bool "LSI ZEVIO SoC memory mapped GPIOs"
+   depends on OF
+   help
+ Say yes here to support the GPIO controller in LSI ZEVIO SoCs.
+
 config GPIO_MM_LANTIQ
bool "Lantiq Memory mapped GPIOs"
depends on LANTIQ && SOC_XWAY
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index ef3e983..b70cb1b 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -87,3 +87,4 @@ obj-$(CONFIG_GPIO_WM831X) += gpio-wm831x.o
 obj-$(CONFIG_GPIO_WM8350)  += gpio-wm8350.o
 obj-$(CONFIG_GPIO_WM8994)  += gpio-wm8994.o
 obj-$(CONFIG_GPIO_XILINX)  += gpio-xilinx.o
+obj-$(CONFIG_GPIO_ZEVIO)   += gpio-zevio.o
diff --git a/drivers/gpio/gpio-zevio.c b/drivers/gpio/gpio-zevio.c
new file mode 100644
index 000..35a254b
--- /dev/null
+++ b/drivers/gpio/gpio-zevio.c
@@ -0,0 +1,214 @@
+/*
+ * GPIO controller in LSI ZEVIO SoCs.
+ *
+ * Author: Fabian Vogt 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Memory layout:
+ * This chip has four gpio sections, each controls 8 GPIOs.
+ * Bit 0 in section 0 is GPIO 0, bit 2 in section 1 is GPIO 10.
+ * Disclaimer: Reverse engineered!
+ * For more information refer to:
+ * 
http://hackspire.unsads.com/wiki/index.php/Memory-mapped_I/O_ports#9000_-_General_Purpose_I.2FO_.28GPIO.29
+ *
+ * 0x00-0x3F: Section 0
+ * +0x00: Masked interrupt status (read-only)
+ * +0x04: R: Interrupt status W: Reset interrupt status
+ * +0x08: R: Interrupt mask W: Mask interrupt
+ * +0x0C: W: Unmask interrupt (write-only)
+ * +0x10: Direction: I/O=1/0
+ * +0x14: Output
+ * +0x18: Input (read-only)
+ * +0x20: R: Level interrupt W: Set as level interrupt
+ * 0x40-0x7F: Section 1
+ * 0x80-0xBF: Section 2
+ * 0xC0-0xFF: Section 3
+ */
+
+#define ZEVIO_GPIO_SECTION_SIZE             0x40
+
+#define ZEVIO_GPIO_INT_MASKED_STATUS_OFFSET 0x00
+#define ZEVIO_GPIO_INT_STATUS_OFFSET        0x04
+#define ZEVIO_GPIO_INT_UNMASK_OFFSET        0x08
+#define ZEVIO_GPIO_INT_MASK_OFFSET          0x0C
+#define ZEVIO_GPIO_DIRECTION_OFFSET         0x10
+#define ZEVIO_GPIO_OUTPUT_OFFSET            0x14
+#define ZEVIO_GPIO_INPUT_OFFSET             0x18
+#define ZEVIO_GPIO_INT_STICKY_OFFSET        0x20
+
+#define to_zevio_gpio(chip) container_of(to_of_mm_gpio_chip(chip), \
+   struct zevio_gpio, chip)
+
+/* Bit of GPIO in section */
+#define ZEVIO_GPIO_BIT(gpio) (gpio&7)
+/* Offset to section of GPIO relative to base */
+#define ZEVIO_GPIO_SECTION_OFFSET(gpio) (((gpio>>3)&3)*ZEVIO_GPIO_SECTION_SIZE)
+/* Address of register, which is responsible for given GPIO */
+#define ZEVIO_GPIO(cntrlr, gpio, reg) IOMEM(cntrlr->chip.regs + \
+   ZEVIO_GPIO_SECTION_OFFSET(gpio) + ZEVIO_GPIO_##reg##_OFFSET)
+
+struct zevio_gpio {
+   spinlock_t  lock;
+   struct of_mm_gpio_chip  chip;
+};
+
+/* Functions for struct gpio_chip */
+static int zevio_gpio_get(struct gpio_chip *chip, unsigned pin)
+{
+   struct zevio_gpio *controller = to_zevio_gpio(chip);
+
+   /* Only reading allowed, so no spinlock needed */
+   u32 val = readl(ZEVIO_GPIO(controller, pin, INPUT));
+
+   return (val >> ZEVIO_GPIO_BIT(pin)) & 0x1;
+}
+
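
The section/bit arithmetic behind the ZEVIO_GPIO_* macros in the patch above
can be checked in isolation. The following is a stand-alone sketch of the same
arithmetic using plain integers instead of __iomem pointers; the zevio_*
function names and the base address used in the usage note are illustrative
assumptions, not part of the driver.

```c
#include <stdint.h>

#define SECTION_SIZE	0x40u	/* bytes per 8-GPIO section, as in the patch */
#define INPUT_OFFSET	0x18u	/* "Input (read-only)" register offset */

/* Bit position of a GPIO inside its section: 8 GPIOs per section. */
static inline unsigned int zevio_bit(unsigned int gpio)
{
	return gpio & 7u;
}

/* Byte offset of the section that holds this GPIO (4 sections). */
static inline unsigned int zevio_section_offset(unsigned int gpio)
{
	return ((gpio >> 3) & 3u) * SECTION_SIZE;
}

/* Address of a given register for a given GPIO, relative to base. */
static inline uintptr_t zevio_reg(uintptr_t base, unsigned int gpio,
				  unsigned int reg_offset)
{
	return base + zevio_section_offset(gpio) + reg_offset;
}

/* Extract the state of one GPIO from a value read from its register. */
static inline int zevio_extract(uint32_t regval, unsigned int gpio)
{
	return (regval >> zevio_bit(gpio)) & 0x1;
}
```

For GPIO 10 this yields bit 2 in section 1 (offset 0x40), matching the
"bit 2 in section 1 is GPIO 10" statement in the driver comment. With a
hypothetical base of 0x1000, the input register for GPIO 31 sits at
0x1000 + 0xC0 + 0x18.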

Re: [PATCH] mmc: omap_hsmmc: clear status flags before starting a new command

2013-08-25 Thread Chris Ball

Hi,

On Sun, Aug 25 2013, Balaji T K wrote:
> On Saturday 29 June 2013 11:55 AM, Francesco Lavra wrote:
>> Commit 1f6b9fa40e76fffaaa0b3bd6a0bfdcf1cdc06efa consolidated writes to
>> the STAT register in one location, moving them from omap_hsmmc_do_irq()
>> to omap_hsmmc_irq(). This move has the unwanted side effect that the
>> controller status flags are potentially cleared after a new command has
>> been started as a consequence of reading the previous status flags.
>> This means that if the new command changes the status flags before the
>> IRQ routine returns, those flags may be cleared without handling the
>> event which asserted them, and thus missing the event.
>> Move the writing of the STAT register back in omap_hsmmc_do_irq(),
>> before handling the status flags which generated the interrupt.
>>
>> Signed-off-by: Francesco Lavra 
>
> Reviewed and Tested-by: Balaji T K 

Thanks, pushed to mmc-next for 3.12.

- Chris.
-- 
Chris Ball  

Re: [PATCH] mmc: omap_hsmmc: clear status flags before starting a new command

2013-08-25 Thread Balaji T K


On Saturday 29 June 2013 11:55 AM, Francesco Lavra wrote:

Commit 1f6b9fa40e76fffaaa0b3bd6a0bfdcf1cdc06efa consolidated writes to
the STAT register in one location, moving them from omap_hsmmc_do_irq()
to omap_hsmmc_irq(). This move has the unwanted side effect that the
controller status flags are potentially cleared after a new command has
been started as a consequence of reading the previous status flags.
This means that if the new command changes the status flags before the
IRQ routine returns, those flags may be cleared without handling the
event which asserted them, and thus missing the event.
Move the writing of the STAT register back in omap_hsmmc_do_irq(),
before handling the status flags which generated the interrupt.

Signed-off-by: Francesco Lavra 


Reviewed and Tested-by: Balaji T K 


---
  drivers/mmc/host/omap_hsmmc.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index eccedc7..301cb7e 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -1041,6 +1041,7 @@ static void omap_hsmmc_do_irq(struct omap_hsmmc_host 
*host, int status)
}
}

+   OMAP_HSMMC_WRITE(host->base, STAT, status);
if (end_cmd || ((status & CC_EN) && host->cmd))
omap_hsmmc_cmd_done(host, host->cmd);
if ((end_trans || (status & TC_EN)) && host->mrq)
@@ -1060,7 +1061,6 @@ static irqreturn_t omap_hsmmc_irq(int irq, void *dev_id)
omap_hsmmc_do_irq(host, status);

/* Flush posted write */
-   OMAP_HSMMC_WRITE(host->base, STAT, status);
status = OMAP_HSMMC_READ(host->base, STAT);
}
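
Why the order of the STAT write matters can be shown with a toy model. The
sketch below is not driver code: it assumes a write-1-to-clear status
register, a single "command complete" bit, and commands that complete the
moment they are issued. All toy_* names are made up for illustration.

```c
#include <stdint.h>

#define CC_EN 0x1u	/* "command complete" bit in the toy STAT register */

struct toy_host {
	uint32_t stat;		/* write-1-to-clear status bits */
	int pending_cmds;	/* commands not yet issued */
	int events_handled;	/* completions the handler has seen */
};

/* Writing a 1 to a status bit clears it. */
static void toy_write_stat(struct toy_host *h, uint32_t bits)
{
	h->stat &= ~bits;
}

/* Issue the next command; in this model it completes immediately. */
static void toy_start_cmd(struct toy_host *h)
{
	h->pending_cmds--;
	h->stat |= CC_EN;
}

static void toy_do_irq(struct toy_host *h, uint32_t status, int ack_first)
{
	if (ack_first)
		toy_write_stat(h, status);	/* order restored by the patch */

	if (status & CC_EN) {
		h->events_handled++;
		if (h->pending_cmds > 0)
			toy_start_cmd(h);	/* may set CC_EN again */
	}

	if (!ack_first)
		toy_write_stat(h, status);	/* wipes the new CC_EN too */
}

/* Run ncmds commands through the IRQ loop; return completions handled. */
int toy_run(int ack_first, int ncmds)
{
	struct toy_host h = { 0, ncmds, 0 };
	uint32_t status;

	if (ncmds <= 0)
		return 0;

	toy_start_cmd(&h);
	status = h.stat;
	while (status) {
		toy_do_irq(&h, status, ack_first);
		status = h.stat;
	}
	return h.events_handled;
}
```

With the acknowledgement done first, toy_run(1, 2) handles both completions.
With the acknowledgement done after the handler, toy_run(0, 2) handles only
one, because the completion of the second command is cleared before it is
ever read.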




Re: [PATCH] acpi: blacklist win8 OSI for buggy laptops

2013-08-25 Thread Rafael J. Wysocki

On Sunday, August 25, 2013 12:37:33 PM Felipe Contreras wrote:
> Since v3.7 the acpi backlight driver doesn't work correctly in several
> machines because ACPI code has different code for Windows 8, and the
> rest.
> 
> The commit ea45ea7 (in v3.11-rc2) tried to fix this problem by using the
> intel backlight driver, however it introduced several other issues in
> different machines.
> 
> This patch fixes both regressions by blacklisting the win8 OSI, so we
> are back to v3.6 behavior, and it should remain that way until the intel
> backlight driver is fixed.
> 
> Since v3.7, users have been forced to fix the initial regression by
> modifying the boot arguments (acpi_osi="!Windows 2012").
> 
> Once the Intel backlight driver works correctly for all machines, this
> blacklist can be removed and that driver can be used instead.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=60682
> 
> Signed-off-by: Felipe Contreras 

Queued up for 3.12, but I've already applied your previous patch for the Asus
Zenbook, so I needed to rebase this one.  Please check the bleeding-edge branch
of the linux-pm.git tree for the result.

Also, next time you fix problems reported by other people, please add
Reported-by tags to the patch to give credit to the reporters.

Thanks,
Rafael


> ---
>  drivers/acpi/blacklist.c | 35 +++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/drivers/acpi/blacklist.c b/drivers/acpi/blacklist.c
> index cb96296..42cccbe 100644
> --- a/drivers/acpi/blacklist.c
> +++ b/drivers/acpi/blacklist.c
> @@ -192,6 +192,12 @@ static int __init dmi_disable_osi_win7(const struct 
> dmi_system_id *d)
>   acpi_osi_setup("!Windows 2009");
>   return 0;
>  }
> +static int __init dmi_disable_osi_win8(const struct dmi_system_id *d)
> +{
> + printk(KERN_NOTICE PREFIX "DMI detected: %s\n", d->ident);
> + acpi_osi_setup("!Windows 2012");
> + return 0;
> +}
>  
>  static struct dmi_system_id acpi_osi_dmi_table[] __initdata = {
>   {
> @@ -269,6 +275,35 @@ static struct dmi_system_id acpi_osi_dmi_table[] 
> __initdata = {
>   },
>  
>   /*
> +  * The following machines have broken backlight support when reporting
> +  * the Windows 2012 OSI, so disable it until their support is fixed.
> +  */
> + {
> + .callback = dmi_disable_osi_win8,
> + .ident = "ASUS Zenbook Prime UX31A",
> + .matches = {
> +  DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> +  DMI_MATCH(DMI_PRODUCT_NAME, "UX31A"),
> + },
> + },
> + {
> + .callback = dmi_disable_osi_win8,
> + .ident = "Dell Inspiron 15R SE",
> + .matches = {
> +  DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
> +  DMI_MATCH(DMI_PRODUCT_NAME, "Inspiron 7520"),
> + },
> + },
> + {
> + .callback = dmi_disable_osi_win8,
> + .ident = "Lenovo ThinkPad Edge E530",
> + .matches = {
> +  DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
> +  DMI_MATCH(DMI_PRODUCT_VERSION, "3259A2G"),
> + },
> + },
> +
> + /*
>* BIOS invocation of _OSI(Linux) is almost always a BIOS bug.
>* Linux ignores it, except for the machines enumerated below.
>*/
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH V3] gpio: New driver for LSI ZEVIO SoCs

2013-08-25 Thread Fabian Vogt


Hi,


On 08/07/2013 06:53 AM, Fabian Vogt wrote:

This driver supports the GPIO controller found in LSI ZEVIO SoCs.
It has been successfully tested on a TI nspire CX calculator.


diff --git a/Documentation/devicetree/bindings/gpio/gpio-zevio.txt  
b/Documentation/devicetree/bindings/gpio/gpio-zevio.txt



+Zevio GPIO controller
+
+Required properties:
+- compatible = "lsi,zevio-gpio"


Is there only one zevio chip, or a series? Is "zevio" the full name of
the chip, including any version number?

We don't know; it's a relabeled chip with
TI-NSPIRE / L9A0702 / TI-NS2006A-0 / LSI LOGIC / ZEVIO / U 0714 / WYJ14052-1
on it. But this driver should match the other drivers (lsi,zevio-intc,
lsi,zevio-timer).



+- reg = 
+- #gpio-cells = <2>
+- gpio-controller;
+
+Optional:
+- #ngpios = <32>: Number of GPIOs. Defaults to 32 if absent


Perhaps one can derive that from the compatible value? The fact this
property exists implies there's more than one zevio chip, so perhaps
each should have an explicit compatible value described above?
I added it just for someone who maybe needs it. It's only two lines and maybe
it'll be helpful for someone. We don't know whether this or a similar
controller exists in different configurations (pin count, section size,
register layout).

Also I hate hardcoded values which require a recompile to change..


Is the GPIO block not also an interrupt source/controller? I see the
following in the patch, and references to some IRQ registers...


+   select GENERIC_IRQ_CHIP
I forgot to remove this line after testing the interrupts; the tests went
horribly (hard lockups)...


V4 should be underway soon.

Bye,
Fabian

Re: [PATCH tip/core/rcu 1/5] rcu: Add duplicate-callback tests to rcutorture

2013-08-25 Thread Paul E. McKenney

On Sat, Aug 24, 2013 at 03:25:36PM -0400, Mathieu Desnoyers wrote:
> * Paul E. McKenney (paul...@linux.vnet.ibm.com) wrote:
> [...]
> > The result is as follows.  Better?
> 
> Hi Paul,
> 
> Pitching in late in the thread, so that I can get a share of the fun ;-)
> 
> > Thanx, Paul
> > 
> > #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
> > static void rcu_torture_leak_cb(struct rcu_head *rhp)
> > {
> > }
> > 
> > static void rcu_torture_err_cb(struct rcu_head *rhp)
> > {
> > /*
> >  * This -might- happen due to race conditions, but is unlikely.
> >  * The scenario that leads to this happening is that the
> >  * first of the pair of duplicate callbacks is queued,
> >  * someone else starts a grace period that includes that
> >  * callback, then the second of the pair must wait for the
> >  * next grace period.  Unlikely, but can happen.  If it
> >  * does happen, the debug-objects subsystem won't have splatted.
> >  */
> > pr_alert("rcutorture: duplicated callback was invoked.\n");
> > }
> > #endif /* #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD */
> > 
> 
> Hrm. Putting an #ifdef within a function when not utterly needed is
> usually a bad idea. How about:
> 
> /*
>  * Verify that double-free causes debug-objects to complain, but only
>  * if CONFIG_DEBUG_OBJECTS_RCU_HEAD=y.  Otherwise, say that the test
>  * cannot be carried out.
>  */
> #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
> static void rcu_test_debug_objects(void)
> {
>   struct rcu_head rh1;
>   struct rcu_head rh2;
> 
>   init_rcu_head_on_stack(&rh1);
>   init_rcu_head_on_stack(&rh2);
>   pr_alert("rcutorture: WARN: Duplicate call_rcu() test starting.\n");
>   preempt_disable(); /* Prevent preemption from interrupting test. */
>   rcu_read_lock(); /* Make it impossible to finish a grace period. */
>   call_rcu(&rh1, rcu_torture_leak_cb); /* Start grace period. */
>   local_irq_disable(); /* Make it harder to start a new grace period. */
>   call_rcu(&rh2, rcu_torture_leak_cb);
>   call_rcu(&rh2, rcu_torture_err_cb); /* Duplicate callback. */
>   local_irq_enable();
>   rcu_read_unlock();
>   preempt_enable();
>   rcu_barrier();
>   pr_alert("rcutorture: WARN: Duplicate call_rcu() test complete.\n");
>   destroy_rcu_head_on_stack(&rh1);
>   destroy_rcu_head_on_stack(&rh2);
> }
> #else /* #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD */
> static void rcu_test_debug_objects(void)
> {
>   pr_alert("rcutorture: !CONFIG_DEBUG_OBJECTS_RCU_HEAD, not testing 
> duplicate call_rcu()\n");
> }
> #endif /* #else #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD */

The objection to this is that it duplicates the function header, both
copies of which must be updated.  See Josh's and my discussion on this
point earlier in the thread.  Who knows, Josh might eventually convince
me to individually ifdef the functions in kernel/rcutree_plugin.h,
but I am not quite there yet.  ;-)

That said, I do see two benefits of individual ifdef:

1.  It is easy to see when a given function is included.
As it is, there are runs of many hundreds of lines of
code where it might not be obvious to the casual reader.

2.  Duplicated code is asking for silent rebase and merge errors.
This can happen when a change to the function header collides
with a change that duplicates the function under ifdef.

So perhaps I will eventually convert to the individual-ifdef style.
At the moment, I am in no hurry.
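
The two styles under discussion can be put side by side in a stand-alone
sketch. CONFIG_DEMO_FEATURE and both function names below are hypothetical
stand-ins (for CONFIG_DEBUG_OBJECTS_RCU_HEAD and rcu_test_debug_objects()),
and the bodies are reduced to a return value so that only the structure is
visible.

```c
#include <stdio.h>

/* Hypothetical option standing in for CONFIG_DEBUG_OBJECTS_RCU_HEAD. */
#define CONFIG_DEMO_FEATURE 1

/*
 * Style A: one function header, #ifdef inside the body.
 * The header exists once, so a signature change touches one place.
 */
int demo_test_inline_ifdef(void)
{
#ifdef CONFIG_DEMO_FEATURE
	return 1;	/* the test was carried out */
#else
	printf("!CONFIG_DEMO_FEATURE, not testing\n");
	return 0;
#endif
}

/*
 * Style B: individual #ifdef per function.
 * It is obvious when each variant is built, but the header is duplicated
 * and both copies must be kept in sync.
 */
#ifdef CONFIG_DEMO_FEATURE
int demo_test_split_ifdef(void)
{
	return 1;	/* the test was carried out */
}
#else /* #ifdef CONFIG_DEMO_FEATURE */
int demo_test_split_ifdef(void)
{
	printf("!CONFIG_DEMO_FEATURE, not testing\n");
	return 0;
}
#endif /* #else #ifdef CONFIG_DEMO_FEATURE */
```

Both variants behave identically for a given configuration; the trade-off is
purely about readability and about how a rebase or merge copes with a change
to the function header.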

> More comments inlined in the code below,
> 
> > /*
> >  * Verify that double-free causes debug-objects to complain, but only
> >  * if CONFIG_DEBUG_OBJECTS_RCU_HEAD=y.  Otherwise, say that the test
> >  * cannot be carried out.
> >  */
> > static void rcu_test_debug_objects(void)
> > {
> > #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
> > struct rcu_head rh1;
> > struct rcu_head rh2;
> > 
> > init_rcu_head_on_stack(&rh1);
> > init_rcu_head_on_stack(&rh2);
> > pr_alert("rcutorture: WARN: Duplicate call_rcu() test starting.\n");
> > preempt_disable(); /* Prevent preemption from interrupting test. */
> > rcu_read_lock(); /* Make it impossible to finish a grace period. */
> > call_rcu(&rh1, rcu_torture_leak_cb); /* Start grace period. */
> 
> Are we really "starting" a grace period ? If rcu_test_debug_objects() is
> executed after some callbacks are already queued, are we, by definition,
> "starting" the grace period ?
> 
> Also, I find it weird to have, in that order:
> 
> 1) preempt_disable()
> 2) rcu_read_lock()
> 3) local_irq_disable()
> 
> I would rather expect:
> 
> 1) rcu_read_lock()
> 2) preempt_disable()
> 3) local_irq_disable()
> 
> So they come in increasing order of impact on the system: with
> non-preemptable RCU, the read-lock and preempt disable mean the same
> thing, however, with preemptable RCU, the impact of preempt disable
> seems larger than the impact of RCU read lock: preemption is still
> enabled when within a RCU

Re: [PATCH] kernel/rcutree.c: deem to be lazy if there are no callbacks.

2013-08-25 Thread Paul E. McKenney

On Thu, Aug 22, 2013 at 11:01:53AM +0800, Chen Gang wrote:
> On 08/21/2013 10:23 PM, Paul E. McKenney wrote:
> > On Wed, Aug 21, 2013 at 01:59:29PM +0800, Chen Gang wrote:

[ . . . ]

> > Don't get me wrong, I do welcome appropriate patches.  In fact, if
> > you look at RCU's git history, you will see that I frequently accept
> > patches from a fair number of people.  And if you were willing to
> > invest some time and thought, you might eventually be able to generate
> > an appropriate (albeit low priority) patch to this function.  However,
> > you seem to be motivated to submit small patches with a minimum of
> > thought and preparation, perhaps because you need to meet some external
> > or self-imposed quota of accepted patches.  And if you are in fact driven
> > by a quota that prevents you from taking the time required to carefully
> > think things through, you are wasting your time with RCU.
> 
> Hmm... at least, some contents you said above is correct to me.
> 
> At least, I should provide 10 patches per month, it is a necessary
> basic requirement to me.

OK, that does help explain the otherwise inexplicable approach you have
been taking.  Let's see how you have been doing, based on committer date
in Linus's tree:

  1 2012-11
 15 2013-01
  7 2013-02
 20 2013-03
 21 2013-04
 12 2013-05
 17 2013-06
 10 2013-07

The last few months might be understated a bit due to patches
still being in maintainer trees.  This is a nice contrast from my
first impression of you from https://lkml.org/lkml/2013/6/9/64 and
https://lkml.org/lkml/2013/8/19/650, neither of which gave me any
reason to trust your work, to put it mildly.  And if I cannot trust
your work, I obviously cannot accept your patches.

You do seem to select for localized bug fixes, which require less work
than the performance-motivated patches you were putting forward earlier
in this thread.  With a localized bug, you demonstrate the bug, show the
fix, and that is that.  From what I can see, part of the problem with
your patches in this email thread is that you are trying to move from
localized bug fixes to performance issues without doing the additional
work required.  Please see below for a rough outline of this additional
work.

> And what my focus is efficiency: let appliers and maintainers together
> to provide contributes to outside with efficiency.

Sounds great, but there are many possible definitions of "efficiency".
Given your quota, I would expect your definition to involve number of
patches accepted.  In contrast, my definition for RCU instead involves
maintainability, robustness, scalability, and, for a few critical
code paths, performance.  I therefore need you to have thought through
and carefully tested your patch.

> If you already know about it, why need I continue ?  but if you don't
> know either, I should try.

What I need you to do in future RCU performance patch submissions is:

1.  Think through your patch and the code that it is modifying.
If you submit a patch to me, you should be able to answer the
sorts of questions that I was asking in this thread.

2.  Tell me which situations your patch helps and which it does not.

3.  Tell me how much your patch improves performance in the
situations where it helps.

4.  Test the code.  If it makes a measurable difference, present
the performance results.  (It would be very surprising if your
early-loop exit patch made a significant difference, especially
on a CONFIG_PREEMPT=n kernel.)

5.  Rather than randomly dropping into the code, use actual measurements
to determine where to focus your performance-improvement efforts.
Developers, even experienced ones, are really bad at guessing
where the most important performance problems are.

6.  Use your judgement.  For example, a 1000-line patch to improve a
slowpath by 0.1% simply isn't worth it.  A high risk of adding
bugs for a microscopic benefit?  Thanks, but no thanks!!!
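
The parenthetical in item 4 can be sanity-checked by counting iterations before any timing run. The following is a rough user-space sketch, not kernel code: the bool array merely stands in for the flavor list, and iterations_needed() is a hypothetical helper invented for illustration. It only counts loop trips and says nothing about the real cost of each trip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the per-flavor check: nonlazy[i] is true if
 * flavor i has at least one non-lazy callback queued.  Returns how many
 * loop iterations run before the answer is known.
 */
static size_t iterations_needed(const bool *nonlazy, size_t nflavors,
				bool early_exit)
{
	size_t iters = 0;

	assert(nonlazy != NULL);
	for (size_t i = 0; i < nflavors; i++) {
		iters++;
		if (early_exit && nonlazy[i])
			break;	/* Answer known: not all callbacks are lazy. */
	}
	return iters;
}
```

With three flavors and the non-lazy one first, an early exit saves two trips at most.  With two flavors and the non-lazy one last, it saves nothing, which is why measurements should come before the patch.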

For your patch https://lkml.org/lkml/2013/8/19/651, which was closest
of the three to being useful, here are some things about RCU that you
should have taken the time to learn -before- submitting the patch:

a.  Q:  How many iterations for the for_each_rcu_flavor() loop?
A:  On CONFIG_PREEMPT=n kernels, only two iterations.
On CONFIG_PREEMPT=y kernels, only three iterations.

b.  Q:  Which flavor of RCU is most likely to have non-lazy callbacks
queued?

A:  On CONFIG_PREEMPT=y kernels, the first one in the list.
For CONFIG_PREEMPT=n kernels, it is last in the list.
(In other words, for CONFIG_PREEMPT=n kernels, this change
won't help at all, at least not without also changing the
order of the list.)

c.  Q:  Do any of the other for_each_rcu_flavor() loops care what order
the flavors are in?

A:  No.  (In other

Re: /proc/pid/fd && anon_inode_fops

2013-08-25 Thread Andy Lutomirski

On Sun, Aug 25, 2013 at 11:32 AM, Linus Torvalds wrote:
> On Sat, Aug 24, 2013 at 10:23 PM, Al Viro wrote:
>>
>> We are really stuck with the current semantics here - switching to
>> *BSD one would not only mean serious surgery on descriptor handling
>> (it's one of the wartier areas in *BSD VFS, in large part because
>> of magic-open-really-a-dup kludges they have to do), it would change
>> a long-standing userland API that had been there for nearly 20 years
>> _and_ one that tends to be used in corner cases of hell knows how many
>> scripts.
>
> Actually, I'm pretty sure we did have the "dup" semantics at one point
> (long ago), and they were really nice (because you could use them to
> see where in the stream the fd was etc). It just fit so horribly badly
> into the VFS semantics that it got changed into the current "new file
> descriptor" one. Afaik, nothing broke.
>

We have fdinfo now, which is IMO much less scary.  Programs can find
the stream position, but they can't change it.  OTOH...

> So I'm not really sure about the "we're stuck with it" for semantic
> reasons, and it turns out that very few programs/scripts actually use
> /proc/<pid>/fd/ at all (random use of /dev/stdin is likely the
> most common case). But I agree about the "serious surgery on
> descriptor handling" part.

.../dev/stdin doesn't actually do what you expect if input comes from
something seekable.

$ cat /proc/self/fd/3
test
$ cat /proc/self/fd/3
test
$ cat /proc/self/fd/3
test
$ cat <&3
est
$ cat /proc/self/fd/3
test
$ cat <&3

(I'm not going to advocate for changing this.)
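
The difference shown in the session above can be reproduced from C. This is a minimal sketch under stated assumptions: it needs a Linux system with /proc mounted, the scratch file name is made up, and compare_offsets() is a hypothetical helper, not an existing API. It consumes one byte through a dup() of the descriptor and then compares the offset of the shared description with that of a fresh open of /proc/self/fd/<fd>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 0 on success and fills in both offsets, -1 on any failure
 * (for example when /proc is not mounted).
 */
static int compare_offsets(off_t *dup_off, off_t *reopen_off)
{
	char tmpl[] = "/tmp/fd-offset-XXXXXX";	/* hypothetical scratch file */
	char path[64];
	char c;
	int fd, dupfd = -1, procfd = -1, ret = -1;

	assert(dup_off != NULL && reopen_off != NULL);
	fd = mkstemp(tmpl);
	if (fd < 0)
		return -1;
	if (write(fd, "test\n", 5) != 5 || lseek(fd, 0, SEEK_SET) != 0)
		goto out;

	dupfd = dup(fd);	/* like "<&3": shares the open file description */
	if (dupfd < 0 || read(dupfd, &c, 1) != 1)
		goto out;

	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	procfd = open(path, O_RDONLY);	/* new description with its own offset */
	if (procfd < 0)
		goto out;

	*dup_off = lseek(fd, 0, SEEK_CUR);	/* moved by the read on dupfd */
	*reopen_off = lseek(procfd, 0, SEEK_CUR);	/* still at the start */
	ret = 0;
out:
	if (procfd >= 0)
		close(procfd);
	if (dupfd >= 0)
		close(dupfd);
	close(fd);
	unlink(tmpl);
	return ret;
}
```

On a system where the reopen behaves as in the session above, the dup side reports an offset of 1 while the reopened descriptor reports 0.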

--Andy

>
>   Linus



-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: /proc/pid/fd && anon_inode_fops

2013-08-25 Thread Al Viro

On Sun, Aug 25, 2013 at 11:32:45AM -0700, Linus Torvalds wrote:
> On Sat, Aug 24, 2013 at 10:23 PM, Al Viro wrote:
> >
> > We are really stuck with the current semantics here - switching to
> > *BSD one would not only mean serious surgery on descriptor handling
> > (it's one of the wartier areas in *BSD VFS, in large part because
> > of magic-open-really-a-dup kludges they have to do), it would change
> > a long-standing userland API that had been there for nearly 20 years
> > _and_ one that tends to be used in corner cases of hell knows how many
> > scripts.
> 
> Actually, I'm pretty sure we did have the "dup" semantics at one point
> (long ago), and they were really nice (because you could use them to
> see where in the stream the fd was etc). It just fit so horribly badly
> into the VFS semantics that it got changed into the current "new file
> descriptor" one. Afaik, nothing broke.
> 
> So I'm not really sure about the "we're stuck with it" for semantic
> reasons, and it turns out that very few programs/scripts actually use
> /proc/<pid>/fd/ at all (random use of /dev/stdin is likely the
> most common case). But I agree about the "serious surgery on
> descriptor handling" part.

Well...  We are actually in a better position for that these days;
right now we have very few instances of ->atomic_open(), so we could
change the calling conventions for it.  It returns 0 or -error and we
could turn that into NULL, ERR_PTR(-error) or a reference to already
opened struct file.  It's not _that_ far to propagate from that point -
atomic_open() <- lookup_open() <- do_last() <- path_openat().  So the amount
of surgery is nowhere near the horrors we used to need (and *BSD actually
does).
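
To make the proposed three-way return concrete, here is a user-space sketch of the pointer-or-error convention.  Everything with a toy_ prefix is hypothetical and written for illustration only; the kernel has its own ERR_PTR()/IS_ERR()/PTR_ERR() helpers, and this is not the actual ->atomic_open() signature.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095	/* top 4095 addresses encode -errno values */

struct toy_file { int fd; };	/* hypothetical stand-in for struct file */

static inline void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

static inline long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

enum toy_case { TOY_NOT_OPENED, TOY_FAILED, TOY_OPENED };

/*
 * Hypothetical callee using the three-way convention:
 *   NULL             nothing special happened (today's "return 0")
 *   toy_err_ptr(-E)  failure (today's "return -error")
 *   valid pointer    an already opened file to pass back up the chain
 */
static struct toy_file *toy_atomic_open(enum toy_case c,
					struct toy_file *opened)
{
	switch (c) {
	case TOY_FAILED:
		return toy_err_ptr(-ENOENT);
	case TOY_OPENED:
		return opened;
	default:
		return NULL;
	}
}

/* Caller side: returns -errno, 0 (nothing passed up) or 1 (file passed up). */
static long toy_lookup_open(enum toy_case c, struct toy_file *opened,
			    struct toy_file **result)
{
	struct toy_file *f = toy_atomic_open(c, opened);

	assert(result != NULL);
	*result = NULL;
	if (toy_is_err(f))
		return toy_ptr_err(f);
	if (f) {
		*result = f;
		return 1;
	}
	return 0;
}
```

Each caller in the chain only has to test for the error range first and for NULL second, which is what keeps the propagation short.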

We could try that, but I'm really afraid that semantics changes will break
stuff; worse yet, that it'll happen to stuff in dusty corners of random admin
scripts nobody is able to debug anymore ;-/

Re: /proc/pid/fd && anon_inode_fops

2013-08-25 Thread Linus Torvalds

On Sat, Aug 24, 2013 at 11:50 PM, Willy Tarreau wrote:
>
> Thanks for explaining Al, that really helps me understand. However
> there's still a difference between /proc/pid called from the process
> itself (=/proc/self) and called from other processes that seems to
> suit the situation :

/proc/self has magic special properties, as you noticed.

> Thus I'm wondering if something like this could help, the idea would be
> that with the appropriate mount option, a task could only look at its
> own descriptors unless it's running with privileges :

I'd much rather try to do it in general, and use "file->f_cred" more
aggressively for /proc/<pid>/fd/ security.

We don't use f_cred at all in /proc, but that's because /proc predates
that whole thing. So instead we use the credentials of the task when
we want to look at credentials of the file, because that was the
closest approximation we used to have.
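
The effect of using task credentials is visible from user space.  A small sketch, assuming Linux with /proc mounted; fd_link_owner() is a hypothetical helper, and the expectation that the owner follows the task's effective IDs comes from the behavior described here, not from a guarantee for any particular kernel version.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Report the owner that /proc shows for one of our own descriptors.
 * Returns 0 on success, -1 if the entry cannot be inspected.
 */
static int fd_link_owner(int fd, uid_t *uid, gid_t *gid)
{
	char path[64];
	struct stat st;

	assert(uid != NULL && gid != NULL);
	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	if (lstat(path, &st) != 0)	/* lstat: inspect the link, not its target */
		return -1;
	*uid = st.st_uid;
	*gid = st.st_gid;
	return 0;
}
```

Comparing the reported uid with geteuid() shows which credentials the kernel picked for the entry; a non-dumpable task would be expected to see root instead.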

Look at the code that creates the fd stat information, for example.
It's in tid_fd_revalidate(), and it really doesn't make much sense to
use the task credentials for it. I wonder if we should do something
like the appended (whitespace-damaged and totally untested) patch.

Linus

---
  diff --git a/fs/proc/fd.c b/fs/proc/fd.c
  index 75f2890abbd8..2a5a53cc7a0a 100644
  --- a/fs/proc/fd.c
  +++ b/fs/proc/fd.c
  @@ -74,7 +74,6 @@ static int tid_fd_revalidate(struct dentry *dentry, unsigned int flags)
   {
  struct files_struct *files;
  struct task_struct *task;
  -   const struct cred *cred;
  struct inode *inode;
  int fd;

  @@ -95,19 +94,17 @@ static int tid_fd_revalidate(struct dentry *dentry, unsigned int flags)
  if (file) {
  unsigned f_mode = file->f_mode;

  -   rcu_read_unlock();
  -   put_files_struct(files);
  -
  if (task_dumpable(task)) {
  -   rcu_read_lock();
  -   cred = __task_cred(task);
  +   const struct cred *cred = file->f_cred;
  inode->i_uid = cred->euid;
  inode->i_gid = cred->egid;
  -   rcu_read_unlock();
  } else {
  inode->i_uid = GLOBAL_ROOT_UID;
  inode->i_gid = GLOBAL_ROOT_GID;
  }
  +   rcu_read_unlock();
  +   put_files_struct(files);
  +

  if (S_ISLNK(inode->i_mode)) {
  unsigned i_mode = S_IFLNK;

Re: /proc/pid/fd && anon_inode_fops

2013-08-25 Thread Linus Torvalds

On Sat, Aug 24, 2013 at 10:23 PM, Al Viro wrote:
>
> We are really stuck with the current semantics here - switching to
> *BSD one would not only mean serious surgery on descriptor handling
> (it's one of the wartier areas in *BSD VFS, in large part because
> of magic-open-really-a-dup kludges they have to do), it would change
> a long-standing userland API that had been there for nearly 20 years
> _and_ one that tends to be used in corner cases of hell knows how many
> scripts.

Actually, I'm pretty sure we did have the "dup" semantics at one point
(long ago), and they were really nice (because you could use them to
see where in the stream the fd was etc). It just fit so horribly badly
into the VFS semantics that it got changed into the current "new file
descriptor" one. Afaik, nothing broke.

So I'm not really sure about the "we're stuck with it" for semantic
reasons, and it turns out that very few programs/scripts actually use
/proc/<pid>/fd/ at all (random use of /dev/stdin is likely the
most common case). But I agree about the "serious surgery on
descriptor handling" part.

  Linus
