[RESEND PATCH] acpi-cpufreq: get the cur_freq from acpi_processor_performance states

2014-08-20 Thread Wang Weidong
The initialized freq_table may differ from the P-state values,
so the array indices may differ as well.

P-state values: [2400 2400 2000 ...], while the freq_table is
[2400 2000 ... CPUFREQ_TABLE_END]. After setting the frequency to
2000, perf->state is 3 while the freq_table index should be 2.
So when get_cur_freq_on_cpu() is called, the frequency we get
is 2400.

Fix this by reading the cached frequency from the perf->states
table, which is the table that perf->state actually indexes.

Signed-off-by: Wang Weidong 
---
 drivers/cpufreq/acpi-cpufreq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index b0c18ed..ac93885 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -365,6 +365,7 @@ static u32 get_cur_val(const struct cpumask *mask)
 static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
 {
struct acpi_cpufreq_data *data = per_cpu(acfreq_data, cpu);
+   struct acpi_processor_performance *perf;
unsigned int freq;
unsigned int cached_freq;
 
@@ -375,7 +376,8 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
return 0;
}
 
-   cached_freq = data->freq_table[data->acpi_data->state].frequency;
+   perf = data->acpi_data;
+   cached_freq = perf->states[perf->state].core_frequency * 1000;
freq = extract_freq(get_cur_val(cpumask_of(cpu)), data);
if (freq != cached_freq) {
/*
-- 
1.7.12


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] extcon: sm5502: EXTCON_SM5502 should depend on I2C

2014-08-20 Thread Chanwoo Choi
Dear Myungjoo,

On 08/21/2014 02:27 PM, MyungJoo Ham wrote:
>> Hi Geert
>>
>> Thanks for your report. I already sent a patch[1] to fix this build break
>> and I'll send a pull request to include this patch in 3.17-rc2.
>>
>> [1] https://lkml.org/lkml/2014/8/13/761
>>
>> Best Regards,
>> Chanwoo Choi
> 
> I do not object to this patch or your patch[1].
> 
> However, wouldn't it be better to add depends on I2C at REGMAP_I2C?
> When you use REGMAP_I2C, you assume that I2C is already there, don't you?

Previously, REGMAP_I2C had no dependency on I2C.
So Geert posted the following patch[1] to fix it.
[1] https://lkml.org/lkml/2014/8/17/27

Also, if I2C is 'm' (module) and some driver does not depend on I2C,
the build breaks.

Thanks,
Chanwoo Choi


> 
> 
> Cheers,
> MyungJoo
> 
>>
>>
>> On 08/17/2014 07:08 PM, Geert Uytterhoeven wrote:
>>> EXTCON_SM5502 selects REGMAP_I2C, but if I2C=n:
>>>
>>> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_byte_reg_read’:
>>> drivers/base/regmap/regmap-i2c.c:28: error: implicit declaration of 
>>> function ‘i2c_smbus_read_byte_data’
>>> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_byte_reg_write’:
>>> drivers/base/regmap/regmap-i2c.c:46: error: implicit declaration of 
>>> function ‘i2c_smbus_write_byte_data’
>>> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_word_reg_read’:
>>> drivers/base/regmap/regmap-i2c.c:64: error: implicit declaration of 
>>> function ‘i2c_smbus_read_word_data’
>>> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_word_reg_write’:
>>> drivers/base/regmap/regmap-i2c.c:82: error: implicit declaration of 
>>> function ‘i2c_smbus_write_word_data’
>>> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_i2c_write’:
>>> drivers/base/regmap/regmap-i2c.c:96: error: implicit declaration of 
>>> function ‘i2c_master_send’
>>> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_i2c_gather_write’:
>>> drivers/base/regmap/regmap-i2c.c:117: error: implicit declaration of 
>>> function ‘i2c_check_functionality’
>>> drivers/base/regmap/regmap-i2c.c:130: error: implicit declaration of 
>>> function ‘i2c_transfer’
>>>
>>> Signed-off-by: Geert Uytterhoeven 
>>> ---
>>>  drivers/extcon/Kconfig | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/extcon/Kconfig b/drivers/extcon/Kconfig
>>> index 6f2f4727de2c..764f3a113e0a 100644
>>> --- a/drivers/extcon/Kconfig
>>> +++ b/drivers/extcon/Kconfig
>>> @@ -72,6 +72,7 @@ config EXTCON_PALMAS
>>>  
>>>  config EXTCON_SM5502
>>> tristate "SM5502 EXTCON support"
>>> +   depends on I2C
>>> select IRQ_DOMAIN
>>> select REGMAP_I2C
>>> select REGMAP_IRQ
>>>


[PATCH] softlockup: make detector be aware of task switch of processes hogging cpu

2014-08-20 Thread chai wen
For now, soft lockup detector warns once for each case of process softlockup.
But the thread 'watchdog/n' may not always get the cpu at the time slot between
the task switch of two processes hogging that cpu to reset soft_watchdog_warn.

An example would be two processes hogging the cpu.  Process A causes the
softlockup warning and is killed manually by a user.  Process B immediately
becomes the new process hogging the cpu preventing the softlockup code from
resetting the soft_watchdog_warn variable.

This case is a false negative of "warn only once for a process", as there may
be a different process that is going to hog the cpu.  Resolve this by
saving/checking the task pointer of the hogging process and use that to reset
soft_watchdog_warn too.

Signed-off-by: chai wen 
Signed-off-by: Don Zickus 
---
 kernel/watchdog.c |   16 +++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 0037db6..2e55620 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
 static DEFINE_PER_CPU(bool, soft_watchdog_warn);
 static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
 static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
+static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
 static DEFINE_PER_CPU(bool, hard_watchdog_warn);
 static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
@@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
return HRTIMER_RESTART;
 
/* only warn once */
-   if (__this_cpu_read(soft_watchdog_warn) == true)
+   if (__this_cpu_read(soft_watchdog_warn) == true) {
+   /*
+* Handle the case where multiple processes are
+* causing softlockups but the duration is small
+* enough, the softlockup detector can not reset
+* itself in time.  Use task pointers to detect this.
+*/
+   if (__this_cpu_read(softlockup_task_ptr_saved) !=
+   current) {
+   __this_cpu_write(soft_watchdog_warn, false);
+   __touch_watchdog();
+   }
return HRTIMER_RESTART;
+   }
 
if (softlockup_all_cpu_backtrace) {
/* Prevent multiple soft-lockup reports if one cpu is already
@@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
smp_processor_id(), duration,
current->comm, task_pid_nr(current));
+   __this_cpu_write(softlockup_task_ptr_saved, current);
print_modules();
print_irqtrace_events(current);
if (regs)
-- 
1.7.1



Re: [PATCH v1 5/9] block: loop: convert to blk-mq

2014-08-20 Thread Ming Lei
On Wed, Aug 20, 2014 at 4:50 AM, Jens Axboe  wrote:

>
>
> Reworked a bit more:
>
> http://git.kernel.dk/?p=linux-block.git;a=commit;h=a323185a761b9a54dc340d383695b4205ea258b6

One big problem with the commit is that it is basically a serialized
workqueue because of the single >run_work, and a per-request work_struct
has to be used for a concurrent implementation.  So the approach doesn't
look flexible enough compared with doing that in the driver. Any idea
about how to fix that?


Thanks


Re: [PATCH 3.16.0-rc3-rmk v5] ARM: add get_user() support for 8 byte types

2014-08-20 Thread Victor Kamensky
On 10 July 2014 12:47, Daniel Thompson  wrote:
> Recent contributions, including to DRM and binder, introduce 64-bit
> values in their interfaces. A common motivation for this is to allow
> the same ABI for 32- and 64-bit userspaces (and therefore also a shared
> ABI for 32/64 hybrid userspaces). Anyhow, the developers would like to
> avoid gotchas like having to use copy_from_user().
>
> This feature is already implemented on x86-32 and the majority of other
> 32-bit architectures. The current list of get_user_8 hold out
> architectures are: arm, avr32, blackfin, m32r, metag, microblaze,
> mn10300, sh.
>
> Credit:
>
> My name sits rather uneasily at the top of this patch. The v1 and
> v2 versions of the patch were written by Rob Clark and to produce v4
> I mostly copied code from Russell King and H. Peter Anvin. However I
> have mangled the patch sufficiently that *blame* is rightfully mine
> even if credit should be more widely shared.
>
> Changelog:
>
> v5: updated to use the ret macro (requested by Russell King)
> v4: remove an inlined add on big endian systems (spotted by Russell King),
> used __ARMEB__ rather than BIG_ENDIAN (to match rest of file),
> cleared r3 on EFAULT during __get_user_8.
> v3: fix a couple of checkpatch issues
> v2: pass correct size to check_uaccess, and better handling of narrowing
> double word read with __get_user_xb() (Russell King's suggestion)
> v1: original
>
> Signed-off-by: Rob Clark 
> Signed-off-by: Daniel Thompson 
> Cc: Russell King - ARM Linux 
> ---
>  arch/arm/include/asm/uaccess.h | 20 +++-
>  arch/arm/lib/getuser.S | 37 -
>  2 files changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index 75d9579..7057cf8 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -107,6 +107,8 @@ static inline void set_fs(mm_segment_t fs)
>  extern int __get_user_1(void *);
>  extern int __get_user_2(void *);
>  extern int __get_user_4(void *);
> +extern int __get_user_lo8(void *);
> +extern int __get_user_8(void *);
>
>  #define __GUP_CLOBBER_1"lr", "cc"
>  #ifdef CONFIG_CPU_USE_DOMAINS
> @@ -115,6 +117,8 @@ extern int __get_user_4(void *);
>  #define __GUP_CLOBBER_2 "lr", "cc"
>  #endif
>  #define __GUP_CLOBBER_4"lr", "cc"
> +#define __GUP_CLOBBER_lo8 "lr", "cc"
> +#define __GUP_CLOBBER_8"lr", "cc"
>
>  #define __get_user_x(__r2,__p,__e,__l,__s) \
>__asm__ __volatile__ (   \
> @@ -125,11 +129,19 @@ extern int __get_user_4(void *);
> : "0" (__p), "r" (__l)  \
> : __GUP_CLOBBER_##__s)
>
> +/* narrowing a double-word get into a single 32bit word register: */
> +#ifdef __ARMEB__
> +#define __get_user_xb(__r2, __p, __e, __l, __s)  
>   \
> +   __get_user_x(__r2, __p, __e, __l, lo8)
> +#else
> +#define __get_user_xb __get_user_x
> +#endif
> +
>  #define __get_user_check(x,p)
>   \
> ({  \
> unsigned long __limit = current_thread_info()->addr_limit - 
> 1; \
> register const typeof(*(p)) __user *__p asm("r0") = (p);\
> -   register unsigned long __r2 asm("r2");  \
> +   register typeof(x) __r2 asm("r2");  \

The above breaks the V7 BE case when get_user is called with a target
variable that is 64 bits in size but '*__p' is 32 bits or smaller. Please
look at [1] for more details.

Thanks,
Victor

[1] 
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/280806.html

> register unsigned long __l asm("r1") = __limit; \
> register int __e asm("r0"); \
> switch (sizeof(*(__p))) {   \
> @@ -142,6 +154,12 @@ extern int __get_user_4(void *);
> case 4: \
> __get_user_x(__r2, __p, __e, __l, 4);   \
> break;  \
> +   case 8: \
> +   if (sizeof((x)) < 8)\
> +   __get_user_xb(__r2, __p, __e, __l, 4);  \
> +   else\
> +   __get_user_x(__r2, __p, __e, __l, 8);   \
> +   break;  \
> default: __e = __get_user_bad(); break; \
> }   \
> x = (typeof(*(p))) 

Re: Re: [PATCH 2/2] extcon: sm5502: EXTCON_SM5502 should depend on I2C

2014-08-20 Thread MyungJoo Ham
> Hi Geert
> 
> Thanks for your report. I already sent a patch[1] to fix this build break
> and I'll send a pull request to include this patch in 3.17-rc2.
> 
> [1] https://lkml.org/lkml/2014/8/13/761
> 
> Best Regards,
> Chanwoo Choi

I do not object to this patch or your patch[1].

However, wouldn't it be better to add depends on I2C at REGMAP_I2C?
When you use REGMAP_I2C, you assume that I2C is already there, don't you?


Cheers,
MyungJoo

> 
> 
> On 08/17/2014 07:08 PM, Geert Uytterhoeven wrote:
> > EXTCON_SM5502 selects REGMAP_I2C, but if I2C=n:
> > 
> > drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_byte_reg_read’:
> > drivers/base/regmap/regmap-i2c.c:28: error: implicit declaration of 
> > function ‘i2c_smbus_read_byte_data’
> > drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_byte_reg_write’:
> > drivers/base/regmap/regmap-i2c.c:46: error: implicit declaration of 
> > function ‘i2c_smbus_write_byte_data’
> > drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_word_reg_read’:
> > drivers/base/regmap/regmap-i2c.c:64: error: implicit declaration of 
> > function ‘i2c_smbus_read_word_data’
> > drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_word_reg_write’:
> > drivers/base/regmap/regmap-i2c.c:82: error: implicit declaration of 
> > function ‘i2c_smbus_write_word_data’
> > drivers/base/regmap/regmap-i2c.c: In function ‘regmap_i2c_write’:
> > drivers/base/regmap/regmap-i2c.c:96: error: implicit declaration of 
> > function ‘i2c_master_send’
> > drivers/base/regmap/regmap-i2c.c: In function ‘regmap_i2c_gather_write’:
> > drivers/base/regmap/regmap-i2c.c:117: error: implicit declaration of 
> > function ‘i2c_check_functionality’
> > drivers/base/regmap/regmap-i2c.c:130: error: implicit declaration of 
> > function ‘i2c_transfer’
> > 
> > Signed-off-by: Geert Uytterhoeven 
> > ---
> >  drivers/extcon/Kconfig | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/extcon/Kconfig b/drivers/extcon/Kconfig
> > index 6f2f4727de2c..764f3a113e0a 100644
> > --- a/drivers/extcon/Kconfig
> > +++ b/drivers/extcon/Kconfig
> > @@ -72,6 +72,7 @@ config EXTCON_PALMAS
> >  
> >  config EXTCON_SM5502
> > tristate "SM5502 EXTCON support"
> > +   depends on I2C
> > select IRQ_DOMAIN
> > select REGMAP_I2C
> > select REGMAP_IRQ
> > 

Re: [PATCH 0/5] usb: phy: samsung: remove old USB PHY code

2014-08-20 Thread Jingoo Han
On Thursday, August 21, 2014 1:34 PM, Vivek Gautam wrote:
> On Thu, Aug 14, 2014 at 7:55 PM, Bartlomiej Zolnierkiewicz
>  wrote:
> > Hi,
> >
> > This patch series removes the old Samsung USB PHY drivers that
> > got replaced by the new ones using the generic PHY layer.
> >
> > Depends on:
> > - next-20140813 branch of linux-next kernel
> >
> > Best regards,
> > --
> > Bartlomiej Zolnierkiewicz
> > Samsung R&D Institute Poland
> > Samsung Electronics
> >
> >
> > Bartlomiej Zolnierkiewicz (5):
> >   ARM: dts: remove old USB2 PHY node hook for Arndale
> >   ARM: dts: remove old USB2 PHY node for Exynos5250
> >   usb: phy: samsung: remove old USB 2.0 PHY driver
> >   usb: phy: samsung: remove old USB 3.0 PHY driver
> >   usb: phy: samsung: remove old common USB PHY code
> 
> Reviewed-by: Vivek Gautam 

Reviewed-by: Jingoo Han 

Best regards,
Jingoo Han



Re: [PATCH] Documentation: remove outdated references to the linux-next wiki

2014-08-20 Thread Frank Seidel
Am Wed, 20 Aug 2014 14:09:53 -0700 Jim Davis 
wrote:
> On Wed, Aug 20, 2014 at 2:05 PM, SeongJae Park 
> wrote:
> > On Thu, Aug 21, 2014 at 5:29 AM, Jim Davis 
> > wrote:
> >> The linux-next wiki at http://linux.f-seidel.de/linux-next/pmwiki
> >> has been gone for several months now.
> >
> > Yes, I can't load the page, too. BTW, wouldn't it be better to add
> > Frank Seidel as recipient because he was the manager of the wiki?
> 
> I did send an email query about the wiki status -- with a "thank you!"
> for the work involved -- a couple of days ago.   Didn't bounce, but
> there's not been a response.
> 
> That wiki was very helpful as I tried to get going with linux-next
> stuff.

Sorry for my late response. For financial reasons I had to cancel
the contract for the server hosting that site.
I have now moved the domain to another hoster, but haven't yet found
time to set things up there again.
But I definitely will be doing so (hopefully within the next one or two
weeks).

Thanks,
Frank


Re: [PATCH 3/5] usb: phy: samsung: remove old USB 2.0 PHY driver

2014-08-20 Thread Jingoo Han
On Thursday, August 21, 2014 1:31 PM, Vivek Gautam wrote:
> On Mon, Aug 18, 2014 at 4:52 PM, Tomasz Figa  wrote:
> > On 18.08.2014 13:02, Bartlomiej Zolnierkiewicz wrote:
> >> On Thursday, August 14, 2014 08:07:40 PM Vivek Gautam wrote:
> >>> On Thursday, August 14, 2014 7:55 PM, Bartlomiej Zolnierkiewicz
> >>>  wrote
> >>
> >>> There's one thing that I would want to comment here, since we don't
> >>> have any new usb-phy driver for S3C64XX, so we can't simply remove
> >>> this entire driver.
> >>> I have posted my patch-series [1], which does cleanup while keeping the
> >>> support for S3C64XX.
> >>
> >> AFAIK S3C64XX code from drivers/usb/phy/phy-samsung-usb2.c has
> >> never been used as this platform still uses its own code from
> >> arch/arm/mach-s3c64xx/setup-usb-phy.c (there are no users in
> >> the kernel tree of either s3c64xx-usb2phy platform device or
> >> "samsung,s3c64xx-usb2phy" DT compatible) .  Therefore I think
> >> that the entire drivers/usb/phy/phy-samsung-usb2.c driver
> >> should be removed (somebody with the hardware can as well add
> >> S3C64XX support to the new drivers/phy/phy-samsung-usb2.c
> >> driver and port the platform to use it).
> >>
> >
> > I agree with removal of this driver. As Bart said, it is not used for
> > S3C64xx at all. The platform was supposed to be moved to this driver,
> > but that never happened.

Yes, right. As far as I know, you're right.

> > In fact, I already have a patch adding support
> > for S3C64xx to the new driver.

Good!

> 
> Cool then, let's remove this driver completely and use the new generic
> PHY based driver once that comes (from Tomasz).

I agree with this opinion.

> 
> I shall drop the patches for cleaning up the usb-phy drivers from my series.

Best regards,
Jingoo Han



Re: [PATCH v4 13/16] cpufreq: Add cpufreq driver for Tegra124

2014-08-20 Thread Viresh Kumar
On 21 August 2014 02:34, Tuomas Tynkkynen  wrote:
> Add a new cpufreq driver for Tegra124. Instead of using the PLLX as
> the CPU clocksource, switch immediately to the DFLL. It allows the use
> of higher clock rates, and will automatically scale the CPU voltage as
> well. Besides the CPU clocksource switch, we let the cpufreq-cpu0 driver
> handle all the cpufreq operations.
>
> This driver also relies on the DFLL driver to fill the OPP table for the
> CPU0 device, so that the cpufreq-cpu0 driver knows what frequencies to
> use.
>
> Signed-off-by: Tuomas Tynkkynen 
> ---
> v4:
>  - check for get_cpu_device() return value
>  - add comment why an extra platform driver+device is required
>  - back to 'depends on GENERIC_CPUFREQ_CPU0'

Acked-by: Viresh Kumar 


Re: [PATCH] cpufreq: powernv: Register the driver with reboot notifier

2014-08-20 Thread Shilpasri G Bhat


On 08/18/2014 01:16 PM, Viresh Kumar wrote:

On 14 August 2014 16:49, Shilpasri G Bhat
 wrote:

This patch ensures the cpus to kexec/reboot at nominal frequency.
Nominal frequency is the highest cpu frequency on PowerPC at
which the cores can run without getting throttled.

If the host kernel had set the cpus to a low pstate and then it
kexecs/reboots to a cpufreq disabled kernel it would cause the target
kernel to perform poorly. It will also increase the boot up time of
the target kernel. So set the cpus to high pstate, in this case to
nominal frequency before rebooting to avoid such scenarios.

The reboot notifier will suspend the cpufreq governor and enable
nominal frequency to be set during a reboot/kexec similar to the
suspend operation.

Signed-off-by: Shilpasri G Bhat 
Reviewed-by: Preeti U Murthy 
---
  drivers/cpufreq/powernv-cpufreq.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c 
b/drivers/cpufreq/powernv-cpufreq.c
index 379c083..e9f3d3a 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -26,6 +26,7 @@
  #include 
  #include 
  #include 
+#include 

  #include 
  #include 
@@ -314,9 +315,21 @@ static int powernv_cpufreq_cpu_init(struct cpufreq_policy 
*policy)
 for (i = 0; i < threads_per_core; i++)
 cpumask_set_cpu(base + i, policy->cpus);

+   policy->suspend_freq = pstate_id_to_freq(powernv_pstate_info.nominal);
 return cpufreq_table_validate_and_show(policy, powernv_freqs);
  }

+static int powernv_cpufreq_reboot_notifier(struct notifier_block *nb,
+   unsigned long action, void *unused)
+{
+   cpufreq_suspend();
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block powernv_cpufreq_reboot_nb = {
+   .notifier_call = powernv_cpufreq_reboot_notifier,
+};
+
  static struct cpufreq_driver powernv_cpufreq_driver = {
 .name   = "powernv-cpufreq",
 .flags  = CPUFREQ_CONST_LOOPS,
@@ -325,6 +338,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
 .target_index   = powernv_cpufreq_target_index,
 .get= powernv_cpufreq_get,
 .attr   = powernv_cpu_freq_attr,
+   .suspend= cpufreq_generic_suspend,

I couldn't understand why you have added a notifier here. This callback
by itself should be enough. Isn't it?

And then you have called cpufreq_suspend(), which is absolutely wrong,
from that notifier..


Hi Viresh,

The intention here is to stop the cpufreq governor and then set the cpus to
nominal frequency, so as to ensure that the frequency won't be changed later.

The .suspend callback of the driver is not called during reboot/kexec.
So we need an explicit reboot notifier that calls cpufreq_suspend() to
meet this requirement.

Thanks and Regards,
Shilpa



[RFC 3/4] HID:hid-logitech: Use new native switch method

2014-08-20 Thread Simon Wood
---
 drivers/hid/hid-lg4ff.c | 126 +---
 1 file changed, 55 insertions(+), 71 deletions(-)

diff --git a/drivers/hid/hid-lg4ff.c b/drivers/hid/hid-lg4ff.c
index eda07a2..0ba0838 100644
--- a/drivers/hid/hid-lg4ff.c
+++ b/drivers/hid/hid-lg4ff.c
@@ -32,21 +32,10 @@
 #include "hid-lg.h"
 #include "hid-ids.h"
 
-#define DFGT_REV_MAJ 0x13
-#define DFGT_REV_MIN 0x22
-#define DFGT2_REV_MIN 0x26
-#define DFP_REV_MAJ 0x11
-#define DFP_REV_MIN 0x06
-#define FFEX_REV_MAJ 0x21
-#define FFEX_REV_MIN 0x00
-#define G25_REV_MAJ 0x12
-#define G25_REV_MIN 0x22
-#define G27_REV_MAJ 0x12
-#define G27_REV_MIN 0x38
-#define G27_2_REV_MIN 0x39
-
 #define to_hid_device(pdev) container_of(pdev, struct hid_device, dev)
 
+#define LG4FF_FFEX_BCDDEVICE 0x2100
+
 static void hid_lg4ff_set_range_dfp(struct hid_device *hid, u16 range);
 static void hid_lg4ff_set_range_g25(struct hid_device *hid, u16 range);
 static ssize_t lg4ff_range_show(struct device *dev, struct device_attribute 
*attr, char *buf);
@@ -89,48 +78,6 @@ static const struct lg4ff_mode_switcher 
lg4ff_mode_switchers[] = {   /* Note: Orde
{0x1000, 0xf000, LG4FF_MSW_DFP, USB_DEVICE_ID_LOGITECH_DFP_WHEEL, 
hid_lg4ff_set_range_dfp},
 };
 
-struct lg4ff_native_cmd {
-   const __u8 cmd_num; /* Number of commands to send */
-   const __u8 cmd[];
-};
-
-struct lg4ff_usb_revision {
-   const __u16 rev_maj;
-   const __u16 rev_min;
-   const struct lg4ff_native_cmd *command;
-};
-
-static const struct lg4ff_native_cmd native_dfp = {
-   1,
-   {0xf8, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00}
-};
-
-static const struct lg4ff_native_cmd native_dfgt = {
-   2,
-   {0xf8, 0x0a, 0x00, 0x00, 0x00, 0x00, 0x00,  /* 1st command */
-0xf8, 0x09, 0x03, 0x01, 0x00, 0x00, 0x00}  /* 2nd command */
-};
-
-static const struct lg4ff_native_cmd native_g25 = {
-   1,
-   {0xf8, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00}
-};
-
-static const struct lg4ff_native_cmd native_g27 = {
-   2,
-   {0xf8, 0x0a, 0x00, 0x00, 0x00, 0x00, 0x00,  /* 1st command */
-0xf8, 0x09, 0x04, 0x01, 0x00, 0x00, 0x00}  /* 2nd command */
-};
-
-static const struct lg4ff_usb_revision lg4ff_revs[] = {
-   {DFGT_REV_MAJ, DFGT_REV_MIN, &native_dfgt}, /* Driving Force GT */
-   {DFGT_REV_MAJ, DFGT2_REV_MIN, &native_dfgt},/* Driving Force GT v2 */
-   {DFP_REV_MAJ,  DFP_REV_MIN,  &native_dfp},  /* Driving Force Pro */
-   {G25_REV_MAJ,  G25_REV_MIN,  &native_g25},  /* G25 */
-   {G27_REV_MAJ,  G27_REV_MIN,  &native_g27},  /* G27 */
-   {G27_REV_MAJ,  G27_2_REV_MIN,  &native_g27},/* G27 v2 */
-};
-
 /* Recalculates X axis value accordingly to currently selected range */
 static __s32 lg4ff_adjust_dfp_x_axis(__s32 value, __u16 range)
 {
@@ -397,19 +344,63 @@ static void hid_lg4ff_set_range_dfp(struct hid_device 
*hid, __u16 range)
hid_hw_request(hid, report, HID_REQ_SET_REPORT);
 }
 
-static void hid_lg4ff_switch_native(struct hid_device *hid, const struct 
lg4ff_native_cmd *cmd)
+static int lg4ff_switch_mode(struct hid_device *hid, __u16 type, int mode)
 {
struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list;
struct hid_report *report = list_entry(report_list->next, struct hid_report, list);
-   __u8 i, j;
+   __s32 *value = report->field[0]->value;
+
+   if (mode >= LG4FF_MSW_MAX || mode <= LG4FF_MSW_NAT) mode = type;
+
+   if (type == LG4FF_MSW_G25 && mode == LG4FF_MSW_G25) {
+   value[0] = 0xf8;
+   value[1] = 0x10;
+   value[2] = 0x00;
+   value[3] = 0x00;
+   value[4] = 0x00;
+   value[5] = 0x00;
+   value[6] = 0x00;
+
+   hid_hw_request(hid, report, HID_REQ_SET_REPORT);
+   return 0;
+   }
 
-   j = 0;
-   while (j < 7*cmd->cmd_num) {
-   for (i = 0; i < 7; i++)
-   report->field[0]->value[i] = cmd->cmd[j++];
+   if (mode == LG4FF_MSW_DFP) {
+   value[0] = 0xf8;
+   value[1] = 0x01;
+   value[2] = 0x00;
+   value[3] = 0x00;
+   value[4] = 0x00;
+   value[5] = 0x00;
+   value[6] = 0x00;
 
hid_hw_request(hid, report, HID_REQ_SET_REPORT);
+   return 0;
}
+
+   /* Prevent compat mode on USB reset */
+   if (type == LG4FF_MSW_DFGT || type == LG4FF_MSW_G27) {
+   value[0] = 0xf8;
+   value[1] = 0x0a;
+   value[2] = 0x00;
+   value[3] = 0x00;
+   value[4] = 0x00;
+   value[5] = 0x00;
+   value[6] = 0x00;
+
+   hid_hw_request(hid, report, HID_REQ_SET_REPORT);
+   }
+
+   value[0] = 0xf8;
+   value[1] = 0x09;
+   value[2] = mode;
+   value[3] = 0x01;
+   value[4] = 0x00;
+   value[5] = 0x00;
+   value[6] = 0x00;
+
+   

[RFC 4/4] HID:hid-logitech: Add mode control via /sys interface

2014-08-20 Thread Simon Wood
---
 drivers/hid/hid-lg4ff.c | 81 +++--
 1 file changed, 78 insertions(+), 3 deletions(-)

diff --git a/drivers/hid/hid-lg4ff.c b/drivers/hid/hid-lg4ff.c
index 0ba0838..1be561e 100644
--- a/drivers/hid/hid-lg4ff.c
+++ b/drivers/hid/hid-lg4ff.c
@@ -41,11 +41,16 @@ static void hid_lg4ff_set_range_g25(struct hid_device *hid, 
u16 range);
 static ssize_t lg4ff_range_show(struct device *dev, struct device_attribute 
*attr, char *buf);
 static ssize_t lg4ff_range_store(struct device *dev, struct device_attribute 
*attr, const char *buf, size_t count);
 
+static ssize_t lg4ff_mode_show(struct device *dev, struct device_attribute 
*attr, char *buf);
+static ssize_t lg4ff_mode_store(struct device *dev, struct device_attribute 
*attr, const char *buf, size_t count);
+
 static DEVICE_ATTR(range, S_IRWXU | S_IRWXG | S_IROTH, lg4ff_range_show, 
lg4ff_range_store);
+static DEVICE_ATTR(mode, S_IRWXU | S_IRWXG | S_IROTH, lg4ff_mode_show, 
lg4ff_mode_store);
 
 struct lg4ff_device_entry {
__u32 product_id;
__u16 type;
+   __u16 mode;
__u16 range;
__u16 min_range;
__u16 max_range;
@@ -362,7 +367,7 @@ static int lg4ff_switch_mode(struct hid_device *hid, __u16 
type, int mode)
value[6] = 0x00;
 
hid_hw_request(hid, report, HID_REQ_SET_REPORT);
-   return 0;
+   return LG4FF_MSW_G25;
}
 
if (mode == LG4FF_MSW_DFP) {
@@ -375,7 +380,7 @@ static int lg4ff_switch_mode(struct hid_device *hid, __u16 
type, int mode)
value[6] = 0x00;
 
hid_hw_request(hid, report, HID_REQ_SET_REPORT);
-   return 0;
+   return LG4FF_MSW_DFP;
}
 
/* Prevent compat mode on USB reset */
@@ -400,7 +405,7 @@ static int lg4ff_switch_mode(struct hid_device *hid, __u16 
type, int mode)
value[6] = 0x00;
 
hid_hw_request(hid, report, HID_REQ_SET_REPORT);
-   return 0;
+   return mode;
 }
 
 /* Read current range and display it in terminal */
@@ -461,6 +466,66 @@ static ssize_t lg4ff_range_store(struct device *dev, 
struct device_attribute *at
return count;
 }
 
+/* Read current mode and display it in terminal */
+static ssize_t lg4ff_mode_show(struct device *dev, struct device_attribute 
*attr, char *buf)
+{
+   struct hid_device *hid = to_hid_device(dev);
+   struct lg4ff_device_entry *entry;
+   struct lg_drv_data *drv_data;
+   size_t count;
+
+   drv_data = hid_get_drvdata(hid);
+   if (!drv_data) {
+   hid_err(hid, "Private driver data not found!\n");
+   return 0;
+   }
+
+   entry = drv_data->device_props;
+   if (!entry) {
+   hid_err(hid, "Device properties not found!\n");
+   return 0;
+   }
+
+   count = scnprintf(buf, PAGE_SIZE, "%u\n", entry->mode);
+   return count;
+}
+
+/* Set mode to user specified value */
+static ssize_t lg4ff_mode_store(struct device *dev, struct device_attribute 
*attr, const char *buf, size_t count)
+{
+   struct hid_device *hid = to_hid_device(dev);
+   struct lg4ff_device_entry *entry;
+   struct lg_drv_data *drv_data;
+   int err;
+   __u16 mode = simple_strtoul(buf, NULL, 10);
+
+   drv_data = hid_get_drvdata(hid);
+   if (!drv_data) {
+   hid_err(hid, "Private driver data not found!\n");
+   return -EINVAL;
+   }
+
+   entry = drv_data->device_props;
+   if (!entry) {
+   hid_err(hid, "Device properties not found!\n");
+   return -EINVAL;
+   }
+
+   if (mode == entry->mode) {
+   dbg_hid("Device is already in mode %d\n", mode);
+   return count;
+   }
+
+   err = lg4ff_switch_mode(hid, entry->type, mode);
+   if (err != mode) {
+   hid_err(hid, "Unable to switch mode\n");
+   return -1;
+   }
+
+   entry->mode = mode;
+   return count;
+}
+
 #ifdef CONFIG_LEDS_CLASS
 static void lg4ff_set_leds(struct hid_device *hid, __u8 leds)
 {
@@ -583,6 +648,7 @@ int lg4ff_init(struct hid_device *hid, const int 
switch_mode)
entry->product_id = hid->product;
entry->set_range = NULL;
entry->type = LG4FF_MSW_EMU;
+   entry->mode = LG4FF_MSW_EMU;
 
/* Check which wheel has been connected */
bcdDevice = le16_to_cpu(udesc->bcdDevice);
@@ -602,6 +668,11 @@ int lg4ff_init(struct hid_device *hid, const int 
switch_mode)
entry->min_range = 40;
entry->max_range = 900;
entry->set_range = s->set_range;
+
+   if (switch_mode == LG4FF_MSW_NAT)
+   entry->mode = s->type;
+   else
+   entry->mode = switch_mode;
}
 
if (hid->product == USB_DEVICE_ID_LOGITECH_WHEEL && switch_mode 
!= 

[RFC 2/4] HID:hid-logitech: New detection of native capable devices

2014-08-20 Thread Simon Wood
---
 drivers/hid/hid-lg.h|   5 +++
 drivers/hid/hid-lg4ff.c | 115 
 2 files changed, 63 insertions(+), 57 deletions(-)

diff --git a/drivers/hid/hid-lg.h b/drivers/hid/hid-lg.h
index fc4bdae..cf442e5 100644
--- a/drivers/hid/hid-lg.h
+++ b/drivers/hid/hid-lg.h
@@ -27,6 +27,11 @@ static inline int lg3ff_init(struct hid_device *hdev) { 
return -1; }
 #ifdef CONFIG_LOGIWHEELS_FF
 #define LG4FF_MSW_NAT -1   /* allow native mode */
 #define LG4FF_MSW_EMU 0/* remain in or force emulation mode */
+#define LG4FF_MSW_DFP 1
+#define LG4FF_MSW_G25 2
+#define LG4FF_MSW_DFGT 3
+#define LG4FF_MSW_G27 4
+#define LG4FF_MSW_MAX 5/* end-stop */
 
 int lg4ff_adjust_input_event(struct hid_device *hid, struct hid_field *field,
 struct hid_usage *usage, __s32 value, struct 
lg_drv_data *drv_data);
diff --git a/drivers/hid/hid-lg4ff.c b/drivers/hid/hid-lg4ff.c
index 9247227..eda07a2 100644
--- a/drivers/hid/hid-lg4ff.c
+++ b/drivers/hid/hid-lg4ff.c
@@ -56,6 +56,7 @@ static DEVICE_ATTR(range, S_IRWXU | S_IRWXG | S_IROTH, 
lg4ff_range_show, lg4ff_r
 
 struct lg4ff_device_entry {
__u32 product_id;
+   __u16 type;
__u16 range;
__u16 min_range;
__u16 max_range;
@@ -73,23 +74,19 @@ static const signed short lg4ff_wheel_effects[] = {
-1
 };
 
-struct lg4ff_wheel {
-   const __u32 product_id;
-   const signed short *ff_effects;
-   const __u16 min_range;
-   const __u16 max_range;
+struct lg4ff_mode_switcher {
+   const u16 bcdDevice;
+   const u16 mask;
+   const u16 type;
+   const __u32 native_pid;
void (*set_range)(struct hid_device *hid, u16 range);
 };
 
-static const struct lg4ff_wheel lg4ff_devices[] = {
-   {USB_DEVICE_ID_LOGITECH_WHEEL,   lg4ff_wheel_effects, 40, 270, 
NULL},
-   {USB_DEVICE_ID_LOGITECH_MOMO_WHEEL,  lg4ff_wheel_effects, 40, 270, 
NULL},
-   {USB_DEVICE_ID_LOGITECH_DFP_WHEEL,   lg4ff_wheel_effects, 40, 900, 
hid_lg4ff_set_range_dfp},
-   {USB_DEVICE_ID_LOGITECH_G25_WHEEL,   lg4ff_wheel_effects, 40, 900, 
hid_lg4ff_set_range_g25},
-   {USB_DEVICE_ID_LOGITECH_DFGT_WHEEL,  lg4ff_wheel_effects, 40, 900, 
hid_lg4ff_set_range_g25},
-   {USB_DEVICE_ID_LOGITECH_G27_WHEEL,   lg4ff_wheel_effects, 40, 900, 
hid_lg4ff_set_range_g25},
-   {USB_DEVICE_ID_LOGITECH_MOMO_WHEEL2, lg4ff_wheel_effects, 40, 270, 
NULL},
-   {USB_DEVICE_ID_LOGITECH_WII_WHEEL,   lg4ff_wheel_effects, 40, 270, NULL}
+static const struct lg4ff_mode_switcher lg4ff_mode_switchers[] = { /* 
Note: Order is important for detection process */
+   {0x1300, 0xff00, LG4FF_MSW_DFGT, USB_DEVICE_ID_LOGITECH_DFGT_WHEEL, 
hid_lg4ff_set_range_g25},
+   {0x1230, 0xfff0, LG4FF_MSW_G27, USB_DEVICE_ID_LOGITECH_G27_WHEEL, 
hid_lg4ff_set_range_g25},
+   {0x1200, 0xff00, LG4FF_MSW_G25, USB_DEVICE_ID_LOGITECH_G25_WHEEL, 
hid_lg4ff_set_range_g25},
+   {0x1000, 0xf000, LG4FF_MSW_DFP, USB_DEVICE_ID_LOGITECH_DFP_WHEEL, 
hid_lg4ff_set_range_dfp},
 };
 
 struct lg4ff_native_cmd {
@@ -570,50 +567,12 @@ int lg4ff_init(struct hid_device *hid, const int 
switch_mode)
if (!hid_validate_values(hid, HID_OUTPUT_REPORT, 0, 0, 7))
return -1;
 
-   /* Check what wheel has been connected */
-   for (i = 0; i < ARRAY_SIZE(lg4ff_devices); i++) {
-   if (hid->product == lg4ff_devices[i].product_id) {
-   dbg_hid("Found compatible device, product ID %04X\n", 
lg4ff_devices[i].product_id);
-   break;
-   }
-   }
-
-   if (i == ARRAY_SIZE(lg4ff_devices)) {
-   hid_err(hid, "Device is not supported by lg4ff driver. If you 
think it should be, consider reporting a bug to"
-"LKML, Simon Wood  or Michal 
Maly \n");
-   return -1;
-   }
-
/* Attempt to switch wheel to native mode when applicable */
udesc = &(hid_to_usb_dev(hid)->descriptor);
if (!udesc) {
hid_err(hid, "NULL USB device descriptor\n");
return -1;
}
-   bcdDevice = le16_to_cpu(udesc->bcdDevice);
-   rev_maj = bcdDevice >> 8;
-   rev_min = bcdDevice & 0xff;
-
-   if (lg4ff_devices[i].product_id == USB_DEVICE_ID_LOGITECH_WHEEL && 
switch_mode != LG4FF_MSW_EMU) {
-   dbg_hid("Generic wheel detected, can it do native?\n");
-   dbg_hid("USB revision: %2x.%02x\n", rev_maj, rev_min);
-
-   for (j = 0; j < ARRAY_SIZE(lg4ff_revs); j++) {
-   if (lg4ff_revs[j].rev_maj == rev_maj && 
lg4ff_revs[j].rev_min == rev_min) {
-   hid_lg4ff_switch_native(hid, 
lg4ff_revs[j].command);
-   hid_info(hid, "Switched to native mode\n");
-   }
-   }
-   }
-
-   /* Set supported force feedback capabilities */
-   for (j = 0; 

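The table comment in the hunk above says that order matters for detection. A minimal sketch of why, assuming the driver compares the masked bcdDevice against each entry in turn (the matching loop itself is cut off in this archive copy, so `struct mode_rule` and `detect_native_type()` are hypothetical names used only for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/* Mode identifiers, with the values from the hid-lg.h hunk above. */
#define LG4FF_MSW_EMU  0
#define LG4FF_MSW_DFP  1
#define LG4FF_MSW_G25  2
#define LG4FF_MSW_DFGT 3
#define LG4FF_MSW_G27  4

/* Hypothetical, trimmed-down stand-in for struct lg4ff_mode_switcher. */
struct mode_rule {
	uint16_t bcd_device;	/* expected value after masking */
	uint16_t mask;		/* bits of bcdDevice that are significant */
	uint16_t type;		/* native mode reported on a match */
};

/* Same values and order as lg4ff_mode_switchers[] in the patch. */
static const struct mode_rule mode_rules[] = {
	{0x1300, 0xff00, LG4FF_MSW_DFGT},
	{0x1230, 0xfff0, LG4FF_MSW_G27},
	{0x1200, 0xff00, LG4FF_MSW_G25},
	{0x1000, 0xf000, LG4FF_MSW_DFP},
};

/*
 * First match wins (assumed behaviour). The G27 rule has to come before
 * the wider G25 rule, and the DFP rule (widest mask) has to come last,
 * otherwise a G27 would be reported as a G25 or a DFP.
 */
uint16_t detect_native_type(uint16_t bcd_device)
{
	size_t i;

	for (i = 0; i < sizeof(mode_rules) / sizeof(mode_rules[0]); i++) {
		if ((bcd_device & mode_rules[i].mask) == mode_rules[i].bcd_device)
			return mode_rules[i].type;
	}
	return LG4FF_MSW_EMU;	/* no native mode known for this revision */
}
```

With this ordering a bcdDevice of 0x1238 resolves to the G27 entry, while 0x1200 falls through to the G25 entry.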
[RFC 1/4] HID:hid-logitech: Add modparam to allow/disable switch to native mode

2014-08-20 Thread Simon Wood
---
 drivers/hid/hid-lg.c| 17 -
 drivers/hid/hid-lg.h|  7 +--
 drivers/hid/hid-lg4ff.c |  4 ++--
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/hid/hid-lg.c b/drivers/hid/hid-lg.c
index a976f48..81ba24d 100644
--- a/drivers/hid/hid-lg.c
+++ b/drivers/hid/hid-lg.c
@@ -334,6 +334,16 @@ static __u8 momo2_rdesc_fixed[] = {
 };
 
 /*
+ * Certain Logitech wheels provide various compatibility modes
+ * for games that cannot handle their advanced features properly.
+ * This switch forces the wheel into a specific compatibility mode
+ * instead of its native mode.
+ */
+#ifdef CONFIG_LOGIWHEELS_FF
+static int lg4ff_switch_mode = LG4FF_MSW_NAT;  /* Default to native mode */
+#endif
+
+/*
  * Certain Logitech keyboards send in report #3 keys which are far
  * above the logical maximum described in descriptor. This extends
  * the original value of 0x28c of logical maximum to 0x104d
@@ -717,7 +727,7 @@ static int lg_probe(struct hid_device *hdev, const struct 
hid_device_id *id)
if (drv_data->quirks & LG_FF3)
lg3ff_init(hdev);
if (drv_data->quirks & LG_FF4)
-   lg4ff_init(hdev);
+   lg4ff_init(hdev, lg4ff_switch_mode);
 
return 0;
 err_free:
@@ -818,4 +828,9 @@ static struct hid_driver lg_driver = {
 };
 module_hid_driver(lg_driver);
 
+#ifdef CONFIG_LOGIWHEELS_FF
+module_param_named(lg4ff_switch_mode, lg4ff_switch_mode, int, S_IRUGO);
+MODULE_PARM_DESC(lg4ff_switch_mode, "Enable switch from compatibility mode to 
native mode (only certain devices)");
+#endif
+
 MODULE_LICENSE("GPL");
diff --git a/drivers/hid/hid-lg.h b/drivers/hid/hid-lg.h
index 142ce3f..fc4bdae 100644
--- a/drivers/hid/hid-lg.h
+++ b/drivers/hid/hid-lg.h
@@ -25,14 +25,17 @@ static inline int lg3ff_init(struct hid_device *hdev) { 
return -1; }
 #endif
 
 #ifdef CONFIG_LOGIWHEELS_FF
+#define LG4FF_MSW_NAT -1   /* allow native mode */
+#define LG4FF_MSW_EMU 0/* remain in or force emulation mode */
+
 int lg4ff_adjust_input_event(struct hid_device *hid, struct hid_field *field,
 struct hid_usage *usage, __s32 value, struct 
lg_drv_data *drv_data);
-int lg4ff_init(struct hid_device *hdev);
+int lg4ff_init(struct hid_device *hdev, const int switch_mode);
 int lg4ff_deinit(struct hid_device *hdev);
 #else
 static inline int lg4ff_adjust_input_event(struct hid_device *hid, struct 
hid_field *field,
   struct hid_usage *usage, __s32 
value, struct lg_drv_data *drv_data) { return 0; }
-static inline int lg4ff_init(struct hid_device *hdev) { return -1; }
+static inline int lg4ff_init(struct hid_device *hdev, const int switch_mode) { 
return -1; }
 static inline int lg4ff_deinit(struct hid_device *hdev) { return -1; }
 #endif
 
diff --git a/drivers/hid/hid-lg4ff.c b/drivers/hid/hid-lg4ff.c
index cc2bd20..9247227 100644
--- a/drivers/hid/hid-lg4ff.c
+++ b/drivers/hid/hid-lg4ff.c
@@ -556,7 +556,7 @@ static enum led_brightness lg4ff_led_get_brightness(struct 
led_classdev *led_cde
 }
 #endif
 
-int lg4ff_init(struct hid_device *hid)
+int lg4ff_init(struct hid_device *hid, const int switch_mode)
 {
struct hid_input *hidinput = list_entry(hid->inputs.next, struct 
hid_input, list);
struct input_dev *dev = hidinput->input;
@@ -594,7 +594,7 @@ int lg4ff_init(struct hid_device *hid)
rev_maj = bcdDevice >> 8;
rev_min = bcdDevice & 0xff;
 
-   if (lg4ff_devices[i].product_id == USB_DEVICE_ID_LOGITECH_WHEEL) {
+   if (lg4ff_devices[i].product_id == USB_DEVICE_ID_LOGITECH_WHEEL && 
switch_mode != LG4FF_MSW_EMU) {
dbg_hid("Generic wheel detected, can it do native?\n");
dbg_hid("USB revision: %2x.%02x\n", rev_maj, rev_min);
 
-- 
1.9.1



Re: [GIT PULL] namespace updates for v3.17-rc1

2014-08-20 Thread Eric W. Biederman
Richard Weinberger  writes:

> On Wed, Aug 6, 2014 at 2:57 AM, Eric W. Biederman  
> wrote:
>
> This commit breaks libvirt-lxc.
> libvirt does in lxcContainerMountBasicFS():

The bugs fixed are security issues, so if we have to break a small
number of userspace applications we will.  Anything that we can
reasonably do to avoid regressions will be done.

Could you please look at my user-namespace.git#for-next branch? I have a
fix for at least one regression-causing issue in there.  I think it may
fix your issues, but I am not fully certain; more comments below.

> /*
>  * We can't immediately set the MS_RDONLY flag when mounting 
> filesystems
>  * because (in at least some kernel versions) this will propagate back
>  * to the original mount in the host OS, turning it readonly too. Thus
>  * we mount the filesystem in read-write mode initially, and then do a
>  * separate read-only bind mount on top of that.
>  */
> bindOverReadonly = !!(mnt_mflags & MS_RDONLY);
>
> VIR_DEBUG("Mount %s on %s type=%s flags=%x",
>   mnt_src, mnt->dst, mnt->type, mnt_mflags & ~MS_RDONLY);
> if (mount(mnt_src, mnt->dst, mnt->type, mnt_mflags &
> ~MS_RDONLY, NULL) < 0) {
>
>  Here it fails for sysfs because with user namespaces we bind the
> existing /sys into the container
> and would have to read out all existing mount flags from the current /sys 
> mount.
> Otherwise mount() fails with EPERM.
> On my test system /sys is mounted with
> "rw,nosuid,nodev,noexec,relatime" and libvirt
> misses the relatime...

Not specifying any atime flags to mount should be safe, as that asks for
the default atime flags, and for remount I have made the default atime
flags the existing atime flags.  So I am scratching my head a little on
this one.

>
> virReportSystemError(errno,
>  _("Failed to mount %s on %s type %s 
> flags=%x"),
>  mnt_src, mnt->dst, NULLSTR(mnt->type),
>  mnt_mflags & ~MS_RDONLY);
> goto cleanup;
> }
>
> if (bindOverReadonly &&
> mount(mnt_src, mnt->dst, NULL,
>   MS_BIND|MS_REMOUNT|MS_RDONLY, NULL) < 0) {
>
> ^^^ Here it fails because now we'd have to specify all flags as used
> for the first
> mount. For the procfs case MS_NOSUID|MS_NOEXEC|MS_NODEV.
> See lxcBasicMounts[].
> In this case the fix is easy, add mnt_mflags to the mount flags.

That has always been a bug in general because remount has always
required specifying the complete set of mount flags you want to have.

The fact that flags such as nosuid are now properly locked, so you
cannot change them if you are not the global root user, just makes this
obvious.

Andy Lutomirski has observed that statvfs will return the mount flags,
making reading them simple.

Eric


Re: PATCH hid: Implement mode switching on Logitech gaming wheels accordingly to the documentation

2014-08-20 Thread simon

> Whilst it is my intention to submit them, I might not achieve the 'very
> soon' part; the earliest, I think, would be mid next week, as they still
> need a little tweaking.

I've sent in patches as 'RFC' as I think they still need a little more
testing and I'm tied up with work stuff for the next week or so (and won't
have access to the hardware).

If people can test/comment that would be great,
Simon



Re: [PATCH v7 02/11] power/restart: Call machine_restart instead of arm_pm_restart

2014-08-20 Thread Guenter Roeck
On Wed, Aug 20, 2014 at 09:10:31PM -0700, Doug Anderson wrote:
> Guenter,
> 
> On Tue, Aug 19, 2014 at 5:45 PM, Guenter Roeck  wrote:
> > machine_restart is supported on non-ARM platforms, and ultimately calls
> > arm_pm_restart, so don't call arm_pm_restart directly but use the more
> > generic function.
> >
> > Cc: Russell King 
> 
> Do you need to submit this to his patch tracker to get him to pick it
> up?  How are you envisioning that this series land?  It crosses a lot
> of boundaries so I guess will need a reasonable amount of coordination
> between maintainers...
> 
> 
If I get an Acked-by: from all maintainers, I could send a pull request
to Linus directly. How do I send a patch to Russell's patch tracker?
I thought I copied all mailing lists suggested by get_maintainer.pl,
but maybe I missed one.

Thanks,
Guenter


Re: [PATCH] powerpc/pseries: Drop unnecessary continue

2014-08-20 Thread Michael Ellerman
On Wed, 2014-08-13 at 14:48 +0530, Himangi Saraogi wrote:
> Continue is not needed at the bottom of a loop.
 
True.

I wonder though, is the code trying to continue to the outer loop? I stared at
it for a minute but it wasn't obvious.
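For reference, a small stand-alone illustration of the C semantics in question (this is not the cmm.c code; function names are invented for the example). A `continue` that is the last statement of an inner loop body only re-tests the inner loop, so the statement after the inner loop still runs. Skipping that statement, i.e. continuing the outer loop, needs something like a `goto` or a flag:

```c
/* Counts how often the statement after the inner loop executes. */
int advance_count_plain_continue(int outer, int inner)
{
	int advanced = 0;

	for (int i = 0; i < outer; i++) {
		for (int j = 0; j < inner; j++) {
			if (j == inner - 1) {
				/* Last statement of the inner body: no effect. */
				continue;
			}
		}
		advanced++;	/* reached once per outer iteration */
	}
	return advanced;
}

/* Same shape, but the inner loop really does continue the outer loop. */
int advance_count_outer_continue(int outer, int inner)
{
	int advanced = 0;

	for (int i = 0; i < outer; i++) {
		for (int j = 0; j < inner; j++) {
			if (j == inner - 1)
				goto next_outer;	/* skips advanced++ below */
		}
		advanced++;
next_outer:
		;
	}
	return advanced;
}
```

So dropping the trailing `continue` is behaviour-preserving; whether the original author meant the second form is a separate question.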

I wonder if Robert still remembers?

cheers

> diff --git a/arch/powerpc/platforms/pseries/cmm.c 
> b/arch/powerpc/platforms/pseries/cmm.c
> index 2d8bf15..fc44ad0 100644
> --- a/arch/powerpc/platforms/pseries/cmm.c
> +++ b/arch/powerpc/platforms/pseries/cmm.c
> @@ -555,7 +555,6 @@ static int cmm_mem_going_offline(void *arg)
>   pa_last = pa_last->next;
>   free_page((unsigned long)cmm_page_list);
>   cmm_page_list = pa_last;
> - continue;
>   }
>   }
>   pa_curr = pa_curr->next;




Re: [PATCH 0/7] usb-phy: samsung: Cleanup the unused drivers

2014-08-20 Thread Vivek Gautam
Hi Felipe,


On Wed, Aug 20, 2014 at 11:44 PM, Felipe Balbi  wrote:
> Hi,
>
> On Thu, Aug 14, 2014 at 07:53:53PM +0530, Vivek Gautam wrote:
>> - This series is based on 'usb-next' branch.
>>
>> Now that we have support for USB PHY controllers for Exynos SoC series,
>> we are free to remove the older usb-phy support.
>> In the process, we are removing the entire phy-samsung-usb3 driver, and
>> besides that all support for Exynos4x and Exynos5 from phy-samsung-usb2
>> driver.
>>
>> We have also removed the older USB-PHY support from ehci-exynos and 
>> ohci-exynos
>> since those drivers now can use newer GENERIC-PHYs, and don't need the older
>> USB-PHYs and related code anymore. These patches for ohci-exynos and 
>> ehci-exynos
>> now replaces the older sent patch for phy sequence cleanup for them.[1]
>>
>> I have build-tested this series for exynos_defconfig and s3c64xx_defconfig,
>> and have tested the EHCI and OHCI operations on smdk5250 board, but
>> have not tested the actual working due to unavailability of S3C64XX
>> board. So can someone please help me testing this series.
>>
>> [1] https://lkml.org/lkml/2014/8/5/142
>>
>> Vivek Gautam (7):
>>   usb-phy: samsung-usb3: Remove older phy-samsung-usb3 driver
>>   usb-phy: samsung-usb2: Remove support for Exynos5250
>>   usb-phy: samsung-usb2: Remove support for Exynos4X12
>>   usb-phy: samsung-usb2: Remove support for Exynos4210
>>   usb-phy: samsung-usb2: Clean up to leave only S3C64XX support
>>   usb: host: ehci-exynos: Remove unnecessary usb-phy support
>>   usb: host: ohci-exynos: Remove unnecessary usb-phy support
>
> some of these patches are still RFC, can you resend without RFC and all
> proper Acks in place ? Also rebased on top of v3.17-rc1.

As per the discussion in thread [1], I am dropping [PATCH 1/7] through
[PATCH 5/7], since these are covered by Bart's cleanup series
"[PATCH 0/5] usb: phy: samsung: remove old USB PHY code".
I will send out the remaining two patches, rebased, with Alan's Ack.


[1] https://lkml.org/lkml/2014/8/21/11



-- 
Best Regards
Vivek Gautam
Samsung R&D Institute, Bangalore
India


Re: [PATCH 0/5] usb: phy: samsung: remove old USB PHY code

2014-08-20 Thread Vivek Gautam
On Thu, Aug 14, 2014 at 7:55 PM, Bartlomiej Zolnierkiewicz
 wrote:
> Hi,
>
> This patch series removes the old Samsung USB PHY drivers that
> got replaced by the new ones using the generic PHY layer.
>
> Depends on:
> - next-20140813 branch of linux-next kernel
>
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
>
>
> Bartlomiej Zolnierkiewicz (5):
>   ARM: dts: remove old USB2 PHY node hook for Arndale
>   ARM: dts: remove old USB2 PHY node for Exynos5250
>   usb: phy: samsung: remove old USB 2.0 PHY driver
>   usb: phy: samsung: remove old USB 3.0 PHY driver
>   usb: phy: samsung: remove old common USB PHY code

Reviewed-by: Vivek Gautam 



-- 
Best Regards
Vivek Gautam
Samsung R&D Institute, Bangalore
India


Re: [PATCH v3 4/4] zram: report maximum used memory

2014-08-20 Thread Minchan Kim
On Wed, Aug 20, 2014 at 11:03:06PM -0400, David Horner wrote:
> On Wed, Aug 20, 2014 at 10:41 PM, Minchan Kim  wrote:
> > On Wed, Aug 20, 2014 at 10:20:07PM -0400, David Horner wrote:
> >> On Wed, Aug 20, 2014 at 8:27 PM, Minchan Kim  wrote:
> >> > Normally, zram user could get maximum memory usage zram consumed
> >> > via polling mem_used_total with sysfs in userspace.
> >> >
> >> > But it has a critical problem because user can miss peak memory
> >> > usage during update interval of polling. To avoid that,
> >> > user should poll it with a shorter interval (i.e., 0.01s)
> >> > with mlocking to avoid page fault delay when memory pressure
> >> > is heavy. It would be troublesome.
> >> >
> >> > This patch adds new knob "mem_used_max" so user could see
> >> > the maximum memory usage easily via reading the knob and reset
> >> > it via "echo 0 > /sys/block/zram0/mem_used_max".
> >> >
> >> > Signed-off-by: Minchan Kim 
> >> > ---
> >> >  Documentation/ABI/testing/sysfs-block-zram | 10 ++
> >> >  Documentation/blockdev/zram.txt|  1 +
> >> >  drivers/block/zram/zram_drv.c  | 57 
> >> > --
> >> >  drivers/block/zram/zram_drv.h  |  1 +
> >> >  4 files changed, 67 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/Documentation/ABI/testing/sysfs-block-zram 
> >> > b/Documentation/ABI/testing/sysfs-block-zram
> >> > index 025331c19045..ffd1ea7443dd 100644
> >> > --- a/Documentation/ABI/testing/sysfs-block-zram
> >> > +++ b/Documentation/ABI/testing/sysfs-block-zram
> >> > @@ -120,6 +120,16 @@ Description:
> >> > statistic.
> >> > Unit: bytes
> >> >
> >> > +What:  /sys/block/zram/mem_used_max
> >> > +Date:  August 2014
> >> > +Contact:   Minchan Kim 
> >> > +Description:
> >> > +   The mem_used_max file is read/write and specifies the 
> >> > amount
> >> > +   of maximum memory zram have consumed to store compressed 
> >> > data.
> >> > +   For resetting the value, you should do "echo 0". 
> >> > Otherwise,
> >> > +   you could see -EINVAL.
> >> > +   Unit: bytes
> >> > +
> >> >  What:  /sys/block/zram/mem_limit
> >> >  Date:  August 2014
> >> >  Contact:   Minchan Kim 
> >> > diff --git a/Documentation/blockdev/zram.txt 
> >> > b/Documentation/blockdev/zram.txt
> >> > index 9f239ff8c444..3b2247c2d4cf 100644
> >> > --- a/Documentation/blockdev/zram.txt
> >> > +++ b/Documentation/blockdev/zram.txt
> >> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is 
> >> > wasteful.
> >> > orig_data_size
> >> > compr_data_size
> >> > mem_used_total
> >> > +   mem_used_max
> >> >
> >> >  8) Deactivate:
> >> > swapoff /dev/zram0
> >> > diff --git a/drivers/block/zram/zram_drv.c 
> >> > b/drivers/block/zram/zram_drv.c
> >> > index adc91c7ecaef..138787579478 100644
> >> > --- a/drivers/block/zram/zram_drv.c
> >> > +++ b/drivers/block/zram/zram_drv.c
> >> > @@ -149,6 +149,41 @@ static ssize_t mem_limit_store(struct device *dev,
> >> > return len;
> >> >  }
> >> >
> >> > +static ssize_t mem_used_max_show(struct device *dev,
> >> > +   struct device_attribute *attr, char *buf)
> >> > +{
> >> > +   u64 val = 0;
> >> > +   struct zram *zram = dev_to_zram(dev);
> >> > +
> >> > +   down_read(&zram->init_lock);
> >> > +   if (init_done(zram))
> >> > +   val = atomic64_read(&zram->stats.max_used_pages);
> >> > +   up_read(&zram->init_lock);
> >> > +
> >> > +   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> >> > +}
> >> > +
> >> > +static ssize_t mem_used_max_store(struct device *dev,
> >> > +   struct device_attribute *attr, const char *buf, size_t 
> >> > len)
> >> > +{
> >> > +   int err;
> >> > +   unsigned long val;
> >> > +   struct zram *zram = dev_to_zram(dev);
> >> > +   struct zram_meta *meta = zram->meta;
> >> > +
> >> > +   err = kstrtoul(buf, 10, &val);
> >> > +   if (err || val != 0)
> >> > +   return -EINVAL;
> >> > +
> >>
> >> Yes - this works better for the user than explicit single "0" check
> >> Thanks for testing.
> >>
> >> > +   down_read(&zram->init_lock);
> >> > +   if (init_done(zram))
> >> > +   atomic64_set(&zram->stats.max_used_pages,
> >> > +   zs_get_total_size(meta->mem_pool));
> >> > +   up_read(&zram->init_lock);
> >> > +
> >> > +   return len;
> >> > +}
> >> > +
> >> >  static ssize_t max_comp_streams_store(struct device *dev,
> >> > struct device_attribute *attr, const char *buf, size_t 
> >> > len)
> >> >  {
> >> > @@ -461,6 +496,18 @@ out_cleanup:
> >> > return ret;
> >> >  }
> >> >
> >> > +static inline void update_used_max(struct zram *zram, const unsigned 
> >> > long pages)
> >> > +{
> >> > +   u64 old_max, cur_max;
> >> > +
> >> > +   do {

Re: [RFC 2/4] tuntap: Publish tuntap maximum number of queues as module_param

2014-08-20 Thread Jason Wang
On 08/20/2014 07:17 PM, Michael S. Tsirkin wrote:
> On Wed, Aug 20, 2014 at 12:58:17PM +0200, Jiri Pirko wrote:
>> > Mon, Aug 18, 2014 at 03:37:18PM CEST, pagu...@redhat.com wrote:
>>> > > This patch publishes maximum number of tun/tap queues allocated as a
>>> > > read_only module parameter which a user space application like libvirt
>>> > > can make use of to limit maximum number of queues. Value of read_only
>>> > > module parameter can be writable only at module load time. If no value 
>>> > > is set
>>> > > at module load time a default value 256 is used which is equal to 
>>> > > maximum number
>>> > > of vCPUS allowed by KVM.
>>> > >
>>> > > Administrator can specify maximum number of queues only at the driver
>>> > > module load time.
>>> > >
>>> > >Signed-off-by: Pankaj Gupta 
>>> > >---
>>> > > drivers/net/tun.c |   13 +++--
>>> > > 1 files changed, 11 insertions(+), 2 deletions(-)
>>> > >
>>> > >diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>> > >index acaaf67..1f518e2 100644
>>> > >--- a/drivers/net/tun.c
>>> > >+++ b/drivers/net/tun.c
>>> > >@@ -119,6 +119,9 @@ struct tap_filter {
>>> > > 
>>> > > #define TUN_FLOW_EXPIRE (3 * HZ)
>>> > > 
>>> > >+static int max_tap_queues = MAX_TAP_QUEUES;
>>> > >+module_param(max_tap_queues, int, S_IRUGO);
>> > 
>> > Please do not introduce new module parameters. Please use other ways to
>> > interchange values with userspace.
> I suggested this initially, but thinking more about it, I agree.
>
> It's a global limit (necessary to limit memory utilization by
> userspace), but it should be possible to change it
> after module load.

How about passing this limit through ifr during TUNSETIFF? Then
alloc_netdev_mq() can use this limit.


Re: [PATCH 3/5] usb: phy: samsung: remove old USB 2.0 PHY driver

2014-08-20 Thread Vivek Gautam
Hi Tomasz and Bartlomiej,


On Mon, Aug 18, 2014 at 4:52 PM, Tomasz Figa  wrote:
> On 18.08.2014 13:02, Bartlomiej Zolnierkiewicz wrote:
>> On Thursday, August 14, 2014 08:07:40 PM Vivek Gautam wrote:
>>> On Thursday, August 14, 2014 7:55 PM, Bartlomiej Zolnierkiewicz
>>>  wrote
>>
>>> There's one thing that I would want to comment here, since we don't have any
>>> new usb-phy driver for S3C64XX,
>>> so we can't simply remove this entire driver.
>>> I have posted my patch-series [1], which does cleanup while keeping the
>>> support for S3C64XX.
>>
>> AFAIK S3C64XX code from drivers/usb/phy/phy-samsung-usb2.c has
>> never been used as this platform still uses its own code from
>> arch/arm/mach-s3c64xx/setup-usb-phy.c (there are no users in
>> the kernel tree of either s3c64xx-usb2phy platform device or
>> "samsung,s3c64xx-usb2phy" DT compatible) .  Therefore I think
>> that the entire drivers/usb/phy/phy-samsung-usb2.c driver
>> should be removed (somebody with the hardware can as well add
>> S3C64XX support to the new drivers/phy/phy-samsung-usb2.c
>> driver and port the platform to use it).
>>
>
> I agree with removal of this driver. As Bart said, it is not used for
> S3C64xx at all. The platform was supposed to be moved to this driver,
> but that never happened. In fact, I already have a patch adding support
> for S3C64xx to the new driver.

Cool then, let's remove this driver completely and use the new generic
PHY based driver
once that comes (from Tomasz).

I shall drop the patches for cleaning up the usb-phy drivers from my series.


-- 
Best Regards
Vivek Gautam
Samsung R&D Institute, Bangalore
India


Re: [PATCH] pinctrl: spear: Make of_device_id array const

2014-08-20 Thread Viresh Kumar
On Wed, Aug 20, 2014 at 7:56 PM, Kiran Padwal
 wrote:
> Make of_device_id array const, because all OF functions handle it as const.
>
> Signed-off-by: Kiran Padwal 
> ---
>  drivers/pinctrl/spear/pinctrl-spear1310.c |2 +-
>  drivers/pinctrl/spear/pinctrl-spear1340.c |2 +-
>  drivers/pinctrl/spear/pinctrl-spear300.c  |2 +-
>  drivers/pinctrl/spear/pinctrl-spear310.c  |2 +-
>  drivers/pinctrl/spear/pinctrl-spear320.c  |2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pinctrl/spear/pinctrl-spear1310.c 
> b/drivers/pinctrl/spear/pinctrl-spear1310.c
> index 1a8bbfe..6d57d43 100644
> --- a/drivers/pinctrl/spear/pinctrl-spear1310.c
> +++ b/drivers/pinctrl/spear/pinctrl-spear1310.c
> @@ -2692,7 +2692,7 @@ static struct spear_pinctrl_machdata spear1310_machdata 
> = {
> .modes_supported = false,
>  };
>
> -static struct of_device_id spear1310_pinctrl_of_match[] = {
> +static const struct of_device_id spear1310_pinctrl_of_match[] = {
> {
> .compatible = "st,spear1310-pinmux",
> },
> diff --git a/drivers/pinctrl/spear/pinctrl-spear1340.c 
> b/drivers/pinctrl/spear/pinctrl-spear1340.c
> index 873966e..d243e43 100644
> --- a/drivers/pinctrl/spear/pinctrl-spear1340.c
> +++ b/drivers/pinctrl/spear/pinctrl-spear1340.c
> @@ -2008,7 +2008,7 @@ static struct spear_pinctrl_machdata spear1340_machdata 
> = {
> .modes_supported = false,
>  };
>
> -static struct of_device_id spear1340_pinctrl_of_match[] = {
> +static const struct of_device_id spear1340_pinctrl_of_match[] = {
> {
> .compatible = "st,spear1340-pinmux",
> },
> diff --git a/drivers/pinctrl/spear/pinctrl-spear300.c 
> b/drivers/pinctrl/spear/pinctrl-spear300.c
> index 4777c0d..9db83e9 100644
> --- a/drivers/pinctrl/spear/pinctrl-spear300.c
> +++ b/drivers/pinctrl/spear/pinctrl-spear300.c
> @@ -646,7 +646,7 @@ static struct spear_function *spear300_functions[] = {
> _function,
>  };
>
> -static struct of_device_id spear300_pinctrl_of_match[] = {
> +static const struct of_device_id spear300_pinctrl_of_match[] = {
> {
> .compatible = "st,spear300-pinmux",
> },
> diff --git a/drivers/pinctrl/spear/pinctrl-spear310.c 
> b/drivers/pinctrl/spear/pinctrl-spear310.c
> index ed1d360..db775a4 100644
> --- a/drivers/pinctrl/spear/pinctrl-spear310.c
> +++ b/drivers/pinctrl/spear/pinctrl-spear310.c
> @@ -371,7 +371,7 @@ static struct spear_function *spear310_functions[] = {
> _function,
>  };
>
> -static struct of_device_id spear310_pinctrl_of_match[] = {
> +static const struct of_device_id spear310_pinctrl_of_match[] = {
> {
> .compatible = "st,spear310-pinmux",
> },
> diff --git a/drivers/pinctrl/spear/pinctrl-spear320.c 
> b/drivers/pinctrl/spear/pinctrl-spear320.c
> index b8e290a..80fbd68 100644
> --- a/drivers/pinctrl/spear/pinctrl-spear320.c
> +++ b/drivers/pinctrl/spear/pinctrl-spear320.c
> @@ -3410,7 +3410,7 @@ static struct spear_function *spear320_functions[] = {
> _function,
>  };
>
> -static struct of_device_id spear320_pinctrl_of_match[] = {
> +static const struct of_device_id spear320_pinctrl_of_match[] = {
> {
> .compatible = "st,spear320-pinmux",
> },

Acked-by: Viresh Kumar 


Re: Handling commit change logs

2014-08-20 Thread Viresh Kumar
On Thu, Aug 21, 2014 at 2:00 AM, Stephen Warren  wrote:
> On 08/20/2014 02:02 PM, Andreas Färber wrote:
>> Am 20.08.2014 17:39, schrieb Javier Martinez Canillas:
>>> If this is not the correct workflow and you have a better way to manage
>>> this, I would love to know about it.

Oh yes, this was certainly useful. I was missing this (mis)feature :)


Re: [PATCH v7 11/11] clk: rockchip: add restart handler

2014-08-20 Thread Doug Anderson
Guenter / Heiko,

On Tue, Aug 19, 2014 at 5:45 PM, Guenter Roeck  wrote:
> From: Heiko Stübner 
>
> Add infrastructure to write the correct value to the restart register and
> register the restart notifier for both rk3188 (including rk3066) and rk3288.
>
> Signed-off-by: Heiko Stuebner 
> Signed-off-by: Guenter Roeck 
> ---
> v7: Added patch to series.
>
>  drivers/clk/rockchip/clk-rk3188.c |  2 ++
>  drivers/clk/rockchip/clk-rk3288.c |  2 ++
>  drivers/clk/rockchip/clk.c| 25 +
>  drivers/clk/rockchip/clk.h|  1 +
>  4 files changed, 30 insertions(+)

This patch doesn't apply cleanly with the in-flight (clk: rockchip:
protect critical clocks from getting disabled) patch from Heiko.  It's
trivial to resolve and unclear which will land first, so I think it's
fine...

Reviewed-by: Doug Anderson 
Tested-by: Doug Anderson 

(FYI: all patches tested by me were tested on rk3288-evb)


Re: [PATCH v7 03/11] arm64: Support restart through restart handler call chain

2014-08-20 Thread Doug Anderson
Guenter,

On Tue, Aug 19, 2014 at 5:45 PM, Guenter Roeck  wrote:
> The kernel core now supports a restart handler call chain to restart
> the system. Call it if arm_pm_restart is not set.
>
> Signed-off-by: Guenter Roeck 
> Acked-by: Catalin Marinas 
> Acked-by: Heiko Stuebner 
> ---
> v7: No change.
> v6: No change.
> v5: Renamed restart function to do_kernel_restart
> v4: No change.
> v3: Use wrapper function to execute notifier call chain.
> v2: Only call notifier call chain if arm_pm_restart is not set.
> Do not include linux/watchdog.h.
>
>  arch/arm64/kernel/process.c | 2 ++
>  1 file changed, 2 insertions(+)

Reviewed-by: Doug Anderson 


Re: [PATCH v7 08/11] arm/arm64: Unexport restart handlers

2014-08-20 Thread Doug Anderson
Guenter,

On Tue, Aug 19, 2014 at 5:45 PM, Guenter Roeck  wrote:
> Implementing a restart handler in a module doesn't make sense
> as there would be no guarantee that the module is loaded when
> a restart is needed. Unexport arm_pm_restart to ensure that
> no one gets the idea to do it anyway.
>
> Signed-off-by: Guenter Roeck 
> Acked-by: Catalin Marinas 
> Acked-by: Heiko Stuebner 
> ---
> v7: No change
> v6: No change
> v5: No change
> v4: No change
> v3: No change
> v2: No change
>
>  arch/arm/kernel/process.c   | 1 -
>  arch/arm64/kernel/process.c | 1 -
>  2 files changed, 2 deletions(-)

Reviewed-by: Doug Anderson 
Tested-by: Doug Anderson 

(FYI: all patches tested by me were tested on rk3288-evb)


Re: [PATCH v7 02/11] power/restart: Call machine_restart instead of arm_pm_restart

2014-08-20 Thread Doug Anderson
Guenter,

On Tue, Aug 19, 2014 at 5:45 PM, Guenter Roeck  wrote:
> machine_restart is supported on non-ARM platforms, and ultimately calls
> arm_pm_restart, so don't call arm_pm_restart directly but use the more
> generic function.
>
> Cc: Russell King 

Do you need to submit this to his patch tracker to get him to pick it
up?  How are you envisioning that this series land?  It crosses a lot
of boundaries so I guess will need a reasonable amount of coordination
between maintainers...


> Signed-off-by: Guenter Roeck 
> Acked-by: Catalin Marinas 
> Acked-by: Heiko Stuebner 
> ---
> v7: No change.
> v6: No change.
> v5: No change.
> v4: No change.
> v3: No change.
> v2: Added patch.
>
>  drivers/power/reset/restart-poweroff.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Doug Anderson 
Tested-by: Doug Anderson 


Re: [PATCH v7 04/11] arm: Support restart through restart handler call chain

2014-08-20 Thread Doug Anderson
Guenter,

On Tue, Aug 19, 2014 at 5:45 PM, Guenter Roeck  wrote:
> The kernel core now supports a restart handler call chain for system
> restart functions.
>
> With this change, the arm_pm_restart callback is now optional, so
> drop its initialization and check if it is set before calling it.
> Only call the kernel restart handler if arm_pm_restart is not set.
>
> Signed-off-by: Guenter Roeck 
> Acked-by: Catalin Marinas 
> Acked-by: Heiko Stuebner 
> ---
> v7: Dropped null_restart and made arm_pm_restart truly optional.
> v6: No change.
> v5: Renamed restart function to do_kernel_restart
> v4: No change.
> v3: Use wrapper function to execute notifier call chain.
> v2: Only call notifier call chain if arm_pm_restart is not set.
> Do not include linux/watchdog.h.
>
>  arch/arm/kernel/process.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)

Reviewed-by: Doug Anderson 
Tested-by: Doug Anderson 


Re: [PATCH v7 01/11] kernel: Add support for kernel restart handler call chain

2014-08-20 Thread Doug Anderson
Guenter,

On Tue, Aug 19, 2014 at 5:45 PM, Guenter Roeck  wrote:
> Various drivers implement architecture and/or device specific means
> to restart (reset) the system. Various mechanisms have been implemented
> to support those schemes. The best known mechanism is arm_pm_restart,
> which is a function pointer to be set either from platform specific code
> or from drivers. Another mechanism is to use hardware watchdogs to issue
> a reset; this mechanism is used if there is no other method available
> to reset a board or system. Two examples are alim7101_wdt, which currently
> uses the reboot notifier to trigger a reset, and moxart_wdt, which registers
> the arm_pm_restart function.
>
> The existing mechanisms have a number of drawbacks. Typically only one scheme
> to restart the system is supported (at least if arm_pm_restart is used).
> At least in theory there can be multiple means to restart the system, some of
> which may be less desirable (for example one mechanism may only reset the CPU,
> while another may reset the entire system). Using arm_pm_restart can also be
> racy if the function pointer is set from a driver, as the driver may be in
> the process of being unloaded when arm_pm_restart is called.
> Using the reboot notifier is always racy, as it is unknown if and when
> other functions using the reboot notifier have completed execution
> by the time the watchdog fires.
>
> Introduce a system restart handler call chain to solve the described problems.
> This call chain is expected to be executed from the architecture specific
> machine_restart() function. Drivers providing system restart functionality
> (such as the watchdog drivers mentioned above) are expected to register
> with this call chain. By using the priority field in the notifier block,
> callers can control restart handler execution sequence and thus ensure that
> the restart handler with the optimal restart capabilities for a given system
> is called first.
>
> Signed-off-by: Guenter Roeck 
> Acked-by: Catalin Marinas 
> Acked-by: Heiko Stuebner 
> ---
> v7: Rebased to v3.17-rc1
> v6: Use atomic notifier call chain
> v5: Function renames:
> register_restart_notifier -> register_restart_handler
> unregister_restart_notifier -> unregister_restart_handler
> kernel_restart_notify -> do_kernel_restart
> v4: Document and suggest values for notifier priorities
> v3: Add kernel_restart_notify wrapper function to execute notifier.
> Improve documentation.
> Move restart_notifier_list into kernel/reboot.c and make it static.
> v2: No change.
>
>  include/linux/reboot.h |  3 ++
>  kernel/reboot.c        | 81 ++
>  2 files changed, 84 insertions(+)

Reviewed-by: Doug Anderson 
Tested-by: Doug Anderson 
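
Purely as an illustration of the ordering described in the commit message, here is a small user-space model of a priority-ordered handler chain. It is a sketch, not the code from the patch: struct restart_handler, register_handler(), run_restart_chain() and the two example handlers are names made up for this example, and a handler "stops the chain" by returning nonzero, whereas a real restart handler that succeeds simply never returns.

```c
#include <stddef.h>

/* Hypothetical stand-in for a notifier block with a priority field. */
struct restart_handler {
	int priority;                 /* higher priority runs first */
	int (*call)(void *data);      /* nonzero return stops the chain */
	struct restart_handler *next;
};

static struct restart_handler *restart_chain;

/* Insert keeping the list sorted by descending priority; a handler with
 * the same priority as an existing one goes after it. */
static void register_handler(struct restart_handler *h)
{
	struct restart_handler **pos = &restart_chain;

	while (*pos && (*pos)->priority >= h->priority)
		pos = &(*pos)->next;
	h->next = *pos;
	*pos = h;
}

/* Walk the chain in priority order until one handler claims the restart. */
static void run_restart_chain(void *data)
{
	for (struct restart_handler *h = restart_chain; h; h = h->next)
		if (h->call(data))
			break;
}

/* Example handlers that only record the order in which they ran. */
static int call_log[4];
static int call_count;

static int watchdog_reset(void *data)
{
	(void)data;
	call_log[call_count++] = 1;   /* last resort, may reset only the CPU */
	return 0;
}

static int soc_reset(void *data)
{
	(void)data;
	call_log[call_count++] = 2;   /* preferred, resets the entire system */
	return 1;
}
```

With the watchdog registered at priority 0 and the SoC handler at 128, the SoC handler runs first regardless of registration order, which is the point of using the priority field.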


Re: amd_mce.c redundant if check?

2014-08-20 Thread Chip

On Wed, Aug 20, 2014 at 11:18:21AM -0600, Adam Duskett wrote:


I have recently come upon this section of code in
arch/x86/kernel/cpu/mcheck/mce_amd.c that seems to be a redundant
unnecessary if check.


From line 170 - 176:

if (tr->set_lvt_off) {
	if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
		/* set new lvt offset */
		hi &= ~MASK_LVTOFF_HI;
		hi |= tr->lvt_off << 20;
	}
}


This seems like it's not actually doing anything because it's setting
the same value that the bit-field already has to itself.


I brought this up to Adam the other day, so he posted the question to this 
list today to elicit a response from the original developer(s).  I realize 
the quickest response is to ask the original poster (Adam) to investigate 
further, such as with pen and paper, but that is not a proper response to a 
legitimate question.  Here is the #define that is referenced, and the two 
routines in question.  This is current in kernel version 3.16 in 
arch/x86/kernel/cpu/mcheck/mce_amd.c.


#define MASK_LVTOFF_HI		0x00F00000

static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
{
	int msr = (hi & MASK_LVTOFF_HI) >> 20;

	if (apic < 0) {
		pr_err(FW_BUG "cpu %d, failed to setup threshold interrupt "
		       "for bank %d, block %d (MSR%08X=0x%x%08x)\n", b->cpu,
		       b->bank, b->block, b->address, hi, lo);
		return 0;
	}

	if (apic != msr) {
		pr_err(FW_BUG "cpu %d, invalid threshold interrupt offset %d "
		       "for bank %d, block %d (MSR%08X=0x%x%08x)\n",
		       b->cpu, apic, b->bank, b->block, b->address, hi, lo);
		return 0;
	}

	return 1;
};

/*
* Called via smp_call_function_single(), must be called with correct
* cpu affinity.
*/
static void threshold_restart_bank(void *_tr)
{
	struct thresh_restart *tr = _tr;
	u32 hi, lo;

	rdmsr(tr->b->address, lo, hi);

	if (tr->b->threshold_limit < (hi & THRESHOLD_MAX))
		tr->reset = 1;	/* limit cannot be lower than err count */

	if (tr->reset) {		/* reset err count and overflow bit */
		hi =
		    (hi & ~(MASK_ERR_COUNT_HI | MASK_OVERFLOW_HI)) |
		    (THRESHOLD_MAX - tr->b->threshold_limit);
	} else if (tr->old_limit) {	/* change limit w/o reset */
		int new_count = (hi & THRESHOLD_MAX) +
		    (tr->old_limit - tr->b->threshold_limit);

		hi = (hi & ~MASK_ERR_COUNT_HI) |
		    (new_count & THRESHOLD_MAX);
	}

	/* clear IntType */
	hi &= ~MASK_INT_TYPE_HI;

	if (!tr->b->interrupt_capable)
		goto done;

	if (tr->set_lvt_off) {
		if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
			/* set new lvt offset */
			hi &= ~MASK_LVTOFF_HI;
			hi |= tr->lvt_off << 20;
		}
	}

	if (tr->b->interrupt_enable)
		hi |= INT_TYPE_APIC;

done:

	hi |= MASK_COUNT_EN_HI;
	wrmsr(tr->b->address, lo, hi);
}


If one were to actually analyze the source file from which this snippet 
comes (lines 117 - 185), one would realize the call to lvt_off_valid() is 
given tr->lvt_off as the input "apic" value that is compared to the content 
in "hi" at bit positions 23:20 (MSR bits 55:52); this field is called LVT 
Offset (LVTOFF).  The value for tr->lvt_off is usually from 0 to 4, 
inclusive.  If this value is equal to the LVTOFF value in "hi", then 
lvt_off_valid() returns 1 for true.  If the value for tr->lvt_off differs 
from the LVTOFF value in "hi", then lvt_off_valid() returns 0 for false.


Now, if the return from lvt_off_valid() is false, then nothing is changed in 
"hi".  However, if the return is true, which means the value in tr->lvt_off 
is equal to the LVTOFF value in "hi", then the LVTOFF value in "hi" is 
replaced with the value in tr->lvt_off.  One has to wonder, then, why bother 
actually calling lvt_off_valid() in the first place when the end result is 
that "hi" does not change.  What is the rationale for having the code 
snippet at lines 170 - 176 when that condition check does nothing to change 
"hi"?
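
To make the point concrete, here is a minimal user-space sketch of the same logic. It is not the kernel code: lvt_off_matches() and apply_lvt_off() are names made up for illustration, the pr_err() reporting is left out, and the mask value assumes the LVTOFF field sits at bits 23:20 of "hi" as described above.

```c
#include <stdint.h>

#define MASK_LVTOFF_HI 0x00F00000u	/* LVT offset field, bits 23:20 of "hi" */

/* Simplified stand-in for lvt_off_valid(): returns 1 only when the
 * requested offset is non-negative and equals the field already in "hi". */
static int lvt_off_matches(int apic, uint32_t hi)
{
	int msr = (int)((hi & MASK_LVTOFF_HI) >> 20);

	return apic >= 0 && apic == msr;
}

/* Mirrors the guarded update in threshold_restart_bank(): the field is
 * rewritten only when it already holds lvt_off. */
static uint32_t apply_lvt_off(int lvt_off, uint32_t hi)
{
	if (lvt_off_matches(lvt_off, hi)) {
		hi &= ~MASK_LVTOFF_HI;
		hi |= (uint32_t)lvt_off << 20;
	}
	return hi;
}
```

Under these assumptions apply_lvt_off() returns its "hi" argument unchanged for every input: when the offsets differ nothing is written, and when they match the field is cleared and then set to the value it already held.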


--
Chip 




Re: [PATCH 19/19] Documentation: ACPI for ARM64

2014-08-20 Thread Hanjun Guo
On 2014-8-21 6:17, Olof Johansson wrote:
> On Mon, Aug 18, 2014 at 05:29:26PM +0800, Hanjun Guo wrote:
>> On 2014-8-15 18:01, Catalin Marinas wrote:
>>> Hanjun,
>>
>> Hi Catalin,
>>
>>>
>>> On Fri, Aug 15, 2014 at 10:09:42AM +0100, Hanjun Guo wrote:
>>>> On 2014-8-14 18:27, Catalin Marinas wrote:
>>>>> On Thu, Aug 14, 2014 at 04:21:25AM +0100, Hanjun Guo wrote:
>>>>>> On 2014-8-14 7:41, Rafael J. Wysocki wrote:
>>>>>>> On Tuesday, August 12, 2014 07:23:47 PM Catalin Marinas wrote:
>>>>>>>> If we consider ACPI unusable on ARM but we still want to start merging
>>>>>>>> patches, we should rather make the config option depend on BROKEN
>>>>>>>> (though if it is that unusable that no real platform can use it, I would
>>>>>>>> rather not merge it at all at this stage).
>>>>>>>
>>>>>>> I agree here.
>>>>>>>
>>>>>>> I would recommend creating a separate branch for that living outside of the
>>>>>>> mainline kernel and merging it when there are real users.
>>>>>>
>>>>>> Real users will be coming soon, we already tested this patch set on real
>>>>>> hardware (ARM64 Juno platform),
>>>>>
>>>>> I don't consider Juno a server platform ;) (but it's good enough for
>>>>> development).
>>>>>
>>>>>> and I think ARM64 server chips and platforms will show up before 3.18
>>>>>> is released.
>>>>>
>>>>> That's what I've heard/seen. The questions I have are (a) whether
>>>>> current ACPI patchset is enough to successfully run Linux on such
>>>>> _hardware_ platform (maybe not fully optimised, for example just WFI
>>>>> cpuidle) and (b) whether we still want to mandate a DT in the kernel for
>>>>> such platforms.
>>>>
>>>> For (a), this patch set is only for ARM64 core, not including platform
>>>> specific device drivers, it will be covered by the binding of _DSD or
>>>> explicit definition of PNP ID/ACPI ID(s).
>>>
>>> So we go back to the discussions we had few months ago in Macau. I'm not
>>> concerned about the core ARM and architected peripherals covered by ACPI
>>> 5.1 (as long as the current patches get positive technical review). But
>>> I'm concerned about the additional bits needed for a real SoC like _DSD
>>> definitions, how they get reviewed/accepted (or is it just the vendor's
>>> problem?).
>>
>> As the _DSD patch set sent out by Intel folks, _DSD definitions are just
>> DT definitions. To use _DSD or not, I think it depends on OEM use cases,
>> we can bring up Juno without _DSD (Graeme is working on that, still need
>> some time to clean up the code).
>>
>>>
>>> I think SBSA is too vague to guarantee a kernel image running on a
>>> compliant platform without additional (vendor-specific) tweaks. So what
>>> I asked for is (1) a document (guide) to define the strict set of ACPI
>>> features and bindings needed for a real SoC and (2) proof that the
>>> guidelines are enough for real hardware. I think we have (1) under
>>> review with some good feedback so far. As for (2), we can probably only
>>> discuss Juno openly. I think you could share the additional Juno patches
>>> on this list so that reviewers can assess the suitability. If we deem
>>> ACPI not (yet) suitable for Juno, is there other platform we could see
>>> patches for?
>>
>> Ok, we will send out all the patches for Juno in next version for review,
>> as mentioned above, we still need more time to clean up the code.
>>
>>>
>>>>> Given the answer to (a) and what other features are needed, we may or
>>>>> may not mandate (b). We were pretty clear few months ago that (b) is
>>>>> still required but at the time we were only openly talking about ACPI
>>>>> 5.0 which was lacking many features. I think we need to revisit that
>>>>> position based on how usable ACPI 5.1 for ARM (and current kernel
>>>>> implementation) is. Would you mind elaborating what an ACPI-only
>>>>> platform miss?
>>>>
>>>> Do you mean something still missing? We still miss some features for
>>>> ARM in ACPI, but I think they are not critical, here is the list I can
>>>> remember:
>>>> - ITS for GICv3/4;
>>>> - SMMU support;
>>>> - CPU idle control.
>>>
>>> I agree, these are not critical at this stage. But they only refer to
>>> architected peripherals. Is there anything else missing for an SoC? Do
>>> we need to define clocks?
>>
>> No, I prefer not. As we discussed in this thread before, we don't need
>> clock definition if we use SBSA compatible UART on Juno.
>>
>>>
>>>> For ACPI 5.1, it fixes many problems for ARM:
>>>> - weak definition for GIC, so we introduce visualization, v2m and
>>>>   part of GICv3/4 (redistributors) support.
>>>> - No support for PSCI. Fix it to support PSCI 0.2+;
>>>> - Not support for Always-on timer and SBSA-L1 watchdog.
>>>
>>> These are all good, that's why we shouldn't even talk about ACPI 5.0 in
>>> the ARM context.
>>>
>>>> - How to describe device properties, so _DSD is introduced for
>>>>   device probe.
>>>
>>> For the last bullet, is there any review process (at least like what we
>>> have for DT bindings)? On top of such 

[PATCH] cgroup: add tracepoints to track cgroup events

2014-08-20 Thread Andrea Righi
This patch adds the following tracepoints:
 o trace_cgroup_create   when a new cgroup is created
 o trace_cgroup_destroy  when a cgroup is removed
 o trace_cgroup_task_migrate when a task/thread is moved from a cgroup to another

The purpose of these tracepoints is to identify and help cgroup "managers" to
diagnose problems and detect when they are doing an excessive amount of work.

Signed-off-by: Matt Heaton 
Signed-off-by: Andrea Righi 
---
 include/trace/events/cgroup.h | 95 +++
 kernel/cgroup.c   | 14 ++-
 2 files changed, 108 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/cgroup.h

diff --git a/include/trace/events/cgroup.h b/include/trace/events/cgroup.h
new file mode 100644
index 0000000..937b41e
--- /dev/null
+++ b/include/trace/events/cgroup.h
@@ -0,0 +1,95 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM cgroup
+
+#if !defined(_TRACE_CGROUP_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_CGROUP_H
+
+#include 
+#include 
+
+#define TRACE_CGROUP_PATH_MAX  256
+
+#ifdef CREATE_TRACE_POINTS
+static inline void cgroup_safe_path(struct cgroup *cgrp, char *buf,
+   size_t buflen)
+{
+   char *path = cgroup_path(cgrp, buf, buflen);
+   size_t len;
+
+   if (likely(path)) {
+   /* NOTE: path is always NULL terminated */
+   len = strlen(path);
+   memmove(buf, path, len);
+   buf[len] = '\0';
+   } else {
+   strncpy(buf, "(NULL)", buflen);
+   }
+}
+#endif
+
+TRACE_EVENT(cgroup_create,
+
+   TP_PROTO(struct cgroup *cgrp),
+
+   TP_ARGS(cgrp),
+
+   TP_STRUCT__entry(
+   __array(char, name, TRACE_CGROUP_PATH_MAX)
+   ),
+
+   TP_fast_assign(
+   cgroup_safe_path(cgrp, __entry->name, TRACE_CGROUP_PATH_MAX);
+   ),
+
+   TP_printk("%s", __entry->name)
+);
+
+TRACE_EVENT(cgroup_destroy,
+
+   TP_PROTO(struct cgroup *cgrp),
+
+   TP_ARGS(cgrp),
+
+   TP_STRUCT__entry(
+   __array(char, name, TRACE_CGROUP_PATH_MAX)
+   ),
+
+   TP_fast_assign(
+   cgroup_safe_path(cgrp, __entry->name, TRACE_CGROUP_PATH_MAX);
+   ),
+
+   TP_printk("%s", __entry->name)
+);
+
+TRACE_EVENT(cgroup_task_migrate,
+
+   TP_PROTO(struct cgroup *old_cgrp, struct cgroup *new_cgrp,
+const struct task_struct *p),
+
+   TP_ARGS(old_cgrp, new_cgrp, p),
+
+   TP_STRUCT__entry(
+   __field(pid_t, pid)
+   __array(char, old_name, TRACE_CGROUP_PATH_MAX)
+   __array(char, new_name, TRACE_CGROUP_PATH_MAX)
+   __array(char, comm, TASK_COMM_LEN)
+   ),
+
+   TP_fast_assign(
+   __entry->pid = p->pid;
+   memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
+   cgroup_safe_path(old_cgrp, __entry->old_name,
+TRACE_CGROUP_PATH_MAX);
+   cgroup_safe_path(new_cgrp, __entry->new_name,
+TRACE_CGROUP_PATH_MAX);
+   ),
+
+   TP_printk("pid=%d comm=%s from=%s to=%s",
+ __entry->pid, __entry->comm,
+ __entry->old_name, __entry->new_name)
+);
+
+#endif /* _TRACE_CGROUP_H */
+
+/* This part must be outside protection */
+#include 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7dc8788..00a50b9 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -60,6 +60,9 @@
 
 #include 
 
+#define CREATE_TRACE_POINTS
+#include 
+
 /*
  * pidlists linger the following amount before being destroyed.  The goal
  * is avoiding frequent destruction in the middle of consecutive read calls
@@ -2014,6 +2017,7 @@ struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset)
  * Must be called with cgroup_mutex, threadgroup and css_set_rwsem locked.
  */
 static void cgroup_task_migrate(struct cgroup *old_cgrp,
+   struct cgroup *new_cgrp,
struct task_struct *tsk,
struct css_set *new_cset)
 {
@@ -2022,6 +2026,8 @@ static void cgroup_task_migrate(struct cgroup *old_cgrp,
lockdep_assert_held(&cgroup_mutex);
lockdep_assert_held(&css_set_rwsem);
 
+   trace_cgroup_task_migrate(old_cgrp, new_cgrp, tsk);
+
/*
 * We are synchronized through threadgroup_lock() against PF_EXITING
 * setting such that we can't race against cgroup_exit() changing the
@@ -2274,7 +2280,7 @@ static int cgroup_migrate(struct cgroup *cgrp, struct task_struct *leader,
down_write(&css_set_rwsem);
list_for_each_entry(cset, &tset.src_csets, mg_node) {
list_for_each_entry_safe(task, tmp_task, &cset->mg_tasks, cg_list)
-   cgroup_task_migrate(cset->mg_src_cgrp, task,
+   cgroup_task_migrate(cset->mg_src_cgrp, cgrp, task,
cset->mg_dst_cset);
}

Re: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the idle cpu

2014-08-20 Thread Daniel Lezcano

On 08/21/2014 04:12 AM, Liu, Chuansheng wrote:

Hello Daniel,


-Original Message-
From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org]
Sent: Thursday, August 21, 2014 9:54 AM
To: Liu, Chuansheng; l...@amacapital.net; pet...@infradead.org;
r...@rjwysocki.net; mi...@redhat.com
Cc: linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Liu, Changcheng;
Wang, Xiaoming; Chakravarty, Souvik K
Subject: Re: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the
idle cpu

On 08/18/2014 10:37 AM, Chuansheng Liu wrote:

Implementing one new API wake_up_if_idle(), which is used to
wake up the idle CPU.


Is this patchset tested ? Did you check it solves the issue you were
facing ?

We have done the basic test, and found the cores can exit C0 quickly with this patchset.
Basically once the _TIF_NEED_RESCHED is set, then the poll_idle() can be broken.

Please correct me if something is wrong, thanks.


Actually, it was not clear to me if this patch was a proposal or not.

I will review it.

Thanks

  -- Daniel


Suggested-by: Andy Lutomirski 
Signed-off-by: Chuansheng Liu 
---
   include/linux/sched.h |1 +
   kernel/sched/core.c   |   16 
   2 files changed, 17 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 857ba40..3f89ac1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1024,6 +1024,7 @@ struct sched_domain_topology_level {
   extern struct sched_domain_topology_level *sched_domain_topology;

   extern void set_sched_topology(struct sched_domain_topology_level *tl);
+extern void wake_up_if_idle(int cpu);

   #ifdef CONFIG_SCHED_DEBUG
   # define SD_INIT_NAME(type)  .name = #type
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1211575..adf104f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1620,6 +1620,22 @@ static void ttwu_queue_remote(struct task_struct *p, int cpu)

}
   }

+void wake_up_if_idle(int cpu)
+{
+   struct rq *rq = cpu_rq(cpu);
+   unsigned long flags;
+
+   if (set_nr_if_polling(rq->idle)) {
+   trace_sched_wake_idle_without_ipi(cpu);
+   } else {
+   raw_spin_lock_irqsave(&rq->lock, flags);
+   if (rq->curr == rq->idle)
+   smp_send_reschedule(cpu);
+   /* Else cpu is not in idle, do nothing here */
+   raw_spin_unlock_irqrestore(&rq->lock, flags);
+   }
+}
+
   bool cpus_share_cache(int this_cpu, int that_cpu)
   {
return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);




--
  Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog



[PATCH v3] spi: spi-imx: add DMA support

2014-08-20 Thread Robin Gong
After enabling DMA

spi-nor read speed is
dd if=/dev/mtd0 of=/dev/null bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.720402 s, 1.5 MB/s

spi-nor write speed is
dd if=/dev/zero of=/dev/mtd0 bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 3.56044 s, 295 kB/s

Before enabling DMA

spi-nor read speed is
dd if=/dev/mtd0 of=/dev/null bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 2.37717 s, 441 kB/s

spi-nor write speed is
dd if=/dev/zero of=/dev/mtd0 bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 4.83181 s, 217 kB/s
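
The rates dd prints are just bytes divided by seconds, reported in decimal units, so the gain can be checked directly from the timings above. A small sketch of that arithmetic (rate_bps() and speedup() are helper names made up for this note, not part of the patch):

```c
/* Bytes per second for a transfer of `bytes` that took `seconds`. */
static double rate_bps(double bytes, double seconds)
{
	return bytes / seconds;
}

/* How many times faster the DMA run finished than the non-DMA run. */
static double speedup(double non_dma_seconds, double dma_seconds)
{
	return non_dma_seconds / dma_seconds;
}
```

With the numbers above, the read time drops from 2.37717 s to 0.720402 s, roughly 3.3 times faster, and the write time drops from 4.83181 s to 3.56044 s, roughly 1.36 times faster.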

Signed-off-by: Frank Li 
Signed-off-by: Robin Gong 

---
Change from v2:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/291722/focus=294363
1. dma setup only for imx51-ecspi
2. use one small dummy buffer (1 BD size) to temporarily store data
   for meaningless rx/tx, instead of allocating the actual transfer size.
3. split spi_mx_sdma_transfer into smaller, easier-to-read functions.
4. fix some code indentation.
---
 drivers/spi/spi-imx.c |  398 -
 1 file changed, 392 insertions(+), 6 deletions(-)

diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c
index a5474ef..0c81a66 100644
--- a/drivers/spi/spi-imx.c
+++ b/drivers/spi/spi-imx.c
@@ -39,6 +39,9 @@
 #include 
 
 #include 
+#include 
+#include 
+#include 
 
 #define DRIVER_NAME "spi_imx"
 
@@ -52,6 +55,10 @@
 #define MXC_INT_RR (1 << 0) /* Receive data ready interrupt */
 #define MXC_INT_TE (1 << 1) /* Transmit FIFO empty interrupt */
 
+/* The maximum  bytes that a sdma BD can transfer.*/
+#define MAX_SDMA_BD_BYTES  (1 << 15)
+#define IMX_DMA_TIMEOUT (msecs_to_jiffies(3000))
+
 struct spi_imx_config {
unsigned int speed_hz;
unsigned int bpw;
@@ -84,6 +91,7 @@ struct spi_imx_data {
 
struct completion xfer_done;
void __iomem *base;
+   phys_addr_t pbase;
int irq;
struct clk *clk_per;
struct clk *clk_ipg;
@@ -92,6 +100,27 @@ struct spi_imx_data {
unsigned int count;
void (*tx)(struct spi_imx_data *);
void (*rx)(struct spi_imx_data *);
+   int (*txrx_bufs)(struct spi_device *spi, struct spi_transfer *t);
+   struct dma_chan *dma_chan_rx;
+   struct dma_chan *dma_chan_tx;
+   unsigned int dma_is_inited;
+   struct device *dev;
+
+   struct completion dma_rx_completion;
+   struct completion dma_tx_completion;
+
+   void *dummy_buf;
+   dma_addr_t dummy_dma;
+   dma_addr_t dma_rx_phy_addr;
+   dma_addr_t dma_tx_phy_addr;
+
+   unsigned int usedma;
+   unsigned int dma_finished;
+   /* SDMA wartermark */
+   u32 rx_wml;
+   u32 tx_wml;
+   u32 rxt_wml;
+
void *rx_buf;
const void *tx_buf;
unsigned int txfifo; /* number of words pushed in tx FIFO */
@@ -185,6 +214,7 @@ static unsigned int spi_imx_clkdiv_2(unsigned int fin,
 #define MX51_ECSPI_CTRL0x08
 #define MX51_ECSPI_CTRL_ENABLE (1 <<  0)
 #define MX51_ECSPI_CTRL_XCH(1 <<  2)
+#define MX51_ECSPI_CTRL_SMC(1 << 3)
 #define MX51_ECSPI_CTRL_MODE_MASK  (0xf << 4)
 #define MX51_ECSPI_CTRL_POSTDIV_OFFSET 8
 #define MX51_ECSPI_CTRL_PREDIV_OFFSET  12
@@ -202,6 +232,18 @@ static unsigned int spi_imx_clkdiv_2(unsigned int fin,
 #define MX51_ECSPI_INT_TEEN(1 <<  0)
 #define MX51_ECSPI_INT_RREN(1 <<  3)
 
+#define MX51_ECSPI_DMA  0x14
+#define MX51_ECSPI_DMA_TX_WML_OFFSET   0
+#define MX51_ECSPI_DMA_TX_WML_MASK 0x3F
+#define MX51_ECSPI_DMA_RX_WML_OFFSET   16
+#define MX51_ECSPI_DMA_RX_WML_MASK (0x3F << 16)
+#define MX51_ECSPI_DMA_RXT_WML_OFFSET  24
+#define MX51_ECSPI_DMA_RXT_WML_MASK(0x3F << 24)
+
+#define MX51_ECSPI_DMA_TEDEN_OFFSET7
+#define MX51_ECSPI_DMA_RXDEN_OFFSET23
+#define MX51_ECSPI_DMA_RXTDEN_OFFSET   31
+
 #define MX51_ECSPI_STAT0x18
 #define MX51_ECSPI_STAT_RR (1 <<  3)
 
@@ -258,17 +300,22 @@ static void __maybe_unused mx51_ecspi_intctrl(struct spi_imx_data *spi_imx, int
 
 static void __maybe_unused mx51_ecspi_trigger(struct spi_imx_data *spi_imx)
 {
-   u32 reg;
-
-   reg = readl(spi_imx->base + MX51_ECSPI_CTRL);
-   reg |= MX51_ECSPI_CTRL_XCH;
+   u32 reg = readl(spi_imx->base + MX51_ECSPI_CTRL);
+
+   if (!spi_imx->usedma)
+   reg |= MX51_ECSPI_CTRL_XCH;
+   else if (!spi_imx->dma_finished)
+   reg |= MX51_ECSPI_CTRL_SMC;
+   else
+   reg &= ~MX51_ECSPI_CTRL_SMC;
writel(reg, spi_imx->base + MX51_ECSPI_CTRL);
 }
 
 static int __maybe_unused mx51_ecspi_config(struct spi_imx_data *spi_imx,
struct spi_imx_config *config)
 {
-   u32 ctrl = MX51_ECSPI_CTRL_ENABLE, cfg = 0;
+   u32 ctrl = MX51_ECSPI_CTRL_ENABLE, cfg = 0, dma = 0;
+   u32 tx_wml_cfg, rx_wml_cfg, rxt_wml_cfg;
u32 clk = config->speed_hz, delay;
 
/*
@@ -320,6 +367,30 

[PATCH] ACPI / scan: Allow ACPI drivers to bind to PNP device objects

2014-08-20 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

We generally don't allow ACPI drivers to bind to ACPI device objects
that companion "physical" device objects are created for to avoid
situations in which two different drivers may attempt to handle one
device at the same time.  Recent ACPI device enumeration rework
extended that approach to ACPI PNP devices by starting to use a scan
handler for enumerating them.  However, we previously allowed ACPI
drivers to bind to ACPI device objects with existing PNP device
companions and changing that led to functional regressions on some
systems.

For this reason, add a special check for PNP devices in
acpi_device_probe() so that ACPI drivers can bind to ACPI device
objects having existing PNP device companions as before.

Fixes: eec15edbb0e1 (ACPI / PNP: use device ID list for PNPACPI device enumeration)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=81511
Link: https://bugzilla.kernel.org/show_bug.cgi?id=81971
Reported-and-tested-by: Gabriele Mazzotta 
Reported-and-tested-by: Dirk Griesbach 
Signed-off-by: Rafael J. Wysocki 
---
 drivers/acpi/acpi_pnp.c |5 +
 drivers/acpi/internal.h |1 +
 drivers/acpi/scan.c |2 +-
 3 files changed, 7 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/acpi/acpi_pnp.c
===
--- linux-pm.orig/drivers/acpi/acpi_pnp.c
+++ linux-pm/drivers/acpi/acpi_pnp.c
@@ -396,3 +396,8 @@ void __init acpi_pnp_init(void)
 {
acpi_scan_add_handler(&acpi_pnp_handler);
 }
+
+bool is_acpi_pnp_device(struct acpi_device *adev)
+{
+   return adev->handler == &acpi_pnp_handler;
+}
Index: linux-pm/drivers/acpi/internal.h
===
--- linux-pm.orig/drivers/acpi/internal.h
+++ linux-pm/drivers/acpi/internal.h
@@ -86,6 +86,7 @@ void acpi_device_add_finalize(struct acp
 void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
 bool acpi_device_is_present(struct acpi_device *adev);
 bool acpi_device_is_battery(struct acpi_device *adev);
+bool is_acpi_pnp_device(struct acpi_device *adev);
 
 /* --
   Power Resource
Index: linux-pm/drivers/acpi/scan.c
===
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -975,7 +975,7 @@ static int acpi_device_probe(struct devi
struct acpi_driver *acpi_drv = to_acpi_driver(dev->driver);
int ret;
 
-   if (acpi_dev->handler)
+   if (acpi_dev->handler && !is_acpi_pnp_device(acpi_dev))
return -EINVAL;
 
if (!acpi_drv->ops.add)


Re: [PATCH v1 5/9] block: loop: convert to blk-mq

2014-08-20 Thread Ming Lei
On Thu, Aug 21, 2014 at 10:58 AM, Jens Axboe  wrote:
> On 2014-08-20 21:54, Ming Lei wrote:

>>>> From my investigation, context switch increases almost 50% with
>>>> workqueue compared with kthread in loop in a quad-core VM. With
>>>> kthread, requests may be handled as batch in cases which won't be
>>>> blocked in read()/write()(like null_blk, tmpfs, ...), but it is
>>>> impossible with workqueue any more.  Also block plug should have been used
>>>> with kthread to optimize the case, especially when kernel AIO is
>>>> applied, still impossible with work queue too.
>>>
>>>
>>>
>>> OK, that one is actually a good point, since one need not do per-item
>>> queueing. We could handle different units, though. And we should have
>>> proper
>>> marking of the last item in a chain of stuff, so we might even be able to
>>> offload based on that instead of doing single items. It wont help the
>>> sync
>>> case, but for that, workqueue and kthread would be identical.
>>
>>
>> We may do that by introducing callback of queue_rq_list in blk_mq_ops,
>> and I will figure out one patch today to see if it can help the case.
>
>
> I don't think we should add to the interface, I prefer keeping it clean like
> it is right now. At least not if we can get around it. My point is that the
> driver already knows when the chain is complete, when REQ_LAST is set. So
> before that event triggers, it need not kick off IO, or at least i could do
> it in batches before that. That may not be fully reliable in case of
> queueing errors, but if REQ_LAST or 'error return' is used as the way to
> kick off pending IO, then that should be good enough. Haven't audited this
> in a while, but at least that is the intent of REQ_LAST.

Another point is that running queue_work(rq) N times may cost more
than running queue_work(N rqs) once, since the context may still
switch back and forth when executing queue_work().

Anyway, I need to run a test first to see if handling them as a batch
can bring back the throughput on sequential read.
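
The batching idea discussed above can be sketched in user space as follows.
This is only an illustration under stated assumptions: struct batch, kick()
and queue_rq() are hypothetical names, and the `last` argument merely plays
the role that REQ_LAST has in the discussion. It is not the loop driver's or
blk-mq's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 64

/* Hypothetical per-device state; not the loop driver's real structure. */
struct batch {
	int pending[MAX_PENDING];  /* request ids waiting to be handed off */
	size_t nr_pending;         /* number of valid entries in pending[] */
	unsigned int nr_kicks;     /* how many times work was handed off */
	size_t nr_submitted;       /* requests handed off so far */
};

/* Hand the whole pending list to the worker in one go. */
static void kick(struct batch *b)
{
	if (b->nr_pending == 0)
		return;
	b->nr_submitted += b->nr_pending;
	b->nr_pending = 0;
	b->nr_kicks++;
}

/*
 * Queue one request. `last` stands in for REQ_LAST: the caller sets it on
 * the final request of a chain. A full list is flushed early so that the
 * new request always fits.
 */
static void queue_rq(struct batch *b, int rq, bool last)
{
	if (b->nr_pending == MAX_PENDING)
		kick(b);
	b->pending[b->nr_pending++] = rq;
	if (last)
		kick(b);
}
```

With four requests queued and only the fourth marked as last, the worker is
kicked once for all four instead of once per request.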


Thanks,


Re: [PATCH 3/3] ARM: at91/tclib: mask interruptions at shutdown and probe

2014-08-20 Thread Arnd Bergmann
On Wednesday 20 August 2014, Gaël PORTAY wrote:
> +static void tc_shutdown (struct platform_device *pdev)
> +{
> +   int i;
> +   struct atmel_tc *tc = platform_get_drvdata(pdev);
> +
> +   for (i = 0; i < 3; i++)
> +   __raw_writel(0xff, tc->regs + ATMEL_TC_REG(i, IDR));
> +}

In general, __raw_readl/__raw_writel are not meant to be called by device 
drivers.

Just use readl/writel by default, or readl_relaxed/writel_relaxed if the code is
performance critical and you are sure it is safe to use them.

Arnd


Re: [PATCH v2 09/18] ACPI / processor: Make it possible to get CPU hardware ID via GICC

2014-08-20 Thread Hanjun Guo
On 2014-8-20 22:56, Catalin Marinas wrote:
> On Tue, Aug 19, 2014 at 09:37:34AM +0100, Hanjun Guo wrote:
>> On 2014-8-18 22:27, Catalin Marinas wrote:
>>> On Mon, Aug 04, 2014 at 04:28:16PM +0100, Hanjun Guo wrote:
>>>> diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
>>>> index e32321c..4007313 100644
>>>> --- a/drivers/acpi/processor_core.c
>>>> +++ b/drivers/acpi/processor_core.c
>>>> @@ -64,6 +64,38 @@ static int map_lsapic_id(struct acpi_subtable_header *entry,
>>>>     return 0;
>>>>  }
>>>>  
>>>> +/*
>>>> + * On ARM platform, MPIDR value is the hardware ID as apic ID
>>>> + * on Intel platforms
>>>> + */
>>>> +static int map_gicc_mpidr(struct acpi_subtable_header *entry,
>>>> +  int device_declaration, u32 acpi_id, int *mpidr)
>>>> +{
>>>> +  struct acpi_madt_generic_interrupt *gicc =
>>>> +  container_of(entry, struct acpi_madt_generic_interrupt, header);
>>>> +
>>>> +  if (!(gicc->flags & ACPI_MADT_ENABLED))
>>>> +  return -ENODEV;
>>>> +
>>>> +  /* In the GIC interrupt model, logical processors are
>>>> +   * required to have a Processor Device object in the DSDT,
>>>> +   * so we should check device_declaration here
>>>> +   */
>>>> +  if (device_declaration && (gicc->uid == acpi_id)) {
>>>> +  /*
>>>> +   * Only bits [0:7] Aff0, bits [8:15] Aff1, bits [16:23] Aff2
>>>> +   * and bits [32:39] Aff3 are meaningful, so pack the Affx
>>>> +   * fields into a single 32 bit identifier to accommodate the
>>>> +   * acpi processor drivers.
>>>> +   */
>>>> +  *mpidr = ((gicc->arm_mpidr & 0xff) >> 8)
>>>> +   | gicc->arm_mpidr;
>>>
>>> You can use pack_mpidr_into_32_bits().
>>
>> processor_core.c will be used by x86 and ia64 too, it will cause
>> compile error on !ARM64 platforms.
> 
> Oh. So we do we have an ARM-specific function in core ACPI code?

Yes, GICC is ARM-specific, but all the mapping functions (get apic_id/mpidr
via acpi_id in MADT) including x86/ia64 are all there, so it's better to put it
here to keep consistency.
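
For reference, the packing described in the quoted comment (Aff0..Aff2 in
bits [0:23], Aff3 moved from bits [32:39] down to bits [24:31]) can be
sketched as below. This is only an illustration: pack_mpidr_32() is a
hypothetical helper, not the kernel's pack_mpidr_into_32_bits(), and the
masks are assumptions derived from the bit ranges named in the comment.
Note that the expression as it appears in the quote above masks with 0xff
before shifting right by 8, which would always contribute zero.

```c
#include <stdint.h>

/*
 * Illustrative only: fold Aff3 (bits [39:32]) down into bits [31:24] and
 * keep Aff2, Aff1 and Aff0 (bits [23:0]). All other bits are dropped.
 */
static uint32_t pack_mpidr_32(uint64_t mpidr)
{
	uint64_t aff3 = (mpidr & 0xff00000000ULL) >> 8;  /* Aff3 -> bits [31:24] */
	uint64_t aff012 = mpidr & 0x00ffffffULL;         /* Aff2, Aff1, Aff0 */

	return (uint32_t)(aff3 | aff012);
}
```

Masking the low part to 24 bits keeps the non-affinity bits [24:31] of the
64-bit value from colliding with the relocated Aff3 field.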

Thanks
Hanjun


Re: [PATCH 8/8] staging: et131x: Implement NAPI support

2014-08-20 Thread Stephen Hemminger
On Wed, 20 Aug 2014 23:17:58 +0100
Mark Einon  wrote:

>  
> + if (budget > MAX_PACKETS_HANDLED)
> + limit = MAX_PACKETS_HANDLED;

Why this artificial restriction?


Re: [PATCH] kbuild: Make scripts executable

2014-08-20 Thread Masahiro Yamada
Hi Michal,


On Wed, 20 Aug 2014 16:10:48 +0200
Michal Marek  wrote:

> The Makefiles call the respective interpreter explicitly, but this makes
> it easier to use the scripts manually.
> 
> Signed-off-by: Michal Marek 


I am not sure at all, but
it seems scripts/checkpatch.pl has a rule
to ban execute permissions.


# Check for incorrect file permissions
if ($line =~ /^new (file )?mode.*[7531]\d{0,2}$/) {
	my $permhere = $here . "FILE: $realfile\n";
	if ($realfile !~ m@scripts/@ &&
	    $realfile !~ /\.(py|pl|awk|sh)$/) {
		ERROR("EXECUTE_PERMISSIONS",
		      "do not set execute permissions for source files\n" . $permhere);
	}
}



Best Regards
Masahiro Yamada


[PULL REQUEST] i2c for 3.17

2014-08-20 Thread Wolfram Sang
Linus,

here is the fixup for the 'lowlight' of my last pull request. I2C is not
selected anymore by I2C_ACPI. Instead, the code in question now depends
on I2C=y. Also, Mika has agreed to support me and be the maintainer for
I2C-ACPI related patches. Finally, a new-ID-patch came along last week.

Please pull,

   Wolfram


The following changes since commit 7d1311b93e58ed55f3a31cc8f94c4b8fe988a2b9:

  Linux 3.17-rc1 (2014-08-16 10:40:26 -0600)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git i2c/for-next

for you to fetch changes up to 4560d67722816cca4b2f3dfb1d7c5b902fd2075b:

  MAINTAINERS: add maintainer for ACPI parts of I2C (2014-08-19 10:34:08 -0500)


Alan Cox (1):
  i2c: i801: Add PCI ID for Intel Braswell

Lan Tianyu (1):
  i2c: rework kernel config I2C_ACPI

Wolfram Sang (1):
  MAINTAINERS: add maintainer for ACPI parts of I2C

 MAINTAINERS   |  7 +++
 drivers/i2c/Kconfig   | 15 ++-
 drivers/i2c/Makefile  |  2 +-
 drivers/i2c/busses/i2c-i801.c |  2 ++
 drivers/i2c/i2c-acpi.c|  2 ++
 include/linux/i2c.h   | 12 
 6 files changed, 26 insertions(+), 14 deletions(-)




[PATCH] perf: Fallback to MAP__FUNCTION if daddr maps are NULL

2014-08-20 Thread Don Zickus
As we run "perf c2c" on more applications, we noticed we were missing a
significant number of samples from a common customer application.  Looking
at the /proc/<pid>/maps file for the app, we see "rwxs" and "rwxp"
permissions on many of the shared memory & heap regions, and on all the
thread stacks.

Because those regions have the "x" bit set, perf marks them with the
MAP__FUNCTION type.  Hence ip__resolve_data() never finds load or store
events coming from them.

We fixed this by re-calling thread__find_addr_location with MAP__FUNCTION in
the case where map is NULL, as a last-ditch effort to map the sample before
giving up and dropping it.

Reported-by: Joe Mario 
Tested-by: Joe Mario 
Signed-off-by: Don Zickus 
---
 tools/perf/util/machine.c |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0e5fea9..e62bd87 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1241,6 +1241,16 @@ static void ip__resolve_data(struct machine *machine, 
struct thread *thread,
 
thread__find_addr_location(thread, machine, m, MAP__VARIABLE, addr,
   &al);
+   if (al.map == NULL) {
+   /*
+* some shared data regions have execute bit set which puts
+* their mapping in the MAP__FUNCTION type array.
+* Check there as a fallback option before dropping the sample.
+*/
+   thread__find_addr_location(thread, machine, m, MAP__FUNCTION, addr,
+  &al);
+   }
+
ams->addr = addr;
ams->al_addr = al.addr;
ams->sym = al.sym;
-- 
1.7.1



Re: [PATCH 8/8] staging: et131x: Implement NAPI support

2014-08-20 Thread Stephen Hemminger
On Wed, 20 Aug 2014 23:17:58 +0100
Mark Einon  wrote:

> - bool done = true;
> + int count = 0;
> + int limit = budget;
> + bool not_done = false;

Don't use negative variables. Better to keep the original done variable.


Re: [PATCH V2 0/1] ipv4: net namespace does not inherit network configurations

2014-08-20 Thread Stephen Hemminger
On Thu, 21 Aug 2014 10:32:00 +0800
Zhu Yanjun  wrote:

> V2: Following the advice from Cong Wang, I submit a patch as normal.
> 
> Hi,all
> 
> I did a test on kernel3.16 rc6:
> 
> root@qemu1:~# echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
> root@qemu1:~# echo 1 > /proc/sys/net/ipv4/conf/all/forwarding
> root@qemu1:~# ip netns list
> root@qemu1:~# ip netns add fib1
> root@qemu1:~# ip netns exec fib1 bash
> root@qemu1:~# cat /proc/sys/net/ipv6/conf/all/forwarding
> 0
> root@qemu1:~# cat /proc/sys/net/ipv4/conf/all/forwarding
> 1
> 
> The behavior of ipv4 and ipv6 is very inconsistent. I checked
> the kernel source code. I found that from this patch
> [ipv6: fix bad free of addrconf_init_net], the above difference
> appeared.
> 
> Since a net namespace is independent to another. That is, there
> is no any relationship between the net namespaces. So the behavior
> of ipv4 is not correct.
> 
> Based on this patch [ipv6: fix bad free of addrconf_init_net], I made
> a new patch to fix this problem on ipv4.
> 
> Any reply is appreciated. 
> 
> Zhu Yanjun (1):
>   ipv4: net namespace does not inherit network configurations
> 
>  net/ipv4/devinet.c | 29 -
>  1 file changed, 12 insertions(+), 17 deletions(-)
> 

This is a semantic change to network namespaces and is therefore
likely to break existing applications using network namespaces.


Re: [PATCH v1 5/9] block: loop: convert to blk-mq

2014-08-20 Thread Ming Lei
On Thu, Aug 21, 2014 at 11:13 AM, Ming Lei  wrote:
> On Thu, Aug 21, 2014 at 10:58 AM, Jens Axboe  wrote:
>> On 2014-08-20 21:54, Ming Lei wrote:
>
>>>>> From my investigation, context switch increases almost 50% with
>>>>> workqueue compared with kthread in loop in a quad-core VM. With
>>>>> kthread, requests may be handled as batch in cases which won't be
>>>>> blocked in read()/write()(like null_blk, tmpfs, ...), but it is
>>>>> impossible with workqueue any more.  Also block plug should have been
>>>>> used with kthread to optimize the case, especially when kernel AIO is
>>>>> applied, still impossible with work queue too.



>>>> OK, that one is actually a good point, since one need not do per-item
>>>> queueing. We could handle different units, though. And we should have
>>>> proper marking of the last item in a chain of stuff, so we might even
>>>> be able to offload based on that instead of doing single items. It wont
>>>> help the sync case, but for that, workqueue and kthread would be
>>>> identical.
>>>
>>>
>>> We may do that by introducing callback of queue_rq_list in blk_mq_ops,
>>> and I will figure out one patch today to see if it can help the case.
>>
>>
>> I don't think we should add to the interface, I prefer keeping it clean like
>> it is right now. At least not if we can get around it. My point is that the
>> driver already knows when the chain is complete, when REQ_LAST is set. So
>> before that event triggers, it need not kick off IO, or at least i could do
>> it in batches before that. That may not be fully reliable in case of
>> queueing errors, but if REQ_LAST or 'error return' is used as the way to
>> kick off pending IO, then that should be good enough. Haven't audited this
>> in a while, but at least that is the intent of REQ_LAST.
>
> Yes, I thought of that too, but the driver needs another context for
> handling that, either workqueue or kthread, which may make the introduced
> per-device workqueue useless.

Hmmm, a list should be enough, will do that.

Thanks,


Re: [PATCH v1 5/9] block: loop: convert to blk-mq

2014-08-20 Thread Ming Lei
On Thu, Aug 21, 2014 at 10:58 AM, Jens Axboe  wrote:
> On 2014-08-20 21:54, Ming Lei wrote:

>>>> From my investigation, context switch increases almost 50% with
>>>> workqueue compared with kthread in loop in a quad-core VM. With
>>>> kthread, requests may be handled as batch in cases which won't be
>>>> blocked in read()/write()(like null_blk, tmpfs, ...), but it is
>>>> impossible with workqueue any more.  Also block plug should have been
>>>> used with kthread to optimize the case, especially when kernel AIO is
>>>> applied, still impossible with work queue too.
>>>
>>>
>>>
>>> OK, that one is actually a good point, since one need not do per-item
>>> queueing. We could handle different units, though. And we should have
>>> proper
>>> marking of the last item in a chain of stuff, so we might even be able to
>>> offload based on that instead of doing single items. It wont help the
>>> sync
>>> case, but for that, workqueue and kthread would be identical.
>>
>>
>> We may do that by introducing callback of queue_rq_list in blk_mq_ops,
>> and I will figure out one patch today to see if it can help the case.
>
>
> I don't think we should add to the interface, I prefer keeping it clean like
> it is right now. At least not if we can get around it. My point is that the
> driver already knows when the chain is complete, when REQ_LAST is set. So
> before that event triggers, it need not kick off IO, or at least i could do
> it in batches before that. That may not be fully reliable in case of
> queueing errors, but if REQ_LAST or 'error return' is used as the way to
> kick off pending IO, then that should be good enough. Haven't audited this
> in a while, but at least that is the intent of REQ_LAST.

Yes, I thought of that too, but the driver needs another context for
handling that, either workqueue or kthread, which may make the introduced
per-device workqueue useless.

thanks,


Re: [PATCH v2 08/18] ARM64 / ACPI: Get the enable method for SMP initialization in ACPI way

2014-08-20 Thread Hanjun Guo
On 2014-8-20 22:52, Catalin Marinas wrote:
> On Tue, Aug 19, 2014 at 09:32:25AM +0100, Hanjun Guo wrote:
>> On 2014-8-18 22:27, Catalin Marinas wrote:
>>> On Mon, Aug 04, 2014 at 04:28:15PM +0100, Hanjun Guo wrote:
>>>> +#ifdef CONFIG_ACPI
>>>> +/*
>>>> + * Get a cpu's boot method in the ACPI way.
>>>> + */
>>>> +static char * __init acpi_get_cpu_boot_method(void)
>>>> +{
>>>> +  /*
>>>> +   * For ACPI 5.1, only two kind of methods are provided,
>>>> +   * Parking protocol and PSCI, but Parking protocol is
>>>> +   * specified for ARMv7 only, so make PSCI as the only method
>>>> +   * for SMP initialization before the ACPI spec or Parking
>>>> +   * protocol spec is updated.
>>>> +   */
>>>> +  switch (smp_boot_protocol()) {
>>>> +  case ACPI_SMP_BOOT_PSCI:
>>>> +  return "psci";
>>>> +  case ACPI_SMP_BOOT_PARKING_PROTOCOL:
>>>> +  default:
>>>> +  return NULL;
>>>> +  }
>>>> +}
>>>
>>> Actually, do we even need to define smp_boot_protocol()? Is it used
>>> anywhere else apart from this patch (I still haven't gone through all
>>> patches)?
>>
>> It is just used in this patch. I think we can make the ACPI boot protocol
>> scalable in this way, if we support another boot protocol in ACPI in the
>> future, we can easily update the function to support it, does it make sense?
> 
> Not really. You just add additional code, enums, functions when all you
> do is check for acpi_psci_present() (or whatever new protocol you would
> get). If the enum is never going to be used outside this file, don't
> bother with additional functions.
> 
> BTW, it would be nicer if the acpi related functions are contained in as
> fewer files as possible. So here you could keep
> acpi_get_cpu_boot_method() in the acpi.c file. It only returns a string.

ok, I will update them.

Thanks
Hanjun




Re: Revert "platform/x86/toshiba-apci.c possible bad if test?"

2014-08-20 Thread Anton Altaparmakov
Hi Matthew,

There is no doubt that the revert was needed but I think the original check 
which is now reinstated is also wrong.

I do not know what the intention actually is so cannot say what the correct 
check is but just logically speaking the check makes no sense:

if (sscanf(buf, "%i", &mode) != 1 && (mode != 2 || mode != 1))

The condition "(mode != 2 || mode != 1)" will always be true given that mode 
can only have a single value: if mode == 1 then mode != 2 is true and thus the 
condition is true and vice versa if mode == 2 then mode != 1 is true and thus 
the condition is true, too.  And if mode is neither 1 nor 2 then both mode != 2 
and mode != 1 are true and thus the condition is true, too.  Thus no matter 
what value "mode" has, the condition is true thus there is no point in it being 
there thus the if clause is exactly the same as:

if (sscanf(buf, "%i", &mode) != 1)

Presumably that was not the original intention...

Perhaps the intention was to check that mode is either 1 or 2 and other values 
are not allowed?  If so the correct statement would be:

if (sscanf(buf, "%i", &mode) != 1 || (mode != 2 && mode != 1))

But as I said above I do not know if that was the original intention but it 
looks like that this may have been the intention...
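
The reasoning above can be checked with a small user-space sketch. It
assumes the intent really is "accept only 1 or 2", which, as noted, is not
confirmed; parse_kbd_bl_mode() and always_true() are hypothetical names
used only for this illustration and do not exist in toshiba_acpi.c.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: succeeds only if buf holds an integer equal to 1 or 2. */
static bool parse_kbd_bl_mode(const char *buf, int *mode_out)
{
	int mode = -1;

	/* Reject input that does not parse, or that parses to anything but 1 or 2. */
	if (sscanf(buf, "%i", &mode) != 1 || (mode != 2 && mode != 1))
		return false;

	*mode_out = mode;
	return true;
}

/* The disjunction from the reinstated check: true for every value of mode. */
static bool always_true(int mode)
{
	return mode != 2 || mode != 1;
}
```

Since always_true() holds for every input, the reinstated condition reduces
to the sscanf() return value test alone, whereas the variant using || and &&
rejects both unparsable input and out-of-range values.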

Best regards,

Anton

On 21 Aug 2014, at 00:53, Linux Kernel Mailing List 
 wrote:

> Gitweb: 
> http://git.kernel.org/linus/;a=commit;h=8039aabb6c9f802bca04cc77ca210060a5b53916
> Commit: 8039aabb6c9f802bca04cc77ca210060a5b53916
> Parent: 186e4e89a0922d75fba476f15b723e541cc34bea
> Refname:refs/heads/master
> Author: Matthew Garrett 
> AuthorDate: Wed Aug 20 08:18:18 2014 -0700
> Committer:  Matthew Garrett 
> CommitDate: Wed Aug 20 08:18:18 2014 -0700
> 
>Revert "platform/x86/toshiba-apci.c possible bad if test?"
> 
>This reverts commit bdc3ae7221213963f438faeaa69c8b4a2195f491.
> 
>Signed-off-by: Matthew Garrett 
> ---
> drivers/platform/x86/toshiba_acpi.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/platform/x86/toshiba_acpi.c 
> b/drivers/platform/x86/toshiba_acpi.c
> index e4da61b..b062d3d 100644
> --- a/drivers/platform/x86/toshiba_acpi.c
> +++ b/drivers/platform/x86/toshiba_acpi.c
> @@ -1258,7 +1258,7 @@ static ssize_t toshiba_kbd_bl_mode_store(struct device 
> *dev,
>   int mode = -1;
>   int time = -1;
> 
> - if (sscanf(buf, "%i", &mode) != 1  || (mode != 2 || mode != 1))
> + if (sscanf(buf, "%i", &mode) != 1 && (mode != 2 || mode != 1))
>   return -EINVAL;
> 
>   /* Set the Keyboard Backlight Mode where:
-- 
Anton Altaparmakov  (replace at with @)
University of Cambridge Information Services, Roger Needham Building
7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK



Re: [PATCH v3 4/4] zram: report maximum used memory

2014-08-20 Thread David Horner
On Wed, Aug 20, 2014 at 10:41 PM, Minchan Kim  wrote:
> On Wed, Aug 20, 2014 at 10:20:07PM -0400, David Horner wrote:
>> On Wed, Aug 20, 2014 at 8:27 PM, Minchan Kim  wrote:
>> > Normally, zram user could get maximum memory usage zram consumed
>> > via polling mem_used_total with sysfs in userspace.
>> >
>> > But it has a critical problem because user can miss peak memory
>> > usage during update inverval of polling. For avoiding that,
>> > user should poll it with shorter interval(ie, 0.01s)
>> > with mlocking to avoid page fault delay when memory pressure
>> > is heavy. It would be troublesome.
>> >
>> > This patch adds new knob "mem_used_max" so user could see
>> > the maximum memory usage easily via reading the knob and reset
>> > it via "echo 0 > /sys/block/zram0/mem_used_max".
>> >
>> > Signed-off-by: Minchan Kim 
>> > ---
>> >  Documentation/ABI/testing/sysfs-block-zram | 10 ++
>> >  Documentation/blockdev/zram.txt|  1 +
>> >  drivers/block/zram/zram_drv.c  | 57 
>> > --
>> >  drivers/block/zram/zram_drv.h  |  1 +
>> >  4 files changed, 67 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/Documentation/ABI/testing/sysfs-block-zram 
>> > b/Documentation/ABI/testing/sysfs-block-zram
>> > index 025331c19045..ffd1ea7443dd 100644
>> > --- a/Documentation/ABI/testing/sysfs-block-zram
>> > +++ b/Documentation/ABI/testing/sysfs-block-zram
>> > @@ -120,6 +120,16 @@ Description:
>> > statistic.
>> > Unit: bytes
>> >
>> > +What:  /sys/block/zram/mem_used_max
>> > +Date:  August 2014
>> > +Contact:   Minchan Kim 
>> > +Description:
>> > +   The mem_used_max file is read/write and specifies the 
>> > amount
>> > +   of maximum memory zram have consumed to store compressed 
>> > data.
>> > +   For resetting the value, you should do "echo 0". Otherwise,
>> > +   you could see -EINVAL.
>> > +   Unit: bytes
>> > +
>> >  What:  /sys/block/zram/mem_limit
>> >  Date:  August 2014
>> >  Contact:   Minchan Kim 
>> > diff --git a/Documentation/blockdev/zram.txt 
>> > b/Documentation/blockdev/zram.txt
>> > index 9f239ff8c444..3b2247c2d4cf 100644
>> > --- a/Documentation/blockdev/zram.txt
>> > +++ b/Documentation/blockdev/zram.txt
>> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is 
>> > wasteful.
>> > orig_data_size
>> > compr_data_size
>> > mem_used_total
>> > +   mem_used_max
>> >
>> >  8) Deactivate:
>> > swapoff /dev/zram0
>> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> > index adc91c7ecaef..138787579478 100644
>> > --- a/drivers/block/zram/zram_drv.c
>> > +++ b/drivers/block/zram/zram_drv.c
>> > @@ -149,6 +149,41 @@ static ssize_t mem_limit_store(struct device *dev,
>> > return len;
>> >  }
>> >
>> > +static ssize_t mem_used_max_show(struct device *dev,
>> > +   struct device_attribute *attr, char *buf)
>> > +{
>> > +   u64 val = 0;
>> > +   struct zram *zram = dev_to_zram(dev);
>> > +
>> > +   down_read(&zram->init_lock);
>> > +   if (init_done(zram))
>> > +   val = atomic64_read(&zram->stats.max_used_pages);
>> > +   up_read(&zram->init_lock);
>> > +
>> > +   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
>> > +}
>> > +
>> > +static ssize_t mem_used_max_store(struct device *dev,
>> > +   struct device_attribute *attr, const char *buf, size_t len)
>> > +{
>> > +   int err;
>> > +   unsigned long val;
>> > +   struct zram *zram = dev_to_zram(dev);
>> > +   struct zram_meta *meta = zram->meta;
>> > +
>> > +   err = kstrtoul(buf, 10, &val);
>> > +   if (err || val != 0)
>> > +   return -EINVAL;
>> > +
>>
>> Yes - this works better for the user than explicit single "0" check
>> Thanks for testing.
>>
>> > +   down_read(&zram->init_lock);
>> > +   if (init_done(zram))
>> > +   atomic64_set(&zram->stats.max_used_pages,
>> > +   zs_get_total_size(meta->mem_pool));
>> > +   up_read(&zram->init_lock);
>> > +
>> > +   return len;
>> > +}
>> > +
>> >  static ssize_t max_comp_streams_store(struct device *dev,
>> > struct device_attribute *attr, const char *buf, size_t len)
>> >  {
>> > @@ -461,6 +496,18 @@ out_cleanup:
>> > return ret;
>> >  }
>> >
>> > +static inline void update_used_max(struct zram *zram, const unsigned long pages)
>> > +{
>> > +   u64 old_max, cur_max;
>> > +
>> > +   do {
>> > +   old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
>> > +   if (pages > cur_max)
>> > +   old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
>> > +   cur_max, pages);
>> > +   } while (old_max != cur_max);
>> > +}
>> > +
>>
>> 
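
The lock-free maximum tracking done by update_used_max() in the quoted patch
can be sketched in user space with C11 atomics. This is an analogue under
stated assumptions, not the kernel code: max_used_pages here is a
hypothetical global standing in for zram->stats.max_used_pages, and
atomic_compare_exchange_weak() takes the place of atomic64_cmpxchg().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for zram->stats.max_used_pages. */
static _Atomic uint64_t max_used_pages;

/* Raise the recorded maximum to `pages` if it is larger; never lower it. */
static void update_used_max(uint64_t pages)
{
	uint64_t cur = atomic_load(&max_used_pages);

	/*
	 * On failure, atomic_compare_exchange_weak() stores the value it
	 * found into `cur`, so the loop re-checks `pages` against the
	 * latest maximum before trying again.
	 */
	while (pages > cur &&
	       !atomic_compare_exchange_weak(&max_used_pages, &cur, pages))
		;
}
```

The loop only retries while the new value is still larger than what another
thread may have stored in the meantime, so concurrent updaters cannot move
the maximum backwards.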

Re: [PATCH 1/1] jump_label: tidy jump_label_ratelimit.h

2014-08-20 Thread Zhouyi Zhou
Thanks Jason for reviewing it 
> -Original Messages-
> From: "Jason Baron" 
> Sent Time: Thursday, August 21, 2014
> To: "Zhouyi Zhou" 
> Cc: drjo...@redhat.com, konrad.w...@oracle.com, 
> raghavendra...@linux.vnet.ibm.com, mi...@kernel.org, da...@davemloft.net, 
> han...@stressinduktion.org, linux-kernel@vger.kernel.org, "Zhouyi Zhou" 
> 
> Subject: Re: [PATCH 1/1] jump_label: tidy jump_label_ratelimit.h
> 
> Yes, that looks good. While at it I grep'd the tree for
> 'CONFIG_JUMP_LABEL', and found some uses in the
> netfilter code which should probably be
> 'HAVE_JUMP_LABEL' as well.
I have submitted two patches according to your suggestions:
https://lkml.org/lkml/2014/8/20/885
https://lkml.org/lkml/2014/8/20/883
Hope I have made them right

Thanks 
Zhouyi
> 
> Thanks,
> 
> -Jason
> 
> On 08/20/2014 05:29 AM, Zhouyi Zhou wrote:
> > jump_label_ratelimit.h is split from jump_label.h to enable the 
> > includers who don't want linux/workqueue.h.
> > As HAVE_JUMP_LABEL is only defined in jump_label.h, will following  
> > patch makes jump_labe_ratelimit.h more tidy?
> >
> > Compiled and Tested in x86_64
> > Signed-off-by: Zhouyi Zhou 
> > ---
> >  include/linux/jump_label_ratelimit.h |5 +
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/include/linux/jump_label_ratelimit.h 
> > b/include/linux/jump_label_ratelimit.h
> > index 089f70f..0d34d7e 100644
> > --- a/include/linux/jump_label_ratelimit.h
> > +++ b/include/linux/jump_label_ratelimit.h
> > @@ -4,15 +4,12 @@
> >  #include 
> >  #include 
> >  
> > -#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
> > +#ifdef HAVE_JUMP_LABEL
> >  struct static_key_deferred {
> > struct static_key key;
> > unsigned long timeout;
> > struct delayed_work work;
> >  };
> > -#endif
> > -
> > -#ifdef HAVE_JUMP_LABEL
> >  extern void static_key_slow_dec_deferred(struct static_key_deferred *key);
> >  extern void
> >  jump_label_rate_limit(struct static_key_deferred *key, unsigned long rl);
> 






Re: [PATCH v1 5/9] block: loop: convert to blk-mq

2014-08-20 Thread Ming Lei
On Thu, Aug 21, 2014 at 12:09 AM, Jens Axboe  wrote:
> On 2014-08-19 20:23, Ming Lei wrote:
>>
>> On Wed, Aug 20, 2014 at 4:50 AM, Jens Axboe  wrote:
>>>
>>> On 2014-08-18 06:53, Ming Lei wrote:


 On Mon, Aug 18, 2014 at 9:22 AM, Ming Lei 
 wrote:
>
>
> On Mon, Aug 18, 2014 at 1:48 AM, Jens Axboe  wrote:
>>
>>
>> On 2014-08-16 02:06, Ming Lei wrote:
>>>
>>>
>>>
>>> On 8/16/14, Jens Axboe  wrote:



 On 08/15/2014 10:36 AM, Jens Axboe wrote:
>
>
>
> On 08/15/2014 10:31 AM, Christoph Hellwig wrote:
>>>
>>>
>>>
>>> +static void loop_queue_work(struct work_struct *work)
>>
>>
>>
>>
>> Offloading work straight to a workqueue dosn't make much sense
>> in the blk-mq model as we'll usually be called from one.  If you
>> need to avoid the cases where we are called directly a flag for
>> the blk-mq code to always schedule a workqueue sounds like a much
>> better plan.
>
>
>
>
> That's a good point - would clean up this bit, and be pretty close
> to
> a
> one-liner to support in blk-mq for the drivers that always need
> blocking
> context.




 Something like this should do the trick - totally untested. But with
 that, loop would just need to add BLK_MQ_F_WQ_CONTEXT to it's tag
 set
 flags and it could always do the work inline from ->queue_rq().
>>>
>>>
>>>
>>>
>>> I think it is a good idea.
>>>
>>> But for loop, there may be two problems:
>>>
>>> - default max_active for bound workqueue is 256, which means several
>>> slow
>>> loop devices might slow down whole block system. With kernel AIO, it
>>> won't
>>> be a big deal, but some block/fs may not support direct I/O and still
>>> fallback to
>>> workqueue
>>>
>>> - 6. Guidelines of Documentation/workqueue.txt
>>> If there is dependency among multiple work items used during memory
>>> reclaim, they should be queued to separate wq each with
>>> WQ_MEM_RECLAIM.
>>
>>
>>
>>
>> Both are good points. But I think this mainly means that we should
>> support
>> this through a potentially per-dispatch queue workqueue, separate from
>> kblockd. There's no reason blk-mq can't support this with a per-hctx
>> workqueue, for drivers that need it.
>
>
>
> Good idea, and per-device workqueue should be enough if
> BLK_MQ_F_WQ_CONTEXT flag is set.



 Maybe for most of cases per-device class(driver) workqueue should be
 enough since dependency between devices driven by same driver
 isn't common, for example, loop over loop is absolutely insane.
>>>
>>>
>>>
>>> It's insane, but it can happen. And given how cheap it is to do a
>>> workqueue,
>>
>>
>> Workqueue with WQ_MEM_RECLAIM need to create a standalone kthread
>> for the queue, so at default there will be 8 kthreads created even no one
>> uses loop at all.  From current implementation the per-device thread is
>> created only when one file or blk device is attached to the loop device,
>> which
>> may not be possible when blk-mq supports per-device workqueue.
>
>
> That is true, but I don't see this as a huge problem. And idle kthread is
> pretty much free...

OK, I am fine with that too if no one complains that, :-)

BTW, loop over loop won't be a problem since loop driver can cut the
dependency and just use the original back file, so one workqueue should
be enough for all loop devices.

>
>
>>> I don't see a reason why we should not. Loop over loop might seem nutty,
>>> but
>>> it's not that far out into the realm of nutty things that people end up
>>> doing.
>>
>>
>> Another reason I am still not sure whether workqueue is good for loop,
>> though I do really like workqueue for the sake of simplicity :-)
>>
>> - sequential read becomes a bit slower with workqueue, especially for
>> some fast block devices (such as null_blk)
>>
>> - random read becomes a bit slower too for some fast devices (such as
>> null_blk) in some environments (it is reproducible on my server, but
>> not on my laptop), even though it can improve throughput quite a lot
>> for common devices (HDD, SSD, ...)
>
>
> Thread offloading will always slow down some use cases, like sync(ish) IO.
> Not sure this is a case against kthread vs workqueue, performance and
> behavior should be identical here?

It looks like no sync is involved, because I only tested randread with fio,
and the cause should be the same as the one described below.

>
>
>>  From my investigation, context switch increases almost 50% with
>> workqueue compared with kthread in loop in a quad-core VM. With
>> kthread, requests may be handled as batch in cases which won't be
>> blocked in read()/write()(like null_blk, tmpfs, ...), but it is 

[PATCH 1/1] netfilter/jump_label: use HAVE_JUMP_LABEL?

2014-08-20 Thread Zhouyi Zhou

CONFIG_JUMP_LABEL does not guarantee HAVE_JUMP_LABEL. If HAVE_JUMP_LABEL
is not defined, use the maintainer's own mutex to guard
the modification of global values.
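
A minimal sketch of the difference, as a userspace C model rather than
kernel code. It assumes that HAVE_JUMP_LABEL is derived from
CONFIG_JUMP_LABEL plus compiler support for asm goto (CC_HAVE_ASM_GOTO),
as include/linux/jump_label.h did around this time; the two helper
functions are hypothetical and exist only for illustration:

```c
/* Userspace model: macro names mirror the kernel's, the helpers are made up. */
#define CONFIG_JUMP_LABEL 1          /* option selected in .config */
/* CC_HAVE_ASM_GOTO is deliberately left undefined: compiler lacks asm goto. */

#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
#define HAVE_JUMP_LABEL
#endif

/* Returns 1 when code guarded by CONFIG_JUMP_LABEL would be compiled in. */
static int built_with_config_guard(void)
{
#if defined(CONFIG_JUMP_LABEL)
	return 1;
#else
	return 0;
#endif
}

/* Returns 1 when code guarded by HAVE_JUMP_LABEL would be compiled in. */
static int built_with_have_guard(void)
{
#ifdef HAVE_JUMP_LABEL
	return 1;
#else
	return 0;
#endif
}
```

With this configuration the first guard is taken and the second is not,
which is the mismatch the patch is concerned with.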

Signed-off-by: Zhouyi Zhou 
---
 include/linux/netfilter.h |5 +++--
 net/netfilter/core.c  |6 +++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 2077489..83a1952 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #ifdef CONFIG_NETFILTER
 static inline int NF_DROP_GETERR(int verdict)
@@ -99,8 +100,8 @@ void nf_unregister_sockopt(struct nf_sockopt_ops *reg);
 
 extern struct list_head nf_hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
 
-#if defined(CONFIG_JUMP_LABEL)
-#include 
+#ifdef HAVE_JUMP_LABEL
+
 extern struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
 static inline bool nf_hooks_active(u_int8_t pf, unsigned int hook)
 {
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index a93c97f..024a2e2 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -54,7 +54,7 @@ EXPORT_SYMBOL_GPL(nf_unregister_afinfo);
 struct list_head nf_hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS] __read_mostly;
 EXPORT_SYMBOL(nf_hooks);
 
-#if defined(CONFIG_JUMP_LABEL)
+#ifdef HAVE_JUMP_LABEL
 struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
 EXPORT_SYMBOL(nf_hooks_needed);
 #endif
@@ -72,7 +72,7 @@ int nf_register_hook(struct nf_hook_ops *reg)
 	}
 	list_add_rcu(&reg->list, elem->list.prev);
 	mutex_unlock(&nf_hook_mutex);
-#if defined(CONFIG_JUMP_LABEL)
+#ifdef HAVE_JUMP_LABEL
 	static_key_slow_inc(&nf_hooks_needed[reg->pf][reg->hooknum]);
 #endif
 	return 0;
@@ -84,7 +84,7 @@ void nf_unregister_hook(struct nf_hook_ops *reg)
 	mutex_lock(&nf_hook_mutex);
 	list_del_rcu(&reg->list);
 	mutex_unlock(&nf_hook_mutex);
-#if defined(CONFIG_JUMP_LABEL)
+#ifdef HAVE_JUMP_LABEL
 	static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]);
 #endif
 	synchronize_net();
-- 
1.7.10.4



Re: [PATCH v2 06/18] ARM64 / ACPI: Parse MADT to map logical cpu to MPIDR and get cpu_possible/present_map

2014-08-20 Thread Hanjun Guo
On 2014-8-20 22:38, Catalin Marinas wrote:
> On Tue, Aug 19, 2014 at 08:36:46AM +0100, Hanjun Guo wrote:
>> On 2014-8-18 22:27, Catalin Marinas wrote:
>>> On Mon, Aug 04, 2014 at 04:28:13PM +0100, Hanjun Guo wrote:
>>>> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
>>>> index 6e04868..e877967 100644
>>>> --- a/arch/arm64/include/asm/acpi.h
>>>> +++ b/arch/arm64/include/asm/acpi.h
>>>> @@ -64,6 +64,8 @@ static inline void arch_fix_phys_package_id(int num, u32 slot) { }
>>>>  extern int (*acpi_suspend_lowlevel)(void);
>>>>  #define acpi_wakeup_address 0
>>>>
>>>> +#define MAX_GIC_CPU_INTERFACE 65535
>>>
>>> Does this need to be more than NR_CPUS?
>>
>> Sometimes yes, CPU structure entries in MADT just like CPU nodes in
>> device tree, the number of them may more than NR_CPUS.
> 
> I have a more general question here. In ACPI, is MADT the only way to
> build a CPU topology? 

Unfortunately yes as far as I can tell.

> It looks weird that we use GIC properties to
> create the cpu_logical_map(). 

Actually, the information in the GICC structures in the MADT is used for
both GIC init and SMP init; the GICC structures represent the CPUs in the
system.

> A side-effect is that the GIC-related
> functions are now scattered all over the kernel rather than being
> contained in the GIC driver itself.

As patch 12/18 shows, all of the GIC-related code is contained in the
GIC driver. The GICC structure is not only GIC-related; it also describes
the CPUs in the system (I have to admit that the name GICC in ACPI is
confusing).

Thanks
Hanjun



[PATCH 1/1] powerpc/jump_label: use HAVE_JUMP_LABEL?

2014-08-20 Thread Zhouyi Zhou
CONFIG_JUMP_LABEL does not guarantee HAVE_JUMP_LABEL. If HAVE_JUMP_LABEL
is not defined, use the maintainer's own mutex to guard
the modification of global values.


Signed-off-by: Zhouyi Zhou 
---
 arch/powerpc/platforms/powernv/opal-tracepoints.c |2 +-
 arch/powerpc/platforms/pseries/lpar.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal-tracepoints.c b/arch/powerpc/platforms/powernv/opal-tracepoints.c
index d8a000a..ae14c40 100644
--- a/arch/powerpc/platforms/powernv/opal-tracepoints.c
+++ b/arch/powerpc/platforms/powernv/opal-tracepoints.c
@@ -2,7 +2,7 @@
 #include 
 #include 
 
-#ifdef CONFIG_JUMP_LABEL
+#ifdef HAVE_JUMP_LABEL
 struct static_key opal_tracepoint_key = STATIC_KEY_INIT;
 
 void opal_tracepoint_regfunc(void)
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 34e6423..059cfe0 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -642,7 +642,7 @@ EXPORT_SYMBOL(arch_free_page);
 #endif
 
 #ifdef CONFIG_TRACEPOINTS
-#ifdef CONFIG_JUMP_LABEL
+#ifdef HAVE_JUMP_LABEL
 struct static_key hcall_tracepoint_key = STATIC_KEY_INIT;
 
 void hcall_tracepoint_regfunc(void)
-- 
1.7.10.4



Re: [PATCH v3 4/4] zram: report maximum used memory

2014-08-20 Thread Minchan Kim
On Wed, Aug 20, 2014 at 10:20:07PM -0400, David Horner wrote:
> On Wed, Aug 20, 2014 at 8:27 PM, Minchan Kim  wrote:
> > Normally, zram user could get maximum memory usage zram consumed
> > via polling mem_used_total with sysfs in userspace.
> >
> > But it has a critical problem because the user can miss peak memory
> > usage during the polling interval. To avoid that, the user would
> > have to poll with a shorter interval (e.g., 0.01s) and use mlock to
> > avoid page fault delay when memory pressure is heavy. That would be
> > troublesome.
> >
> > This patch adds new knob "mem_used_max" so user could see
> > the maximum memory usage easily via reading the knob and reset
> > it via "echo 0 > /sys/block/zram0/mem_used_max".
> >
> > Signed-off-by: Minchan Kim 
> > ---
> >  Documentation/ABI/testing/sysfs-block-zram | 10 ++
> >  Documentation/blockdev/zram.txt|  1 +
> >  drivers/block/zram/zram_drv.c  | 57 
> > --
> >  drivers/block/zram/zram_drv.h  |  1 +
> >  4 files changed, 67 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-block-zram 
> > b/Documentation/ABI/testing/sysfs-block-zram
> > index 025331c19045..ffd1ea7443dd 100644
> > --- a/Documentation/ABI/testing/sysfs-block-zram
> > +++ b/Documentation/ABI/testing/sysfs-block-zram
> > @@ -120,6 +120,16 @@ Description:
> > statistic.
> > Unit: bytes
> >
> > +What:  /sys/block/zram/mem_used_max
> > +Date:  August 2014
> > +Contact:   Minchan Kim 
> > +Description:
> > +   The mem_used_max file is read/write and specifies the amount
> > +   of maximum memory zram have consumed to store compressed 
> > data.
> > +   For resetting the value, you should do "echo 0". Otherwise,
> > +   you could see -EINVAL.
> > +   Unit: bytes
> > +
> >  What:  /sys/block/zram/mem_limit
> >  Date:  August 2014
> >  Contact:   Minchan Kim 
> > diff --git a/Documentation/blockdev/zram.txt 
> > b/Documentation/blockdev/zram.txt
> > index 9f239ff8c444..3b2247c2d4cf 100644
> > --- a/Documentation/blockdev/zram.txt
> > +++ b/Documentation/blockdev/zram.txt
> > @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is 
> > wasteful.
> > orig_data_size
> > compr_data_size
> > mem_used_total
> > +   mem_used_max
> >
> >  8) Deactivate:
> > swapoff /dev/zram0
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index adc91c7ecaef..138787579478 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -149,6 +149,41 @@ static ssize_t mem_limit_store(struct device *dev,
> > return len;
> >  }
> >
> > +static ssize_t mem_used_max_show(struct device *dev,
> > +   struct device_attribute *attr, char *buf)
> > +{
> > +   u64 val = 0;
> > +   struct zram *zram = dev_to_zram(dev);
> > +
> > +   down_read(&zram->init_lock);
> > +   if (init_done(zram))
> > +       val = atomic64_read(&zram->stats.max_used_pages);
> > +   up_read(&zram->init_lock);
> > +
> > +   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> > +}
> > +
> > +static ssize_t mem_used_max_store(struct device *dev,
> > +   struct device_attribute *attr, const char *buf, size_t len)
> > +{
> > +   int err;
> > +   unsigned long val;
> > +   struct zram *zram = dev_to_zram(dev);
> > +   struct zram_meta *meta = zram->meta;
> > +
> > +   err = kstrtoul(buf, 10, &val);
> > +   if (err || val != 0)
> > +   return -EINVAL;
> > +
> 
> Yes - this works better for the user than explicit single "0" check
> Thanks for testing.
> 
> > +   down_read(&zram->init_lock);
> > +   if (init_done(zram))
> > +       atomic64_set(&zram->stats.max_used_pages,
> > +                    zs_get_total_size(meta->mem_pool));
> > +   up_read(&zram->init_lock);
> > +
> > +   return len;
> > +}
> > +
> >  static ssize_t max_comp_streams_store(struct device *dev,
> > struct device_attribute *attr, const char *buf, size_t len)
> >  {
> > @@ -461,6 +496,18 @@ out_cleanup:
> > return ret;
> >  }
> >
> > +static inline void update_used_max(struct zram *zram,
> > +                                   const unsigned long pages)
> > +{
> > +   u64 old_max, cur_max;
> > +
> > +   do {
> > +       old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
> > +       if (pages > cur_max)
> > +           old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
> > +                                      cur_max, pages);
> > +   } while (old_max != cur_max);
> > +}
> > +
> 
> This can be tightened up some:

How much tighter does it make it?
If the gain is not big, I'd like to stick with my version.

> 
> +static inline void update_used_max(struct zram *zram, const unsigned
> 

RE: [PATCH v2] flush_icache_range: Export symbol to fix build errors

2014-08-20 Thread Tony Lu
>-Original Message-
>Fix building errors occuring due to a missing export of flush_icache_range()
>in
>
>kisskb.ellerman.id.au/kisskb/buildresult/11677809/
>
>ERROR: "flush_icache_range" [drivers/misc/lkdtm.ko] undefined!
>
>Signed-off-by: Pranith Kumar 
>Reported-by: Geert Uytterhoeven 
>CC: Andrew Morton 
>---
> arch/arc/mm/cache_arc700.c | 1 +
> arch/hexagon/mm/cache.c| 1 +
> arch/sh/mm/cache.c | 1 +
> arch/tile/kernel/smp.c | 1 +
> arch/xtensa/kernel/smp.c   | 1 +
> 5 files changed, 5 insertions(+)
>

For Tile,

Acked-by: Zhigang Lu 


[PATCH V2 0/1] ipv4: net namespace does not inherit network configurations

2014-08-20 Thread Zhu Yanjun
V2: Following the advice from Cong Wang, I submit a patch as normal.

Hi all,

I ran a test on kernel 3.16-rc6:

root@qemu1:~# echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
root@qemu1:~# echo 1 > /proc/sys/net/ipv4/conf/all/forwarding
root@qemu1:~# ip netns list
root@qemu1:~# ip netns add fib1
root@qemu1:~# ip netns exec fib1 bash
root@qemu1:~# cat /proc/sys/net/ipv6/conf/all/forwarding
0
root@qemu1:~# cat /proc/sys/net/ipv4/conf/all/forwarding
1

The behavior of ipv4 and ipv6 is inconsistent. I checked the kernel
source code and found that the difference appeared with the patch
[ipv6: fix bad free of addrconf_init_net].

A net namespace is independent of any other; there is no relationship
between net namespaces. So the behavior of ipv4 is not correct.

Based on the patch [ipv6: fix bad free of addrconf_init_net], I made
a new patch to fix this problem on ipv4.
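
The idea can be sketched in userspace C. This is only a model under
stated assumptions, not the kernel code: struct devconf,
devconf_template and netns_devconf_alloc() are hypothetical names, and
malloc/memcpy stand in for kmemdup. Every namespace, including the
initial one, gets its own copy of a template that is never written to,
so a change made in one namespace cannot leak into a namespace created
later:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-namespace ipv4 device configuration. */
struct devconf {
	int forwarding;
};

/* Pristine defaults; nothing writes to this after build time. */
static const struct devconf devconf_template = { .forwarding = 0 };

/*
 * Allocate the configuration for a new namespace by copying the
 * template. Returns NULL on allocation failure; the caller frees it.
 */
static struct devconf *netns_devconf_alloc(void)
{
	struct devconf *conf = malloc(sizeof(*conf));

	if (conf)
		memcpy(conf, &devconf_template, sizeof(*conf));
	return conf;
}
```

If the initial namespace instead used the template directly, enabling
forwarding on the host would change what every later copy starts from,
which matches the ipv4 behavior in the test above.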

Any reply is appreciated. 

Zhu Yanjun (1):
  ipv4: net namespace does not inherit network configurations

 net/ipv4/devinet.c | 29 -
 1 file changed, 12 insertions(+), 17 deletions(-)

-- 
1.9.1



[PATCH 1/1] ipv4: net namespace does not inherit network configurations

2014-08-20 Thread Zhu Yanjun
The ipv4 net namespace code requires a logic change similar to the one
that commit c900a800 [ipv6: fix bad free of addrconf_init_net] introduces
for newer kernels.

A net namespace is independent of any other; there is no relationship
between net namespaces. So a new net namespace should not inherit
network configurations from another net namespace, including the host.

CC: Hong Zhiguo 
CC: David S. Miller 
Suggested-by: Cong Wang 
Signed-off-by: Zhu Yanjun 
---
 net/ipv4/devinet.c | 29 -
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index e944937..a16aa39 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2220,28 +2220,23 @@ static __net_init int devinet_init_net(struct net *net)
 #endif
 
 	err = -ENOMEM;
-	all = &ipv4_devconf;
-	dflt = &ipv4_devconf_dflt;
 
-	if (!net_eq(net, &init_net)) {
-		all = kmemdup(all, sizeof(ipv4_devconf), GFP_KERNEL);
-		if (all == NULL)
-			goto err_alloc_all;
-
-		dflt = kmemdup(dflt, sizeof(ipv4_devconf_dflt), GFP_KERNEL);
-		if (dflt == NULL)
-			goto err_alloc_dflt;
+	all = kmemdup(&ipv4_devconf, sizeof(ipv4_devconf), GFP_KERNEL);
+	if (all == NULL)
+		goto err_alloc_all;
 
+	dflt = kmemdup(&ipv4_devconf_dflt, sizeof(ipv4_devconf_dflt), GFP_KERNEL);
+	if (dflt == NULL)
+		goto err_alloc_dflt;
 #ifdef CONFIG_SYSCTL
-		tbl = kmemdup(tbl, sizeof(ctl_forward_entry), GFP_KERNEL);
-		if (tbl == NULL)
-			goto err_alloc_ctl;
+	tbl = kmemdup(tbl, sizeof(ctl_forward_entry), GFP_KERNEL);
+	if (tbl == NULL)
+		goto err_alloc_ctl;
 
-		tbl[0].data = &all->data[IPV4_DEVCONF_FORWARDING - 1];
-		tbl[0].extra1 = all;
-		tbl[0].extra2 = net;
+	tbl[0].data = &all->data[IPV4_DEVCONF_FORWARDING - 1];
+	tbl[0].extra1 = all;
+	tbl[0].extra2 = net;
 #endif
-	}
 
 #ifdef CONFIG_SYSCTL
 	err = __devinet_sysctl_register(net, "all", all);
-- 
1.9.1



[PATCH] regulator: hi6421: Remove unused fields from struct hi6421_regulator_info

2014-08-20 Thread Axel Lin
The valid_modes_mask and *dev fields are not used in this driver; remove them.
The current code uses devm_regulator_register, so we don't need *regulator in
hi6421_regulator_info. Use a local variable instead.

Also remove a few unnecessary header includes.

Signed-off-by: Axel Lin 
---
 drivers/regulator/hi6421-regulator.c | 31 ---
 1 file changed, 4 insertions(+), 27 deletions(-)

diff --git a/drivers/regulator/hi6421-regulator.c b/drivers/regulator/hi6421-regulator.c
index b0de92b..e389920 100644
--- a/drivers/regulator/hi6421-regulator.c
+++ b/drivers/regulator/hi6421-regulator.c
@@ -17,19 +17,13 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
 #include 
 #include 
 #include 
-#include 
-#include 
 
 /*
  * struct hi6421_regulator_pdata - Hi6421 regulator data of platform device
@@ -41,20 +35,14 @@ struct hi6421_regulator_pdata {
 
 /*
  * struct hi6421_regulator_info - hi6421 regulator information
- * @dev: device pointer
  * @desc: regulator description
- * @regulator: regulator device
  * @mode_mask: ECO mode bitmask of LDOs; for BUCKs, this masks sleep
  * @eco_microamp: eco mode load upper limit (in mA), valid for LDOs only
- * @valid_modes_mask: valid operating modes
  */
 struct hi6421_regulator_info {
-   struct device   *dev;
struct regulator_desc   desc;
-   struct regulator_dev*regulator;
u8  mode_mask;
u32 eco_microamp;
-   unsigned intvalid_modes_mask;
 };
 
 /* HI6421 regulators */
@@ -198,8 +186,6 @@ static const struct regulator_ops hi6421_buck345_ops;
},  \
.mode_mask  = ecomask,  \
.eco_microamp   = ecoamp,   \
-   .valid_modes_mask   = (REGULATOR_MODE_NORMAL\
-  | REGULATOR_MODE_IDLE),  \
}
 
 /* HI6421 LDO1~3 are linear voltage regulators at fixed uV_step
@@ -237,8 +223,6 @@ static const struct regulator_ops hi6421_buck345_ops;
},  \
.mode_mask  = ecomask,  \
.eco_microamp   = ecoamp,   \
-   .valid_modes_mask   = (REGULATOR_MODE_NORMAL\
-  | REGULATOR_MODE_IDLE),  \
}
 
 /* HI6421 LDOAUDIO is a linear voltage regulator with two 4-step ranges
@@ -276,8 +260,6 @@ static const struct regulator_ops hi6421_buck345_ops;
},  \
.mode_mask  = ecomask,  \
.eco_microamp   = ecoamp,   \
-   .valid_modes_mask   = (REGULATOR_MODE_NORMAL\
-  | REGULATOR_MODE_IDLE),  \
}
 
 /* HI6421 BUCK0/1/2 are linear voltage regulators at fixed uV_step
@@ -311,8 +293,6 @@ static const struct regulator_ops hi6421_buck345_ops;
.off_on_delay   = odelay,   \
},  \
.mode_mask  = sleepmask,\
-   .valid_modes_mask   = (REGULATOR_MODE_NORMAL\
-  | REGULATOR_MODE_STANDBY),   \
}
 
 /* HI6421 BUCK3/4/5 share similar configurations as LDOs, with exception
@@ -346,8 +326,6 @@ static const struct regulator_ops hi6421_buck345_ops;
.off_on_delay   = odelay,   \
},  \
.mode_mask  = sleepmask,\
-   .valid_modes_mask   = (REGULATOR_MODE_NORMAL\
-  | REGULATOR_MODE_STANDBY),   \
}
 
 /* HI6421 regulator information */
@@ -580,10 +558,10 @@ static int hi6421_regulator_register(struct platform_device *pdev,
 {
 	struct hi6421_regulator_info *info = NULL;
 	struct regulator_config config = { };
+	struct regulator_dev *rdev;
 
 	/* assign per-regulator data */
 	info = &hi6421_regulator_info[id];
-	info->dev = &pdev->dev;
 
 	config.dev = &pdev->dev;
 	config.init_data = init_data;
@@ -592,12 +570,11 @@ static int hi6421_regulator_register(struct platform_device *pdev,
 	config.of_node = np;
 
 	/* register regulator with framework */
-	info->regulator = devm_regulator_register(&pdev->dev, &info->desc,
-						   &config);
-	if (IS_ERR(info->regulator)) {
+	rdev = devm_regulator_register(&pdev->dev, &info->desc, &config);
+	if 

Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu

2014-08-20 Thread Don Zickus
On Thu, Aug 21, 2014 at 09:37:04AM +0800, Chai Wen wrote:
> On 08/19/2014 09:36 AM, Chai Wen wrote:
> 
> > On 08/19/2014 04:38 AM, Don Zickus wrote:
> > 
> >> On Mon, Aug 18, 2014 at 09:02:00PM +0200, Ingo Molnar wrote:
> >>>
> >>> * Don Zickus  wrote:
> >>>
> >>> So I agree with the motivation of this improvement, but 
> >>> is this implementation namespace-safe?
> >>
> >> What namespace are you worried about colliding with?  I 
> >> thought softlockup_ would provide the safety??  Maybe I 
> >> am missing something obvious. :-(
> >
> > I meant PID namespaces - a PID in itself isn't guaranteed 
> > to be unique across the system.
> 
>  Ah, I don't think we thought about that.  Is there a better 
>  way to do this?  Is there a domain id or something that can 
>  be OR'd with the pid?
> >>>
> >>> What is always unique is the task pointer itself. We use pids 
> >>> when we interface with user-space - but we don't really do that 
> >>> here, right?
> >>
> >> No, I don't believe so.  Ok, so saving 'current' and comparing that should
> >> be enough, correct?
> >>
> > 
> > 
> > I am not sure of the safety about using pid here with namespace.
> > But as to the pointer of process, is there a chance that we got a 
> > 'historical'
> > address saved in the 'softlockup_warn_pid(or address)_saved' and the current
> > hogging process happened to get the same task pointer address?
> > If it never happens, I think the comparing of address is ok.
> > 
> 
> 
> Hi Ingo
> 
> what do you think of Don's solution- 'comparing of task pointer' ?
> Anyway this is just an additional check about some very special cases,
> so I think the issue that I am concerned above is not a problem at all.
> And after learning some concepts about PID namespace, I think comparing
> of task pointer is reliable dealing with PID namespace here.
> 
> And Don, If you want me to re-post this patch, please let me know that.

Sure, just quickly test with the task pointer to make sure it still works
and then re-post.

Cheers,
Don


Re: LPC IOMMU and VFIO MicroConference - Call for Participation

2014-08-20 Thread Jiang Liu
Hi Alex and Joerg,
I have my travel request approved but missed the registration window.
Hope I will be lucky:)
Regards!
Gerry

On 2014/8/21 1:10, Alex Williamson wrote:
> 
> Ok folks, it's time to submit your discussion proposals for the LPC
> IOMMU and VFIO uconf.  If you added an idea to the wiki, now is the time
> to formally propose it as a discussion topic.  If you have ideas how to
> make the IOMMU or VFIO subsystems better, now is the time to propose it.
> If you can't figure out how to make something work in the current
> infrastructure, now is the time to propose a discussion.  If you're
> adding new features and want to make sure we can support them, now is
> the time to propose a discussion.
> 
> I don't think we've seen a formal schedule yet, but many of us have
> conflicts with KVM Forum this year and I expect the LPC planning
> committee to take that into account, so please submit your proposals
> anyway and feel free to note your availability/conflicts in the "Note to
> organizers" section.
> 
> LPC is full, but there is a waiting list and the sooner you can get on
> it, the more likely you are to be registered.  I expect uconf discussion
> leads to have an advantage in moving through the queue and we may be
> able to provide discounted registration for discussion leads.  Thanks,
> 
> Alex
> 
> On Tue, 2014-08-12 at 11:20 +0200, Joerg Roedel wrote:
>> LPC IOMMU and VFIO MicroConference - Call for Participation
>> ===
>>
>> We are pleased to announce that this year there will be the first IOMMU
>> and VFIO MicroConference held at Linux Plumbers Conference in
>> Düsseldorf. An initial request for support of this micro conference
>> generated, among others, the following possible topic ideas:
>>
>>  * Improving generic IOMMU code and move code out of drivers
>>  * IOMMU device error handling
>>  * IOMMU Power Management
>>  * Virtualizing IOMMUs
>>  * Interface between IOMMUs an memory management
>>
>> More suggested topics can be found at the wiki page of the micro
>> conference:
>>
>>  http://wiki.linuxplumbersconf.org/2014:iommu_microconference
>>
>> We now ask for formal proposals for these discussions along with any
>> other topics or problems that need to be discussed in this area.
>>
>> The format of the micro conference will be roughly half-hour slots for
>> each topic, where the discussion lead gives a short introduction to the
>> problem and maybe sketches possible solutions. The rest of the slot is
>> open for discussions so that we come to an agreement how to move
>> forward.
>>
>> Please submit your formal proposal on the Linux Plumbers website (OpenID
>> login required) until August 31st at:
>>
>>  
>> http://www.linuxplumbersconf.org/2014/how-to-submit-microconference-discussions-topics/
>>
>> Hope to see you in Düsseldorf!
>>
>>
>>  Joerg Roedel and Alex Williamson
>>
> 
> 
> 


Re: [PATCH v3 4/4] zram: report maximum used memory

2014-08-20 Thread David Horner
On Wed, Aug 20, 2014 at 8:27 PM, Minchan Kim  wrote:
> Normally, zram user could get maximum memory usage zram consumed
> via polling mem_used_total with sysfs in userspace.
>
> But it has a critical problem because the user can miss peak memory
> usage during the polling interval. To avoid that, the user would
> have to poll with a shorter interval (e.g., 0.01s) and use mlock to
> avoid page fault delay when memory pressure is heavy. That would be
> troublesome.
>
> This patch adds new knob "mem_used_max" so user could see
> the maximum memory usage easily via reading the knob and reset
> it via "echo 0 > /sys/block/zram0/mem_used_max".
>
> Signed-off-by: Minchan Kim 
> ---
>  Documentation/ABI/testing/sysfs-block-zram | 10 ++
>  Documentation/blockdev/zram.txt|  1 +
>  drivers/block/zram/zram_drv.c  | 57 
> --
>  drivers/block/zram/zram_drv.h  |  1 +
>  4 files changed, 67 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-block-zram 
> b/Documentation/ABI/testing/sysfs-block-zram
> index 025331c19045..ffd1ea7443dd 100644
> --- a/Documentation/ABI/testing/sysfs-block-zram
> +++ b/Documentation/ABI/testing/sysfs-block-zram
> @@ -120,6 +120,16 @@ Description:
> statistic.
> Unit: bytes
>
> +What:  /sys/block/zram/mem_used_max
> +Date:  August 2014
> +Contact:   Minchan Kim 
> +Description:
> +   The mem_used_max file is read/write and specifies the amount
> +   of maximum memory zram have consumed to store compressed data.
> +   For resetting the value, you should do "echo 0". Otherwise,
> +   you could see -EINVAL.
> +   Unit: bytes
> +
>  What:  /sys/block/zram/mem_limit
>  Date:  August 2014
>  Contact:   Minchan Kim 
> diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> index 9f239ff8c444..3b2247c2d4cf 100644
> --- a/Documentation/blockdev/zram.txt
> +++ b/Documentation/blockdev/zram.txt
> @@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is 
> wasteful.
> orig_data_size
> compr_data_size
> mem_used_total
> +   mem_used_max
>
>  8) Deactivate:
> swapoff /dev/zram0
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index adc91c7ecaef..138787579478 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -149,6 +149,41 @@ static ssize_t mem_limit_store(struct device *dev,
> return len;
>  }
>
> +static ssize_t mem_used_max_show(struct device *dev,
> +   struct device_attribute *attr, char *buf)
> +{
> +   u64 val = 0;
> +   struct zram *zram = dev_to_zram(dev);
> +
> +   down_read(&zram->init_lock);
> +   if (init_done(zram))
> +       val = atomic64_read(&zram->stats.max_used_pages);
> +   up_read(&zram->init_lock);
> +
> +   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> +}
> +
> +static ssize_t mem_used_max_store(struct device *dev,
> +   struct device_attribute *attr, const char *buf, size_t len)
> +{
> +   int err;
> +   unsigned long val;
> +   struct zram *zram = dev_to_zram(dev);
> +   struct zram_meta *meta = zram->meta;
> +
> +   err = kstrtoul(buf, 10, &val);
> +   if (err || val != 0)
> +   return -EINVAL;
> +

Yes - this works better for the user than explicit single "0" check
Thanks for testing.
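
For illustration, here is a userspace approximation of that check. It is
a sketch only: it uses strtoul rather than the kernel's kstrtoul, so
corner cases such as a leading sign may differ, and parse_reset_request
is a hypothetical name. It accepts any decimal spelling of zero, with or
without a trailing newline, and rejects everything else:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Return 0 if buf holds a base-10 zero, optionally followed by a single
 * newline (what "echo 0" writes); return -EINVAL for anything else.
 */
static int parse_reset_request(const char *buf)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;         /* overflow or no digits at all */
	if (*end == '\n')
		end++;                  /* tolerate one trailing newline */
	if (*end != '\0' || val != 0)
		return -EINVAL;         /* trailing junk or non-zero value */
	return 0;
}
```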

> +   down_read(&zram->init_lock);
> +   if (init_done(zram))
> +       atomic64_set(&zram->stats.max_used_pages,
> +                    zs_get_total_size(meta->mem_pool));
> +   up_read(&zram->init_lock);
> +
> +   return len;
> +}
> +
>  static ssize_t max_comp_streams_store(struct device *dev,
> struct device_attribute *attr, const char *buf, size_t len)
>  {
> @@ -461,6 +496,18 @@ out_cleanup:
> return ret;
>  }
>
> +static inline void update_used_max(struct zram *zram,
> +                                   const unsigned long pages)
> +{
> +   u64 old_max, cur_max;
> +
> +   do {
> +       old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
> +       if (pages > cur_max)
> +           old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
> +                                      cur_max, pages);
> +   } while (old_max != cur_max);
> +}
> +

This can be tightened up some:

+static inline void update_used_max(struct zram *zram,
+                                   const unsigned long pages)
+{
+   u64 prev_max, old_max = 0;
+
+   prev_max = atomic64_read(&zram->stats.max_used_pages);
+   while (pages > prev_max && prev_max != old_max) {
+       old_max = prev_max;
+       prev_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
+                                   old_max, pages);
+   }
+}
+
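
For comparison, the same "raise the stored maximum" loop can be sketched
in userspace with C11 atomics. This is an assumption-laden model, not
the zram code: it takes a pointer to a bare _Atomic counter instead of
struct zram, and it uses atomic_compare_exchange_weak, which reloads the
expected value on failure so no separate sentinel is needed. A loop
whose exit test compares against a zero-initialised sentinel would skip
the first update while the stored maximum is still 0; this form does
not have that case:

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Raise *max to pages if pages is larger. Safe against concurrent
 * callers: a failed compare-exchange stores the value that is currently
 * in *max into cur, and the loop re-checks against it.
 */
static void update_used_max(_Atomic uint64_t *max, uint64_t pages)
{
	uint64_t cur = atomic_load(max);

	while (pages > cur &&
	       !atomic_compare_exchange_weak(max, &cur, pages))
		;
}
```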

And then can be generalized to:


+static inline void 

Re: [PATCH 2/2] extcon: sm5502: EXTCON_SM5502 should depend on I2C

2014-08-20 Thread Chanwoo Choi
Hi Geert

Thanks for your report. I already sent a patch[1] to fix this build break
and I'll send a pull request to include this patch in 3.17-rc2.

[1] https://lkml.org/lkml/2014/8/13/761

Best Regards,
Chanwoo Choi


On 08/17/2014 07:08 PM, Geert Uytterhoeven wrote:
> EXTCON_SM5502 selects REGMAP_I2C, but if I2C=n:
> 
> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_byte_reg_read’:
> drivers/base/regmap/regmap-i2c.c:28: error: implicit declaration of function 
> ‘i2c_smbus_read_byte_data’
> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_byte_reg_write’:
> drivers/base/regmap/regmap-i2c.c:46: error: implicit declaration of function 
> ‘i2c_smbus_write_byte_data’
> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_word_reg_read’:
> drivers/base/regmap/regmap-i2c.c:64: error: implicit declaration of function 
> ‘i2c_smbus_read_word_data’
> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_smbus_word_reg_write’:
> drivers/base/regmap/regmap-i2c.c:82: error: implicit declaration of function 
> ‘i2c_smbus_write_word_data’
> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_i2c_write’:
> drivers/base/regmap/regmap-i2c.c:96: error: implicit declaration of function 
> ‘i2c_master_send’
> drivers/base/regmap/regmap-i2c.c: In function ‘regmap_i2c_gather_write’:
> drivers/base/regmap/regmap-i2c.c:117: error: implicit declaration of function 
> ‘i2c_check_functionality’
> drivers/base/regmap/regmap-i2c.c:130: error: implicit declaration of function 
> ‘i2c_transfer’
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  drivers/extcon/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/extcon/Kconfig b/drivers/extcon/Kconfig
> index 6f2f4727de2c..764f3a113e0a 100644
> --- a/drivers/extcon/Kconfig
> +++ b/drivers/extcon/Kconfig
> @@ -72,6 +72,7 @@ config EXTCON_PALMAS
>  
>  config EXTCON_SM5502
>   tristate "SM5502 EXTCON support"
> + depends on I2C
>   select IRQ_DOMAIN
>   select REGMAP_I2C
>   select REGMAP_IRQ
> 



Re: [PATCH v7 0/8] Per-user clock constraints

2014-08-20 Thread Andrew Lunn
On Mon, Aug 18, 2014 at 05:30:26PM +0200, Tomeu Vizoso wrote:
> Hi,
> 
> in this v7 of the patchset I have only rebased on top of 3.17rc1, with no 
> other
> changes. I have had to do a fair amount of fixing due to the rebase, more
> details below. Follows the original cover letter blurb:

Hi Tomeu

I would like to test this patchset, but patch 5 never made it to the
list. Do you have a git tree i can clone?

Thanks
Andrew


[V0 PATCH 1/2] AMD-PVH: set EFER.NX and EFER.SCE for the boot vcpu

2014-08-20 Thread Mukesh Rathor
On AMD, the NX feature must be enabled in EFER for the NX bit to be honored in
the pte entries; otherwise a protection fault occurs. We also set SCE so that
system calls are enabled.

Signed-off-by: Mukesh Rathor 
---
 arch/x86/xen/enlighten.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index c0cb11f..4af512d 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1499,6 +1499,17 @@ void __ref xen_pvh_secondary_vcpu_init(int cpu)
xen_pvh_set_cr_flags(cpu);
 }
 
+/* This is done in secondary_startup_64 for hvm guests. */
+static void __init xen_configure_efer(void)
+{
+   u64 efer;
+
+   rdmsrl(MSR_EFER, efer);
+   efer |= EFER_SCE;
+   efer |= (cpuid_edx(0x80000001) & (1 << 20)) ? EFER_NX : 0;
+   wrmsrl(MSR_EFER, efer);
+}
+
 static void __init xen_pvh_early_guest_init(void)
 {
if (!xen_feature(XENFEAT_auto_translated_physmap))
@@ -1508,6 +1519,7 @@ static void __init xen_pvh_early_guest_init(void)
return;
 
xen_have_vector_callback = 1;
+   xen_configure_efer();
xen_pvh_set_cr_flags(0);
 
 #ifdef CONFIG_X86_32
-- 
1.8.3.1
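
As an illustration of the EFER update in this patch, here is a minimal
user-space sketch in C. It is not the kernel code: the helper name
pvh_efer_value() is made up for this example, and the CPUID output is passed
in as a parameter instead of being read from hardware. The bit positions
(EFER.SCE is bit 0, EFER.NX is bit 11, and NX support is EDX bit 20 of CPUID
leaf 0x80000001) are the architectural ones.

```c
#include <stdint.h>

/* Architectural bit positions in the EFER MSR. */
#define EFER_SCE (1ULL << 0)   /* SYSCALL/SYSRET enable */
#define EFER_NX  (1ULL << 11)  /* no-execute enable */

/* CPUID leaf 0x80000001, EDX bit 20 reports NX support. */
#define CPUID_EXT_EDX_NX (1U << 20)

/*
 * Hypothetical helper: compute the EFER value the patch would write,
 * given the current EFER and the EDX output of CPUID leaf 0x80000001.
 */
static uint64_t pvh_efer_value(uint64_t efer, uint32_t cpuid_ext_edx)
{
	efer |= EFER_SCE;                 /* always enable system calls */
	if (cpuid_ext_edx & CPUID_EXT_EDX_NX)
		efer |= EFER_NX;          /* only when the CPU reports NX */
	return efer;
}
```

Keeping the computation separate from the MSR read and write makes it easy
to check that bits already set in EFER are preserved.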



[V0 PATCH 0/2] AMD PVH domU support

2014-08-20 Thread Mukesh Rathor
Hi,

Here's a first stab at AMD PVH domU support. Pretty much the only thing
needed is setting the EFER bits. Please review.

thanks,
Mukesh




[V0 PATCH 2/2] AMD-PVH: set EFER.NX and EFER.SCE for secondary vcpus

2014-08-20 Thread Mukesh Rathor
The secondary vcpus come up on kernel page tables which have the NX bit set
in the pte entries for DS/SS. On AMD, EFER.NX must be set to avoid a protection
fault.

Signed-off-by: Mukesh Rathor 
---
 arch/x86/xen/smp.c  | 28 
 arch/x86/xen/smp.h  |  1 +
 arch/x86/xen/xen-head.S | 21 +
 3 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 7005974..66058b9 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -37,6 +37,7 @@
 #include 
 #include "xen-ops.h"
 #include "mmu.h"
+#include "smp.h"
 
 cpumask_var_t xen_cpu_initialized_map;
 
@@ -99,8 +100,12 @@ static void cpu_bringup(void)
wmb();  /* make sure everything is out */
 }
 
-/* Note: cpu parameter is only relevant for PVH */
-static void cpu_bringup_and_idle(int cpu)
+/*
+ * Note: cpu parameter is only relevant for PVH. The reason for passing it
+ * is we can't do smp_processor_id until the percpu segments are loaded, for
+ * which we need the cpu number! So we pass it in rdi as first parameter.
+ */
+asmlinkage __visible void cpu_bringup_and_idle(int cpu)
 {
 #ifdef CONFIG_X86_64
if (xen_feature(XENFEAT_auto_translated_physmap) &&
@@ -374,11 +379,10 @@ cpu_initialize_context(unsigned int cpu, struct 
task_struct *idle)
ctxt->user_regs.fs = __KERNEL_PERCPU;
ctxt->user_regs.gs = __KERNEL_STACK_CANARY;
 #endif
-   ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle;
-
memset(&ctxt->fpu_ctxt, 0, sizeof(ctxt->fpu_ctxt));
 
if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+   ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle;
ctxt->flags = VGCF_IN_KERNEL;
ctxt->user_regs.eflags = 0x1000; /* IOPL_RING1 */
ctxt->user_regs.ds = __USER_DS;
@@ -416,12 +420,20 @@ cpu_initialize_context(unsigned int cpu, struct 
task_struct *idle)
 #ifdef CONFIG_X86_32
}
 #else
-   } else
-   /* N.B. The user_regs.eip (cpu_bringup_and_idle) is called with
-* %rdi having the cpu number - which means are passing in
-* as the first parameter the cpu. Subtle!
+   } else {
+   /*
+* The vcpu comes on kernel page tables which have the NX pte
+* bit set on AMD. This means before DS/SS is touched, NX in
+* EFER must be set. Hence the following assembly glue code.
+*/
+   ctxt->user_regs.eip = (unsigned long)pvh_cpu_bringup;
+
+   /* N.B. The bringup function cpu_bringup_and_idle is called with
+* %rdi having the cpu number - which means we are passing it in
+* as the first parameter. Subtle!
 */
ctxt->user_regs.rdi = cpu;
+   }
 #endif
ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
ctxt->ctrlreg[3] = xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));
diff --git a/arch/x86/xen/smp.h b/arch/x86/xen/smp.h
index c7c2d89..b20ba68 100644
--- a/arch/x86/xen/smp.h
+++ b/arch/x86/xen/smp.h
@@ -7,5 +7,6 @@ extern void xen_send_IPI_mask_allbutself(const struct cpumask 
*mask,
 extern void xen_send_IPI_allbutself(int vector);
 extern void xen_send_IPI_all(int vector);
 extern void xen_send_IPI_self(int vector);
+extern void pvh_cpu_bringup(int cpu);
 
 #endif
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 485b695..db8dca5 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -47,6 +47,27 @@ ENTRY(startup_xen)
 
__FINIT
 
+#ifdef CONFIG_XEN_PVH
+#ifdef CONFIG_X86_64
+/* Note that rdi contains the cpu number and must be preserved */
+ENTRY(pvh_cpu_bringup)
+   /* Gather features to see if NX implemented. (no EFER.NX on intel) */
+   movl $0x80000001, %eax
+   cpuid
+   movl %edx,%esi
+
+   movl $MSR_EFER, %ecx
+   rdmsr
+   btsl $_EFER_SCE, %eax
+
+   btl $20,%esi
+   jnc 1f  /* No NX, skip it */
+   btsl $_EFER_NX, %eax
+1: wrmsr
+   jmp cpu_bringup_and_idle
+#endif /* CONFIG_X86_64 */
+#endif /* CONFIG_XEN_PVH */
+
 .pushsection .text
.balign PAGE_SIZE
 ENTRY(hypercall_page)
-- 
1.8.3.1



RE: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the idle cpu

2014-08-20 Thread Liu, Chuansheng
Hello Daniel,

> -Original Message-
> From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org]
> Sent: Thursday, August 21, 2014 9:54 AM
> To: Liu, Chuansheng; l...@amacapital.net; pet...@infradead.org;
> r...@rjwysocki.net; mi...@redhat.com
> Cc: linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Liu, Changcheng;
> Wang, Xiaoming; Chakravarty, Souvik K
> Subject: Re: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the
> idle cpu
> 
> On 08/18/2014 10:37 AM, Chuansheng Liu wrote:
> > Implementing one new API wake_up_if_idle(), which is used to
> > wake up the idle CPU.
> 
> Is this patchset tested ? Did you check it solves the issue you were
> facing ?
We have run a basic test, and found that the cores can exit C0 quickly with
this patchset.
Basically, once _TIF_NEED_RESCHED is set, the poll_idle() loop is broken out of.

Please correct me if something is wrong, thanks.
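
To make the polling case concrete, here is a user-space model in C11 of what
set_nr_if_polling() in the quoted patch is understood to do: set the
reschedule flag only while the target is polling, so that the polling loop
notices it and no IPI is needed. This is a sketch under assumptions, not the
kernel implementation; the flag bit values and the name
model_set_nr_if_polling() are invented for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative flag bits; the kernel's real values are per-architecture. */
#define TIF_NEED_RESCHED_BIT (1u << 0)
#define TIF_POLLING_BIT      (1u << 1)

/*
 * Returns true if the target was polling and now has NEED_RESCHED set,
 * false if it was not polling (the caller would then fall back to an IPI).
 */
static bool model_set_nr_if_polling(atomic_uint *flags)
{
	unsigned int val = atomic_load(flags);

	for (;;) {
		if (!(val & TIF_POLLING_BIT))
			return false;   /* not polling: flag left untouched */
		if (val & TIF_NEED_RESCHED_BIT)
			return true;    /* wakeup already requested */
		/* On failure, val is refreshed with the current flags. */
		if (atomic_compare_exchange_weak(flags, &val,
						 val | TIF_NEED_RESCHED_BIT))
			return true;
	}
}
```

The compare-and-exchange loop matters because the idle task may clear its
polling bit concurrently; the flag must only be set if polling is still
observed in the same atomic step.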

> 
> > Suggested-by: Andy Lutomirski 
> > Signed-off-by: Chuansheng Liu 
> > ---
> >   include/linux/sched.h |1 +
> >   kernel/sched/core.c   |   16 
> >   2 files changed, 17 insertions(+)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 857ba40..3f89ac1 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1024,6 +1024,7 @@ struct sched_domain_topology_level {
> >   extern struct sched_domain_topology_level *sched_domain_topology;
> >
> >   extern void set_sched_topology(struct sched_domain_topology_level *tl);
> > +extern void wake_up_if_idle(int cpu);
> >
> >   #ifdef CONFIG_SCHED_DEBUG
> >   # define SD_INIT_NAME(type)   .name = #type
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 1211575..adf104f 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1620,6 +1620,22 @@ static void ttwu_queue_remote(struct
> task_struct *p, int cpu)
> > }
> >   }
> >
> > +void wake_up_if_idle(int cpu)
> > +{
> > +   struct rq *rq = cpu_rq(cpu);
> > +   unsigned long flags;
> > +
> > +   if (set_nr_if_polling(rq->idle)) {
> > +   trace_sched_wake_idle_without_ipi(cpu);
> > +   } else {
> > +   raw_spin_lock_irqsave(&rq->lock, flags);
> > +   if (rq->curr == rq->idle)
> > +   smp_send_reschedule(cpu);
> > +   /* Else cpu is not in idle, do nothing here */
> > +   raw_spin_unlock_irqrestore(&rq->lock, flags);
> > +   }
> > +}
> > +
> >   bool cpus_share_cache(int this_cpu, int that_cpu)
> >   {
> > return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
> >
> 
> 
> --
>    Linaro.org │ Open source software for ARM
> SoCs
> 
> Follow Linaro:   Facebook |
>  Twitter |
>  Blog



RE: [PATCH net-next 3/4] r8152: remove clear_bp function

2014-08-20 Thread Hayes Wang
> From: Sergei Shtylyov [mailto:sergei.shtyl...@cogentembedded.com]
> Sent: Wednesday, August 20, 2014 8:01 PM
> To: Hayes Wang; net...@vger.kernel.org
> Cc: nic_swsd; linux-kernel@vger.kernel.org; linux-...@vger.kernel.org
> Subject: Re: [PATCH net-next 3/4] r8152: remove clear_bp function
[...]
> > r8152b_disable_aldps(tp);
> >
> > -   rtl_clear_bp(tp);
> >
> 
> Why leave 2 empty lines? One is enough.

The next patch would use another function at the
same location, so I skipped removing the empty line and
then re-adding it. Is it better to remove it? I will
resend the patches if the answer is yes.
 
Best Regards,
Hayes


[PATCH] regulator: core: Add back the const qualifier for ops of struct regulator_desc

2014-08-20 Thread Axel Lin
Fix the build warning below:
CC [M]  drivers/regulator/hi6421-regulator.o
drivers/regulator/hi6421-regulator.c:356:2: warning: initialization discards 
'const' qualifier from pointer target type [enabled by default]

This is a revert of commit 716845ebeb50 ("regulator: core: Fix build error due
to const qualifier for ops"). The build error was fixed by commit 39f5460d7f9c
("regulator: core: add const to regulator_ops and fix build error in mc13892").

Signed-off-by: Axel Lin 
---
 include/linux/regulator/driver.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/regulator/driver.h b/include/linux/regulator/driver.h
index 3abda75..efe058f 100644
--- a/include/linux/regulator/driver.h
+++ b/include/linux/regulator/driver.h
@@ -246,7 +246,7 @@ struct regulator_desc {
int id;
bool continuous_voltage_range;
unsigned n_voltages;
-   struct regulator_ops *ops;
+   const struct regulator_ops *ops;
int irq;
enum regulator_type type;
struct module *owner;
-- 
1.9.1
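
For context, the warning comes from pointing a non-const member at a const
ops table. The sketch below uses cut-down stand-in types (ops_model and
desc_model are invented names, not the regulator API) to show the pattern
the patch restores: a const-qualified ops pointer lets drivers keep their
ops tables const.

```c
/* Cut-down stand-ins for regulator_ops and regulator_desc. */
struct ops_model {
	int (*get_voltage)(void);
};

struct desc_model {
	const struct ops_model *ops;   /* const-qualified, as in the patch */
};

static int fixed_get_voltage(void)
{
	return 3300000;   /* arbitrary example value, in microvolts */
}

/* A driver can keep its ops table const, so it lands in read-only data. */
static const struct ops_model fixed_ops = {
	.get_voltage = fixed_get_voltage,
};

/* Pointing at it does not discard the qualifier, so no warning is issued. */
static const struct desc_model fixed_desc = {
	.ops = &fixed_ops,
};
```

If the member were declared as a plain "struct ops_model *ops", the
initializer above would discard the const qualifier, which is the kind of
warning quoted in the commit message.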





Re: [PATCH 1/1] sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe

2014-08-20 Thread Vlad Yasevich
On 08/20/2014 05:31 AM, Zhu Yanjun wrote:
> Since the transport has always been in state SCTP_UNCONFIRMED, it
> therefore wasn't active before and hasn't been used before, and it
> always has been, so it is unnecessary to bug the user with a 
> notification.
> 
> Reported-by: Deepak Khandelwal   
> Suggested-by: Vlad Yasevich  
> Suggested-by: Michael Tuexen 
> Suggested-by: Daniel Borkmann 
> Signed-off-by: Zhu Yanjun 

Acked-by: Vlad Yasevich 

Thanks
-vlad
> ---
>  net/sctp/associola.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index 9de23a2..2e23f6b 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -813,6 +813,7 @@ void sctp_assoc_control_transport(struct sctp_association 
> *asoc,
>   else {
>   dst_release(transport->dst);
>   transport->dst = NULL;
> + ulp_notify = false;
>   }
>  
>   spc_state = SCTP_ADDR_UNREACHABLE;
> 



[PATCH] acct: eliminate compile warning

2014-08-20 Thread Ying Xue
If ACCT_VERSION is not defined as 3, the warning below appears:
  CC  kernel/acct.o
  kernel/acct.c: In function ‘do_acct_process’:
  kernel/acct.c:475:24: warning: unused variable ‘ns’ [-Wunused-variable]

Signed-off-by: Ying Xue 
---
 kernel/acct.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/acct.c b/kernel/acct.c
index b4c667d..bb52701 100644
--- a/kernel/acct.c
+++ b/kernel/acct.c
@@ -472,7 +472,6 @@ static void do_acct_process(struct bsd_acct_struct *acct)
acct_t ac;
unsigned long flim;
const struct cred *orig_cred;
-   struct pid_namespace *ns = acct->ns;
struct file *file = acct->file;
 
/*
@@ -500,9 +499,10 @@ static void do_acct_process(struct bsd_acct_struct *acct)
ac.ac_gid16 = ac.ac_gid;
 #endif
 #if ACCT_VERSION == 3
-   ac.ac_pid = task_tgid_nr_ns(current, ns);
+   ac.ac_pid = task_tgid_nr_ns(current, acct->ns);
rcu_read_lock();
-   ac.ac_ppid = task_tgid_nr_ns(rcu_dereference(current->real_parent), ns);
+   ac.ac_ppid = task_tgid_nr_ns(rcu_dereference(current->real_parent),
+acct->ns);
rcu_read_unlock();
 #endif
/*
-- 
1.7.9.5
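
The pattern behind this warning can be reproduced outside the kernel. In the
sketch below all names (MODEL_ACCT_VERSION, acct_model, record_model,
fill_record) are invented for illustration. A local variable that is read
only inside a conditionally compiled block becomes unused when the block is
compiled out; reading the struct member directly inside the block, as the
patch does, avoids that.

```c
/* Stand-in for ACCT_VERSION; any value other than 3 compiles the block out. */
#ifndef MODEL_ACCT_VERSION
#define MODEL_ACCT_VERSION 2
#endif

struct acct_model {
	int ns;     /* stands in for acct->ns */
	int file;   /* stands in for acct->file */
};

struct record_model {
	int pid_ns;
	int file;
};

static void fill_record(const struct acct_model *acct, struct record_model *rec)
{
	/*
	 * No local copy of acct->ns here: a local that is only read inside
	 * the #if block would be unused whenever the block is compiled out.
	 */
	rec->file = acct->file;
	rec->pid_ns = 0;
#if MODEL_ACCT_VERSION == 3
	rec->pid_ns = acct->ns;   /* read the member directly, as the patch does */
#endif
}
```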



Re: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the idle cpu

2014-08-20 Thread Daniel Lezcano

On 08/18/2014 10:37 AM, Chuansheng Liu wrote:

Implementing one new API wake_up_if_idle(), which is used to
wake up the idle CPU.


Is this patchset tested ? Did you check it solves the issue you were 
facing ?



Suggested-by: Andy Lutomirski 
Signed-off-by: Chuansheng Liu 
---
  include/linux/sched.h |1 +
  kernel/sched/core.c   |   16 
  2 files changed, 17 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 857ba40..3f89ac1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1024,6 +1024,7 @@ struct sched_domain_topology_level {
  extern struct sched_domain_topology_level *sched_domain_topology;

  extern void set_sched_topology(struct sched_domain_topology_level *tl);
+extern void wake_up_if_idle(int cpu);

  #ifdef CONFIG_SCHED_DEBUG
  # define SD_INIT_NAME(type)   .name = #type
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1211575..adf104f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1620,6 +1620,22 @@ static void ttwu_queue_remote(struct task_struct *p, int 
cpu)
}
  }

+void wake_up_if_idle(int cpu)
+{
+   struct rq *rq = cpu_rq(cpu);
+   unsigned long flags;
+
+   if (set_nr_if_polling(rq->idle)) {
+   trace_sched_wake_idle_without_ipi(cpu);
+   } else {
+   raw_spin_lock_irqsave(&rq->lock, flags);
+   if (rq->curr == rq->idle)
+   smp_send_reschedule(cpu);
+   /* Else cpu is not in idle, do nothing here */
+   raw_spin_unlock_irqrestore(&rq->lock, flags);
+   }
+}
+
  bool cpus_share_cache(int this_cpu, int that_cpu)
  {
return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);




--
  Linaro.org │ Open source software for ARM SoCs




[PATCH] mmc:sdhci: handle busy-end interrupt during command

2014-08-20 Thread Chanho Min
It is fully legal for a controller to start handling the busy-end interrupt
before it has signaled that the command has completed. So make sure
we do things in the proper order; otherwise the command interrupt
is ignored, which can cause unexpected operations. This was found on some
Toshiba eMMC devices, with the warning below.

"mmc0: Got command interrupt 0x0001 even though
no command operation was in progress."

This issue has also been reported by Youssef TRIKI:
It is not specific to Toshiba devices, and happens with eMMC devices
as well as SD cards which support Auto-CMD12 rather than CMD23.

Also, a similar patch was submitted by:
Gwendal Grignou 

Signed-off-by: Hankyung Yu 
Signed-off-by: Chanho Min 
Tested-by: Youssef TRIKI 
---
 drivers/mmc/host/sdhci.c  |   17 +++--
 include/linux/mmc/sdhci.h |1 +
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 47055f3..2383be0 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1007,6 +1007,7 @@ void sdhci_send_command(struct sdhci_host *host, struct mmc_command *cmd)
mod_timer(&host->timer, timeout);
 
host->cmd = cmd;
+   host->busy_handle = 0;
 
sdhci_prepare_data(host, cmd);
 
@@ -2238,8 +2239,12 @@ static void sdhci_cmd_irq(struct sdhci_host *host, u32 
intmask)
if (host->cmd->data)
DBG("Cannot wait for busy signal when also "
"doing a data transfer");
-   else if (!(host->quirks & SDHCI_QUIRK_NO_BUSY_IRQ))
+   else if (!(host->quirks & SDHCI_QUIRK_NO_BUSY_IRQ)
+   && !host->busy_handle) {
+   /* Mark that command complete before busy is ended */
+   host->busy_handle = 1;
return;
+   }
 
/* The controller does not support the end-of-busy IRQ,
 * fall through and take the SDHCI_INT_RESPONSE */
@@ -2302,7 +2307,15 @@ static void sdhci_data_irq(struct sdhci_host *host, u32 
intmask)
 */
if (host->cmd && (host->cmd->flags & MMC_RSP_BUSY)) {
if (intmask & SDHCI_INT_DATA_END) {
-   sdhci_finish_command(host);
+   /*
+* Some cards handle busy-end interrupt
+* before the command completed, so make
+* sure we do things in the proper order.
+*/
+   if (host->busy_handle)
+   sdhci_finish_command(host);
+   else
+   host->busy_handle = 1;
return;
}
}
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index 08abe99..f91085b 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -149,6 +149,7 @@ struct sdhci_host {
struct mmc_command *cmd;/* Current command */
struct mmc_data *data;  /* Current data request */
unsigned int data_early:1;  /* Data finished before cmd */
+   unsigned int busy_handle:1; /* Handling the order of Busy-end */
 
struct sg_mapping_iter sg_miter;/* SG state for PIO */
unsigned int blocks;/* remaining PIO blocks */
-- 
1.7.9.5
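
The ordering logic of this patch can be modelled as a small two-event state
machine: the command-complete and busy-end interrupts may arrive in either
order, and the command is finished only when the second of the two arrives.
The sketch below is a user-space model with invented names (host_model,
model_cmd_irq, model_busy_end_irq); it is not the sdhci driver code and only
mirrors the busy_handle handling shown in the diff.

```c
#include <stdbool.h>

/* Minimal stand-in for the fields the patch touches; not the real sdhci_host. */
struct host_model {
	bool busy_handle;   /* one of the two events has already been seen */
	int finished;       /* how many times the command was finished */
};

static void model_send_command(struct host_model *h)
{
	h->busy_handle = false;   /* reset for every new command */
}

/* Command-complete interrupt for a command that signals busy. */
static void model_cmd_irq(struct host_model *h)
{
	if (!h->busy_handle) {
		h->busy_handle = true;   /* busy-end not seen yet: wait for it */
		return;
	}
	h->finished++;                   /* busy-end came first: finish now */
}

/* Busy-end (data-end) interrupt. */
static void model_busy_end_irq(struct host_model *h)
{
	if (h->busy_handle)
		h->finished++;           /* command already complete: finish now */
	else
		h->busy_handle = true;   /* arrived early: the cmd irq will finish */
}
```

In both orders the command is finished exactly once, which is the property
the single-bit flag is meant to guarantee.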



Re: [PATCH] mtd: fsl_ifc_nand: Recover corrupted empty page for preventing read-only mount in UBIFS

2014-08-20 Thread Scott Wood
On Tue, 2014-04-01 at 01:49 +, Eunbong Song wrote:
> Even if the meaning of EUCLEAN was changed by commit edbc4540.
> There is still possibility of read-only mount in UBIFS with ubifs_scan() 
> "corrupt empty space at LEB".
> So i made this patch for fix that problem.

Please elaborate on the nature of the problem.

> This patch do as follow.
>  - If there are ecc errors which is equal to or less than chip->ecc.strength 
> in page.
>  - Check that page has how many zero bits, and if zero bits are equal to or 
> less than
>chip->ecc.strength then overwrite 1 to zero bits in buf.

This is difficult to parse, with no mention in this sentence that you're
talking about corrupted empty pages.

> ubifs_scan() cannot detect corrupted empty space because buf is recovered by 
> this patch.
> And this is safe because ecc controller can correct up to chip->ecc.strength 
> bits.

So the concern is that is_blank is failing to report a page that has not
been written to but has errors that would have been correctable if the
page had been written?

Do most drivers handle this?

> Signed-off-by: Eunbong Song 
> ---
>  drivers/mtd/nand/fsl_ifc_nand.c |   41 
> +++
>  1 files changed, 41 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/mtd/nand/fsl_ifc_nand.c b/drivers/mtd/nand/fsl_ifc_nand.c
> index 90ca7e7..2129c39 100644
> --- a/drivers/mtd/nand/fsl_ifc_nand.c
> +++ b/drivers/mtd/nand/fsl_ifc_nand.c
> @@ -277,6 +277,42 @@ static int is_blank(struct mtd_info *mtd, unsigned int 
> bufnum)
>   return 1;
>  }
>  
> +static int num_zero_bits(uint8_t val)
> +{
> + int i, ret=0;
> +
> + for(i=7; i>=0 ; i--)
> + if(!(0x1 & (val >> i)))
> + ret++;

Whitespace (here and elsewhere)

Also, use hweight8(~val) instead of reimplementing it.  Or better, use
hweight64() and process the data in larger chunks.
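
A rough user-space sketch of the idea follows. hweight8() is a kernel helper,
so a portable popcount8() stands in for it here; popcount8() and
count_zero_bits() are names made up for this example. The point is that the
zero bits of a byte are the set bits of its complement.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable 8-bit population count, standing in for the kernel's hweight8(). */
static unsigned int popcount8(uint8_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= (uint8_t)(v - 1);   /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Count zero bits in a buffer: the zero bits of b are the set bits of ~b. */
static unsigned int count_zero_bits(const uint8_t *buf, size_t len)
{
	unsigned int zeros = 0;
	size_t i;

	for (i = 0; i < len; i++)
		zeros += popcount8((uint8_t)~buf[i]);
	return zeros;
}
```

Processing the data in larger chunks would follow the same shape with a
wider popcount.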

> + return ret;
> +}
> +
> +static int is_corrupted_blank(struct mtd_info *mtd, uint8_t * buf)
> +{
> + struct nand_chip *chip = mtd->priv;
> + int i;
> + int zero_bits = 0;
> +
> + for (i = 0; i < mtd->writesize ; i++) {
> + if(buf[i] != 0xff) {
> + zero_bits += num_zero_bits(buf[i]); 
> + }
> + }
> +
> + if(zero_bits && (zero_bits <= chip->ecc.strength)){
> + return 1;
> + }
> +
> + return 0;
> +}

What if it's a page that legitimately has only a handful of zero bits?
You need to count zero bits in the ECC as well.

Also, this could be combined with is_blank().

> +static void recover_corrupted_blank(struct mtd_info *mtd, uint8_t * buf)
> +{
> + memset(buf, 0xff, mtd->writesize);
> + return;
> +}
> +
>  /* returns nonzero if entire page is blank */
>  static int check_read_ecc(struct mtd_info *mtd, struct fsl_ifc_ctrl *ctrl,
> u32 *eccstat, unsigned int bufnum)
> @@ -760,6 +796,11 @@ static int fsl_ifc_read_page(struct mtd_info *mtd, 
> struct nand_chip *chip,
>   if (ctrl->nand_stat != IFC_NAND_EVTER_STAT_OPC)
>   mtd->ecc_stats.failed++;
>  
> + if(nctrl->max_bitflips && (nctrl->max_bitflips <= chip->ecc.strength)){
> + if(is_corrupted_blank(mtd, buf))
> + recover_corrupted_blank(mtd, buf);
> + }

If the page is blank except for errors, most likely max_bitflips will be
zero because fsl_ifc_run_command() already considered it an
uncorrectable error and set ECCER instead.  Moving corrupted blank page
detection into is_blank() wouldn't have this problem.

How did you test this patch?

-Scott




RE: [PATCH v4 0/14] input: cyapa: re-architecture driver to support multi-trackpads in one driver

2014-08-20 Thread Dudley Du
Hi Dmitry, Patrik,

Is there any update or feedback on the re-submitted v4 cyapa driver patches?

Thanks,
Dudley

> -Original Message-
> From: linux-input-ow...@vger.kernel.org [mailto:linux-input-
> ow...@vger.kernel.org] On Behalf Of Dudley Du
> Sent: Thursday, July 17, 2014 2:45 PM
> To: Dmitry Torokhov; Rafael J. Wysocki
> Cc: Benson Leung; Patrik Fimml; linux-in...@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: [PATCH v4 0/14] input: cyapa: re-architecture driver to support
> multi-trackpads in one driver
>
> This patch set is based on kernel 3.16.0-rc5.
> It re-architects the cyapa driver so that one driver supports both
> the old gen3 and the new gen5 trackpad devices, which makes product
> support easier to match to customers' requirements, and it adds the
> sysfs functions and interfaces that users and customers require.
> The earlier gen3 and the latest gen5 trackpad devices use two
> different chipsets and have different protocols and interfaces.
> If they were supported by two different drivers, it would be
> difficult to manage products and later firmware updates: customers
> would not know which driver to use and update, because both trackpad
> devices have been integrated into the same products at the same time.
> Both devices must therefore be supported in the same driver.
>
> Compared to v3, it has the following changes:
> 1) Eliminate irq-state-remembering logic, remove irq helper functions;
> 2) Remove state sync helper functions in favor of mutex_lock/mutex_unlock;
> 3) Fix comment and character errors and inconsistencies;
> 4) Fix other issues that were pointed out in the review.
>
>
> The new architecture is made of:
> cyapa.c - the core of the architecture, supply interfaces and
> functions to system and read trackpad devices.
> cyapa_gen3.c - functions support for gen3 trackpad devices,
> cyapa_gen5.c - functions support for gen5 trackpad devices.
>
> Beside this introduction patch, it has 14 patches listed as below.
> For these patches each one is patched based on previous one.
>
> patch 1/14: re-architecture cyapa driver with core functions,
> and applying the device detecting function in async thread to speed
> up system boot time.
>
> patch 2/14: add cyapa driver power management interfaces support.
>
> patch 3/14: add cyapa driver runtime power management interfaces support.
>
> patch 4/14: add cyapa key function interfaces in sysfs system.
> Including read firmware version, get production ID, read baseline,
> re-calibrate trackpad baselines and do trackpad firmware update.
>
> patch 5/14: add read firmware image and read raw trackpad device'
> sensors' raw data interface in debugfs system.
>
> patch 6/14: add gen3 trackpad device basic functions support.
>
> patch 7/14: add gen3 trackpad device firmware update function support.
>
> patch 8/14: add gen3 trackpad device report baseline and do force
> re-calibrate functions support.
>
> patch 9/14: add gen3 trackpad device read firmware image function support.
>
> patch 10/14: add gen5 trackpad device basic functions support.
>
> patch 11/14: add gen5 trackpad device firmware update function support.
>
> patch 12/14: add gen5 trackpad device report baseline and do force
> re-calibrate functions support.
>
> patch 13/14: add gen5 trackpad device read firmware image and report
> sensors' raw data values functions support.
>
> patch 14/14: add function to monitor LID close event to off trackpad device.
>
This message and any attachments may contain Cypress (or its subsidiaries) 
confidential information. If it has been received in error, please advise the 
sender and immediately delete this message.

Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu

2014-08-20 Thread Chai Wen
On 08/19/2014 09:36 AM, Chai Wen wrote:

> On 08/19/2014 04:38 AM, Don Zickus wrote:
> 
>> On Mon, Aug 18, 2014 at 09:02:00PM +0200, Ingo Molnar wrote:
>>>
>>> * Don Zickus  wrote:
>>>
>>> So I agree with the motivation of this improvement, but 
>>> is this implementation namespace-safe?
>>
>> What namespace are you worried about colliding with?  I 
>> thought softlockup_ would provide the safety??  Maybe I 
>> am missing something obvious. :-(
>
> I meant PID namespaces - a PID in itself isn't guaranteed 
> to be unique across the system.

 Ah, I don't think we thought about that.  Is there a better 
 way to do this?  Is there a domain id or something that can 
 be OR'd with the pid?
>>>
>>> What is always unique is the task pointer itself. We use pids 
>>> when we interface with user-space - but we don't really do that 
>>> here, right?
>>
>> No, I don't believe so.  Ok, so saving 'current' and comparing that should
>> be enough, correct?
>>
> 
> 
> I am not sure of the safety about using pid here with namespace.
> But as to the pointer of process, is there a chance that we got a 'historical'
> address saved in the 'softlockup_warn_pid(or address)_saved' and the current
> hogging process happened to get the same task pointer address?
> If it never happens, I think the comparing of address is ok.
> 


Hi Ingo

What do you think of Don's solution, comparing the task pointer?
Anyway, this is just an additional check for some very special cases,
so I think the issue I raised above is not a problem at all.
And after learning some concepts about PID namespaces, I think comparing
the task pointer is reliable with respect to PID namespaces here.

And Don, if you want me to re-post this patch, please let me know.

thanks
chai wen

> thanks
> chai wen
> 
>> Cheers,
>> Don
>> .
>>
> 
> 
> 



-- 
Regards

Chai Wen


Re: [PATCHv3] thermal: exynos: Add support for TRIM_RELOAD feature at Exynos3250

2014-08-20 Thread Chanwoo Choi
Dear Eduardo,

On 08/20/2014 10:38 PM, edubez...@gmail.com wrote:
> Hello Chanwoo,
> 
> On Tue, Aug 19, 2014 at 7:52 PM, Chanwoo Choi  wrote:
>> This patch add support for TRIM_RELOAD feature at Exynos3250. The TMU of
>> Exynos3250 has two TRIMINFO_CON register.
> 
> Can you please split the two changes above into two patches? Meaning,
> one that adds TRIMINFO_CON2 and another that adds TRIM_RELOAD?

OK, I'll split this patch into two patches.

> 
>>
>> Signed-off-by: Chanwoo Choi 
>> Acked-by: Kyungmin Park 
>> Cc: Zhang Rui 
>> Cc: Eduardo Valentin 
>> Cc: Amit Daniel Kachhap 
>> ---
>> Changes from v2:
>> - Fix build break because of missing 'or' operation.
>> Changes from v1:
>> - Add missing 'TMU_SUPPORT_TRIM_RELOAD' feature
>>
>>  drivers/thermal/samsung/exynos_tmu.c  |  7 +--
>>  drivers/thermal/samsung/exynos_tmu.h  |  5 +++--
>>  drivers/thermal/samsung/exynos_tmu_data.c | 11 +--
>>  drivers/thermal/samsung/exynos_tmu_data.h |  7 +--
>>  4 files changed, 22 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/thermal/samsung/exynos_tmu.c 
>> b/drivers/thermal/samsung/exynos_tmu.c
>> index acbff14..ed01606 100644
>> --- a/drivers/thermal/samsung/exynos_tmu.c
>> +++ b/drivers/thermal/samsung/exynos_tmu.c
>> @@ -164,8 +164,11 @@ static int exynos_tmu_initialize(struct platform_device 
>> *pdev)
>> }
>> }
>>
>> -   if (TMU_SUPPORTS(pdata, TRIM_RELOAD))
>> -   __raw_writel(1, data->base + reg->triminfo_ctrl);
>> +   if (TMU_SUPPORTS(pdata, TRIM_RELOAD)) {
>> +   for (i = 0; i < pdata->triminfo_reload_count; i++)
>> +   __raw_writel(pdata->triminfo_reload[i],
>> +   data->base + reg->triminfo_ctrl[i]);
>> +   }
> 
> What is the logic behind the trim reload feature? Which SoCs support it?

The TRIMINFO_CONTROL register has a 'RELOAD' field. The TMU of an Exynos SoC has to set
the 'RELOAD' field of the TRIMINFO_CONTROL register before reading the TRIMINFO register.

As far as I know, the Exynos4412/Exynos4212 and Exynos3250 SoCs need the RELOAD feature.

> 
>>
>> if (pdata->cal_mode == HW_MODE)
>> goto skip_calib_data;
>> diff --git a/drivers/thermal/samsung/exynos_tmu.h 
>> b/drivers/thermal/samsung/exynos_tmu.h
>> index 1b4a644..72cb54e 100644
>> --- a/drivers/thermal/samsung/exynos_tmu.h
>> +++ b/drivers/thermal/samsung/exynos_tmu.h
>> @@ -151,8 +151,7 @@ struct exynos_tmu_registers {
>> u32 triminfo_25_shift;
>> u32 triminfo_85_shift;
>>
>> -   u32 triminfo_ctrl;
>> -   u32 triminfo_ctrl1;
>> +   u32 triminfo_ctrl[2];
> 
> 
> The above change needs to be documented.

OK, I'll add it.

> 
>> u32 triminfo_reload_shift;
>>
>> u32 tmu_ctrl;
>> @@ -295,6 +294,8 @@ struct exynos_tmu_platform_data {
>> u8 second_point_trim;
>> u8 default_temp_offset;
>> u8 test_mux;
>> +   u8 triminfo_reload[2];
>> +   u8 triminfo_reload_count;
>>
> 
> The above addition needs to be documented too.

OK, I'll add it.

> 
>> enum calibration_type cal_type;
>> enum calibration_mode cal_mode;
>> diff --git a/drivers/thermal/samsung/exynos_tmu_data.c 
>> b/drivers/thermal/samsung/exynos_tmu_data.c
>> index aa8e0de..8cd609c 100644
>> --- a/drivers/thermal/samsung/exynos_tmu_data.c
>> +++ b/drivers/thermal/samsung/exynos_tmu_data.c
>> @@ -95,6 +95,8 @@ static const struct exynos_tmu_registers 
>> exynos3250_tmu_registers = {
>> .triminfo_data = EXYNOS_TMU_REG_TRIMINFO,
>> .triminfo_25_shift = EXYNOS_TRIMINFO_25_SHIFT,
>> .triminfo_85_shift = EXYNOS_TRIMINFO_85_SHIFT,
>> +   .triminfo_ctrl[0] = EXYNOS_TMU_TRIMINFO_CON1,
>> +   .triminfo_ctrl[1] = EXYNOS_TMU_TRIMINFO_CON2,
>> .tmu_ctrl = EXYNOS_TMU_REG_CONTROL,
>> .test_mux_addr_shift = EXYNOS4412_MUX_ADDR_SHIFT,
>> .buf_vref_sel_shift = EXYNOS_TMU_REF_VOLTAGE_SHIFT,
>> @@ -160,8 +162,11 @@ static const struct exynos_tmu_registers 
>> exynos3250_tmu_registers = {
>> .temp_level = 95, \
>> }, \
>> .freq_tab_count = 2, \
>> +   .triminfo_reload[0] = 0x1, \
>> +   .triminfo_reload[1] = 0x11, \
> 
> What does 0x1 mean? How about 0x11?

The 'RELOAD' field of the TRIMINFO_CONTROL register is bit [0],
and the 'AC Time' field of the TRIMINFO_CONTROL register is bits [5:4].

0x1 sets the RELOAD field.
0x11 sets both the RELOAD field and the ACTIME field.
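
As a minimal sketch of how those two values decompose, assuming only the bit
positions given above (RELOAD at bit [0], AC Time at bits [5:4]): the macro and
function names below are illustrative and do not appear in the driver.

```c
#include <stdint.h>

/* Bit layout as described in the reply; all names here are hypothetical. */
#define TRIMINFO_RELOAD_SHIFT  0     /* RELOAD field: bit [0] */
#define TRIMINFO_ACTIME_SHIFT  4     /* AC Time field: bits [5:4] */
#define TRIMINFO_ACTIME_MASK   0x3u  /* two-bit field */

/* Compose a TRIMINFO_CONTROL value from the two fields. */
static inline uint32_t triminfo_ctrl_val(unsigned int reload, unsigned int actime)
{
	return ((uint32_t)(reload & 0x1u) << TRIMINFO_RELOAD_SHIFT) |
	       ((uint32_t)(actime & TRIMINFO_ACTIME_MASK) << TRIMINFO_ACTIME_SHIFT);
}
```

With this layout, RELOAD alone gives 0x1, and RELOAD plus an AC Time value of 1
gives 0x11.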

> 
>> +   .triminfo_reload_count = 2, \
> 
> What is count?

It is just the number of TRIMINFO_CONTROL registers.
Exynos4412/4212 has only one TRIMINFO_CONTROL register
and Exynos3250 has two TRIMINFO_CONTROL registers.
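
The role of the count can be sketched outside the kernel as follows. This is
only an illustration under stated assumptions: the struct and function names
are hypothetical, and a plain array stands in for the memory-mapped registers
that the driver writes with __raw_writel().

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRIMINFO_CTRL 2

/* Hypothetical stand-in for the two new platform-data fields. */
struct tmu_reload_cfg {
	uint8_t reload[MAX_TRIMINFO_CTRL];  /* value for each control register */
	uint8_t reload_count;               /* how many control registers exist */
};

/* Write one reload value per control register; regs[] mocks the MMIO space. */
static void tmu_trim_reload(volatile uint32_t *regs,
			    const struct tmu_reload_cfg *cfg)
{
	for (size_t i = 0; i < cfg->reload_count && i < MAX_TRIMINFO_CTRL; i++)
		regs[i] = cfg->reload[i];
}
```

A SoC with one control register sets the count to 1 and only the first entry is
written; a SoC with two sets it to 2.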

> 
>> .registers = &exynos3250_tmu_registers, \
>> -   .features = (TMU_SUPPORT_EMULATION | \
>> +   .features = (TMU_SUPPORT_EMULATION | TMU_SUPPORT_TRIM_RELOAD | \
>> TMU_SUPPORT_FALLING_TRIP | TMU_SUPPORT_READY_STATUS 
>> | \
>> 

Re: [PATCH 8/9] ARM: zynq: Remove hotplug.c

2014-08-20 Thread Daniel Lezcano

On 08/20/2014 10:41 PM, Soren Brinkmann wrote:

The hotplug code contains only a single function, which is an SMP
function. Move that to platsmp.c where all other SMP functions reside.
That allows removing hotplug.c and declaring the cpu_die function
static.

Signed-off-by: Soren Brinkmann 
---
  arch/arm/mach-zynq/Makefile  |  1 -
  arch/arm/mach-zynq/common.h  |  3 +--
  arch/arm/mach-zynq/hotplug.c | 17 -
  arch/arm/mach-zynq/platsmp.c | 18 ++
  4 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/arch/arm/mach-zynq/Makefile b/arch/arm/mach-zynq/Makefile
index 820dff6e1eba..c85fb3f7d5cd 100644
--- a/arch/arm/mach-zynq/Makefile
+++ b/arch/arm/mach-zynq/Makefile
@@ -6,5 +6,4 @@
  obj-y := common.o slcr.o pm.o
  CFLAGS_REMOVE_hotplug.o   =-march=armv6k
  CFLAGS_hotplug.o  =-Wa,-march=armv7-a -mcpu=cortex-a9
-obj-$(CONFIG_HOTPLUG_CPU)  += hotplug.o
  obj-$(CONFIG_SMP) += headsmp.o platsmp.o
diff --git a/arch/arm/mach-zynq/common.h b/arch/arm/mach-zynq/common.h
index c0773e87e83c..e6bb12c50a23 100644
--- a/arch/arm/mach-zynq/common.h
+++ b/arch/arm/mach-zynq/common.h
@@ -39,8 +39,7 @@ extern struct smp_operations zynq_smp_ops __initdata;

  extern void __iomem *zynq_scu_base;

-/* Hotplug */
-extern void zynq_platform_cpu_die(unsigned int cpu);
+int zynq_pm_late_init(void);

  int zynq_pm_late_init(void);

diff --git a/arch/arm/mach-zynq/hotplug.c b/arch/arm/mach-zynq/hotplug.c
index fe44a05677e2..b685c89f11e4 100644
--- a/arch/arm/mach-zynq/hotplug.c
+++ b/arch/arm/mach-zynq/hotplug.c
@@ -12,20 +12,3 @@
   */
  #include 

-/*
- * platform-specific code to shutdown a CPU
- *
- * Called with IRQs disabled
- */
-void zynq_platform_cpu_die(unsigned int cpu)
-{
-   zynq_slcr_cpu_state_write(cpu, true);
-
-   /*
-* there is no power-control hardware on this platform, so all
-* we can do is put the core into WFI; this is safe as the calling
-* code will have already disabled interrupts
-*/
-   for (;;)
-   cpu_do_idle();
-}
diff --git a/arch/arm/mach-zynq/platsmp.c b/arch/arm/mach-zynq/platsmp.c
index f77f7ca4c45b..04e578718aa2 100644
--- a/arch/arm/mach-zynq/platsmp.c
+++ b/arch/arm/mach-zynq/platsmp.c
@@ -132,6 +132,24 @@ static int zynq_cpu_kill(unsigned cpu)
zynq_slcr_cpu_stop(cpu);
return 1;
  }
+
+/*
+ * platform-specific code to shutdown a CPU
+ *
+ * Called with IRQs disabled
+ */
+static void zynq_platform_cpu_die(unsigned int cpu)
+{
+   zynq_slcr_cpu_state_write(cpu, true);
+
+   /*
+* there is no power-control hardware on this platform, so all
+* we can do is put the core into WFI; this is safe as the calling
+* code will have already disabled interrupts
+*/
+   for (;;)
+   cpu_do_idle();


IIUC, the cpu_do_idle() will flush the L1 cache and then call the WFI. 
It makes sense if we are about to power down the core. So I am wondering 
if we can just call wfi() instead.



+}
  #endif

  struct smp_operations zynq_smp_ops __initdata = {






Re: [PATCH 2/3] tg3: Fix tx_pending checks for tg3_tso_bug

2014-08-20 Thread Benjamin Poirier
On 2014/08/19 16:10, Michael Chan wrote:
> On Tue, 2014-08-19 at 11:52 -0700, Benjamin Poirier wrote: 
> > @@ -7838,11 +7838,14 @@ static int tg3_tso_bug(struct tg3 *tp, struct 
> > tg3_napi *tnapi,
> >struct netdev_queue *txq, struct sk_buff *skb)
> >  {
> > struct sk_buff *segs, *nskb;
> > -   u32 frag_cnt_est = skb_shinfo(skb)->gso_segs * 3;
> >  
> > -   /* Estimate the number of fragments in the worst case */
> > -   if (unlikely(tg3_tx_avail(tnapi) <= frag_cnt_est)) {
> > +   if (unlikely(tg3_tx_avail(tnapi) <= skb_shinfo(skb)->gso_segs)) {
> > +   trace_printk("stopping queue, %d <= %d\n",
> > +tg3_tx_avail(tnapi), 
> > skb_shinfo(skb)->gso_segs);
> > netif_tx_stop_queue(txq);
> > +   trace_printk("stopped queue\n");
> > +   tnapi->wakeup_thresh = skb_shinfo(skb)->gso_segs;
> > +   BUG_ON(tnapi->wakeup_thresh >= tnapi->tx_pending);
> >  
> > /* netif_tx_stop_queue() must be done before checking
> >  * checking tx index in tg3_tx_avail() below, because in 
> 
> I don't quite understand this logic and I must be missing something.
> gso_segs is the number of TCP segments the large packet will be broken
> up into.  If it exceeds dev->gso_max_segs, it means it exceeds
> hardware's capability and it will do GSO instead of TSO.  But in this
> case in tg3_tso_bug(), we are doing GSO and we may not have enough DMA
> descriptors to do GSO.  Each gso_seg typically requires 2 DMA
> descriptors.

You're right, I had wrongly assumed that the skbs coming out of
skb_gso_segment() were linear. I'll address that in v2 of the patch by masking
out NETIF_F_SG in tg3_tso_bug().

I noticed another issue that had not occurred to me: when tg3_tso_bug is
submitting a full gso segs sequence to tg3_start_xmit, the code at the end of
that function stops the queue before the end of the sequence because tx_avail
becomes smaller than (MAX_SKB_FRAGS + 1). The transmission actually proceeds
because tg3_tso_bug() does not honour the queue state but it seems rather
unsightly to me. I'm trying different solutions to this and will resubmit.
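
For reference, the check that the quoted hunk removes can be sketched as
below. This is a standalone illustration, not driver code: the function name
and the per-segment constant are hypothetical, with the factor of 3 taken from
the removed frag_cnt_est line.

```c
#include <stdbool.h>
#include <stdint.h>

/* Worst-case descriptors per GSO segment, as in the removed frag_cnt_est. */
#define DESC_PER_SEG_WORST 3u

/*
 * Return true when the free descriptor count cannot cover the worst-case
 * estimate for all segments, i.e. when the queue would have to be stopped.
 */
static bool tso_bug_must_stop(uint32_t tx_avail, uint32_t gso_segs)
{
	uint32_t frag_cnt_est = gso_segs * DESC_PER_SEG_WORST;

	return tx_avail <= frag_cnt_est;
}
```

Comparing tx_avail against gso_segs alone only holds if every segment fits in
one descriptor, which is the linear-skb assumption discussed above.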


Re: [PATCH] zram: add num_discards for discarded pages stat

2014-08-20 Thread Minchan Kim
Hi Chao,

On Wed, Aug 20, 2014 at 04:20:48PM +0800, Chao Yu wrote:
> Hi Minchan,
> 
> > -Original Message-
> > From: Minchan Kim [mailto:minc...@kernel.org]
> > Sent: Wednesday, August 20, 2014 10:09 AM
> > To: Sergey Senozhatsky
> > Cc: Chao Yu; linux-kernel@vger.kernel.org; linux...@kvack.org; 
> > ngu...@vflare.org; 'Jerome
> > Marchand'; 'Andrew Morton'
> > Subject: Re: [PATCH] zram: add num_discards for discarded pages stat
> > 
> > Hi Sergey,
> > 
> > On Tue, Aug 19, 2014 at 08:25:00PM +0900, Sergey Senozhatsky wrote:
> > > Hello,
> > >
> > > On (08/19/14 13:45), Chao Yu wrote:
> > > > > On (08/15/14 11:27), Chao Yu wrote:
> > > > > > We now support handling discard requests sent by the filesystem,
> > > > > > but there is no interface to show discard information.
> > > > > > This patch adds num_discards to count discarded pages and exports
> > > > > > it to sysfs for display.
> > > > > >
> > > > >
> > > > > A side question: we account discarded pages via slot free notify in
> > > > > notify_free and via req_discard in num_discards. How about accounting
> > > > > both of them in num_discards? After all, they both count the number
> > > > > of discarded pages (zram_free_page()). Or is there any particular
> > > > > reason we want to distinguish them?
> > > >
> > > > Yeah, I agree with you, as I have no such reason unless our users
> > > > explicitly require notify_free/num_discards to be shown separately
> > > > later.
> > > >
> > > > What do you think of sending another patch to merge these two counts?
> > > >
> > >
> > > Minchan, what do you think? let's account discarded pages in one place.
> > 
> > First of all, I'd like to know why we need num_discards.
> > It should be in description and depends on it whether we should merge both
> > counts or separate.
> 
> Oh, that was my mistake.
> 
> Commit 9b9913d80b2896ecd9e0a1a8f167ccad66fac79c (Staging: zram: Update
> zram documentation) and commit e98419c23b1a189c932775f7833e94cb5230a16b
> (Staging: zram: Document sysfs entries) designed a description of the
> 'discard' stat and added it to zram.txt and sysfs-block-zram. Because the
> function for handling discard requests was never implemented, that
> description was removed from the documents in commit
> 8dd1d3247e6c00b50ef83934ea8b22a1590015de (zram: document failed_reads,
> failed_writes stats).

Thanks for letting me know the history.

> 
> Now that discard handling is supported, it's better to resume the discard
> count stat; it gives users one more kind of runtime information about
> zram, as the other stats do.
> 
> What do you think?

I'm not strongly against the idea, but just "resume is better" and
"one more is problem as other stats supported" is not logical
to me.

You should explain why we need the new stat and what kind of benefit
users can get from it. Otherwise, we can't know whether the stat
is the best fit for the goal.


I might be paranoid about small stuff, and I admit I'm not good at it
either, but please understand that adding a new feature requires a
good description, which should include a clear goal.

I hope I'm not discouraging. :)

> 
> > 
> > Thanks.
> > 
> > 
> > 
> > >
> > > > One more thing: I forgot to update the zram documentation, sorry
> > > > about that. Let me update it in v2.
> > >
> > > thanks.
> > >
> > >   -ss
> > >
> > > > Thanks,
> > > > Yu
> > > >
> > > > >
> > > > >   -ss
> > > > >
> > > > > > Signed-off-by: Chao Yu 
> > > > > > ---
> > > > > >  Documentation/ABI/testing/sysfs-block-zram | 10 ++
> > > > > >  drivers/block/zram/zram_drv.c  |  3 +++
> > > > > >  drivers/block/zram/zram_drv.h  |  1 +
> > > > > >  3 files changed, 14 insertions(+)
> > > > > >
> > > > > > diff --git a/Documentation/ABI/testing/sysfs-block-zram
> > > > > b/Documentation/ABI/testing/sysfs-block-zram
> > > > > > index 70ec992..fa8936e 100644
> > > > > > --- a/Documentation/ABI/testing/sysfs-block-zram
> > > > > > +++ b/Documentation/ABI/testing/sysfs-block-zram
> > > > > > @@ -57,6 +57,16 @@ Description:
> > > > > > The failed_writes file is read-only and specifies the 
> > > > > > number of
> > > > > > failed writes happened on this device.
> > > > > >
> > > > > > +
> > > > > > +What:  /sys/block/zram/num_discards
> > > > > > +Date:  August 2014
> > > > > > +Contact:   Chao Yu 
> > > > > > +Description:
> > > > > > +   The num_discards file is read-only and specifies the 
> > > > > > number of
> > > > > > +   physical blocks which are discarded by this device. 
> > > > > > These blocks
> > > > > > +   are included in discard request which is sended by 
> > > > > > filesystem as
> > > > > > +   the blocks are no longer used.
> > > > > > +
> > > > > >  What:  /sys/block/zram/max_comp_streams
> > > > > >  Date: 

[PATCHSET 0/2] perf hists browser: Cleanup callchain routines (v2)

2014-08-20 Thread Namhyung Kim
Hello,

This patchset fixes and cleans up the TUI callchain routines.  I tried to
consolidate similar functions without breaking the current output.
Hopefully it makes the code more readable and maintainable.

It should not change any behavior or output.  I verified this by
expanding with the 'E' key and dumping with the 'P' key in the TUI, then
running "diff -u" on the results from before and after the patchset.

Actually I have more changes that would change (or improve) some
behavior and output.  I'll post them after this patchset is merged.


 * changes in v2)
  - drop patch 1-3 in v1 since it's already merged
  - update description  (Arnaldo)
  - remove stylish changes  (Arnaldo)
  - remove unnecessary 'has_single_node' check


You can get this from 'perf/callchain-fix-v2' branch on my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks
Namhyung


Namhyung Kim (2):
  perf hists browser: Cleanup callchain print functions
  perf hists browser: Consolidate callchain print functions in TUI

 tools/perf/ui/browsers/hists.c | 302 +
 1 file changed, 95 insertions(+), 207 deletions(-)

-- 
2.0.0



[PATCH v2 1/2] perf hists browser: Cleanup callchain print functions

2014-08-20 Thread Namhyung Kim
The hist_browser__show_callchain() and friends don't need to be that
complex.  They're split into 3 pieces: one for traversing the top-level
tree, another for special-casing the first chains in the top-level
entries, and a last one for recursively traversing inner trees.  It led
to code duplication and unnecessary complexity IMHO.

Simplify the function and consolidate the logic into a single function
- it can recursively call itself.  A little difference in printing
callchains in top-level tree can be handled with a small change.

It should have no functional change.
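
The general shape of such a consolidation can be sketched in plain C as below.
This is an illustration only, not the perf code: the node type and function
name are hypothetical, and a first-child/next-sibling list stands in for the
rb-tree of callchain nodes.

```c
#include <stdio.h>

#define LEVEL_OFFSET_STEP 3

/* Hypothetical stand-in for a callchain node. */
struct node {
	const char *name;
	const struct node *child;  /* first child, or NULL */
	const struct node *next;   /* next sibling, or NULL */
};

/*
 * One recursive walker handles both the top level and the inner trees;
 * the level argument carries the only difference (indentation here).
 * Returns the number of rows printed.
 */
static int show_tree(FILE *fp, const struct node *n, int level)
{
	int rows = 0;

	for (; n; n = n->next) {
		fprintf(fp, "%*s%s\n", level * LEVEL_OFFSET_STEP, "", n->name);
		rows++;
		rows += show_tree(fp, n->child, level + 1);
	}
	return rows;
}
```

Any behavior that differs between the top level and inner levels becomes a
condition on level inside the single function, which is what the
"level > 1" checks in the diff below do.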

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/browsers/hists.c | 112 +++--
 1 file changed, 29 insertions(+), 83 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index d42d8a8f3810..519353d9f5fb 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -502,23 +502,16 @@ static void hist_browser__show_callchain_entry(struct 
hist_browser *browser,
 
 #define LEVEL_OFFSET_STEP 3
 
-static int hist_browser__show_callchain_node_rb_tree(struct hist_browser 
*browser,
-struct callchain_node 
*chain_node,
-u64 total, int level,
-unsigned short row,
-off_t *row_offset,
-bool *is_current_entry)
+static int hist_browser__show_callchain(struct hist_browser *browser,
+   struct rb_root *root, int level,
+   unsigned short row, off_t *row_offset,
+   u64 total, bool *is_current_entry)
 {
struct rb_node *node;
int first_row = row, offset = level * LEVEL_OFFSET_STEP;
u64 new_total;
 
-   if (callchain_param.mode == CHAIN_GRAPH_REL)
-   new_total = chain_node->children_hit;
-   else
-   new_total = total;
-
-   node = rb_first(&chain_node->rb_root);
+   node = rb_first(root);
while (node) {
struct callchain_node *child = rb_entry(node, struct 
callchain_node, rb_node);
struct rb_node *next = rb_next(node);
@@ -535,7 +528,7 @@ static int hist_browser__show_callchain_node_rb_tree(struct 
hist_browser *browse
 
if (first)
first = false;
-   else
+   else if (level > 1)
extra_offset = LEVEL_OFFSET_STEP;
 
folded_sign = callchain_list__folded(chain);
@@ -547,8 +540,9 @@ static int hist_browser__show_callchain_node_rb_tree(struct 
hist_browser *browse
alloc_str = NULL;
str = callchain_list__sym_name(chain, bf, sizeof(bf),
   browser->show_dso);
-   if (was_first) {
-   double percent = cumul * 100.0 / new_total;
+
+   if (was_first && level > 1) {
+   double percent = cumul * 100.0 / total;
 
if (asprintf(&alloc_str, "%2.2f%% %s", percent, 
str) < 0)
str = "Not enough memory!";
@@ -571,78 +565,23 @@ do_next:
 
if (folded_sign == '-') {
const int new_level = level + (extra_offset ? 2 : 1);
-   row += 
hist_browser__show_callchain_node_rb_tree(browser, child, new_total,
-
new_level, row, row_offset,
-
is_current_entry);
-   }
-   if (row == browser->b.rows)
-   goto out;
-   node = next;
-   }
-out:
-   return row - first_row;
-}
-
-static int hist_browser__show_callchain_node(struct hist_browser *browser,
-struct callchain_node *node,
-int level, unsigned short row,
-off_t *row_offset,
-bool *is_current_entry)
-{
-   struct callchain_list *chain;
-   int first_row = row;
-   int offset = level * LEVEL_OFFSET_STEP;
-   char folded_sign = ' ';
-
-   list_for_each_entry(chain, &node->val, list) {
-   char bf[1024], *s;
 
-   folded_sign = callchain_list__folded(chain);
+   if (callchain_param.mode == CHAIN_GRAPH_REL)
+   new_total = child->children_hit;
+   else
+   new_total = total;
 
-   if 

[PATCH v2 2/2] perf hists browser: Consolidate callchain print functions in TUI

2014-08-20 Thread Namhyung Kim
Currently there are two callchain print functions in the TUI - one for the
hists browser and another for file dump.  They do almost the same job, so
it'd be better to consolidate the code.

To do that, move the row calculation code into a print callback so that
the dump code is not limited by the current screen size.
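
The callback idea can be sketched independently of perf as follows. All names
here are hypothetical; the argument struct is only modeled on the
callchain_print_arg introduced in the diff below.

```c
#include <stdio.h>

/* Hypothetical argument block shared by both output modes. */
struct print_arg {
	unsigned short row;  /* screen mode: rows consumed so far */
	FILE *fp;            /* dump mode: destination stream */
	int printed;         /* dump mode: characters written */
};

typedef void (*print_entry_fn)(const char *str, int offset,
			       struct print_arg *arg);

/* Screen mode: the callback itself accounts for the row it uses. */
static void count_row_entry(const char *str, int offset, struct print_arg *arg)
{
	(void)str;
	(void)offset;
	arg->row++;
}

/* Dump mode: no row bookkeeping, so screen height never applies. */
static void fprintf_entry(const char *str, int offset, struct print_arg *arg)
{
	arg->printed += fprintf(arg->fp, "%*s%s\n", offset, "", str);
}

/* The shared walker knows nothing about rows or files. */
static void walk_entries(const char *const *entries, int n,
			 print_entry_fn print, struct print_arg *arg)
{
	for (int i = 0; i < n; i++)
		print(entries[i], i * 3, arg);
}
```

Because row accounting lives in the screen callback, the same walker can dump
an arbitrarily long callchain to a file.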

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/browsers/hists.c | 210 +++--
 1 file changed, 76 insertions(+), 134 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 519353d9f5fb..48d8c8eee6c2 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -477,20 +477,32 @@ static char *callchain_list__sym_name(struct 
callchain_list *cl,
return bf;
 }
 
+struct callchain_print_arg {
+   /* for hists browser */
+   unsigned short row;
+   off_t row_offset;
+   bool is_current_entry;
+
+   /* for file dump */
+   FILE *fp;
+   int printed;
+};
+
 static void hist_browser__show_callchain_entry(struct hist_browser *browser,
   struct callchain_list *chain,
-  unsigned short row, int offset,
-  char folded_sign, const char 
*str,
-  bool *is_current_entry)
+  const char *str, int offset,
+  struct callchain_print_arg *arg)
 {
int color, width;
+   unsigned short row = arg->row;
+   char folded_sign = callchain_list__folded(chain);
 
color = HE_COLORSET_NORMAL;
width = browser->b.width - (offset + 2);
if (ui_browser__is_current_entry(&browser->b, row)) {
browser->selection = &chain->ms;
color = HE_COLORSET_SELECTED;
-   *is_current_entry = true;
+   arg->is_current_entry = true;
}
 
ui_browser__set_color(&browser->b, color);
@@ -498,17 +510,40 @@ static void hist_browser__show_callchain_entry(struct 
hist_browser *browser,
slsmg_write_nstring(" ", offset);
slsmg_printf("%c ", folded_sign);
slsmg_write_nstring(str, width);
+
+   /*
+* increase row here so that we can reuse the
+* hist_browser__show_callchain() for dumping the whole
+* callchain to a file.
+*/
+   arg->row++;
+}
+
+static void hist_browser__fprintf_callchain_entry(struct hist_browser *b 
__maybe_unused,
+ struct callchain_list *chain,
+ const char *str, int offset,
+ struct callchain_print_arg 
*arg)
+{
+   char folded_sign = callchain_list__folded(chain);
+
+   arg->printed += fprintf(arg->fp, "%*s%c %s\n", offset, " ",
+   folded_sign, str);
 }
 
+typedef void (*print_callchain_entry_fn)(struct hist_browser *browser,
+struct callchain_list *chain,
+const char *str, int offset,
+struct callchain_print_arg *arg);
+
 #define LEVEL_OFFSET_STEP 3
 
 static int hist_browser__show_callchain(struct hist_browser *browser,
-   struct rb_root *root, int level,
-   unsigned short row, off_t *row_offset,
-   u64 total, bool *is_current_entry)
+   struct rb_root *root, int level, u64 
total,
+   print_callchain_entry_fn print,
+   struct callchain_print_arg *arg)
 {
struct rb_node *node;
-   int first_row = row, offset = level * LEVEL_OFFSET_STEP;
+   int first_row = arg->row, offset = level * LEVEL_OFFSET_STEP;
u64 new_total;
 
node = rb_first(root);
@@ -532,8 +567,8 @@ static int hist_browser__show_callchain(struct hist_browser 
*browser,
extra_offset = LEVEL_OFFSET_STEP;
 
folded_sign = callchain_list__folded(chain);
-   if (*row_offset != 0) {
-   --*row_offset;
+   if (arg->row_offset != 0) {
+   arg->row_offset--;
goto do_next;
}
 
@@ -550,13 +585,11 @@ static int hist_browser__show_callchain(struct 
hist_browser *browser,
str = alloc_str;
}
 
-   hist_browser__show_callchain_entry(browser, chain, row,
-  offset + 
extra_offset,
-  folded_sign, 

[PATCH V4 4/8] Documentation: add a section for /proc//ns/

2014-08-20 Thread Richard Guy Briggs
---
 Documentation/filesystems/proc.txt |   16 
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/filesystems/proc.txt 
b/Documentation/filesystems/proc.txt
index ddc531a..c4bfd6f 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -42,6 +42,7 @@ Table of Contents
   3.6  /proc//comm  & /proc//task//comm
   3.7   /proc//task//children - Information about task children
   3.8   /proc//fdinfo/ - Information about opened file
+  3.9   /proc//ns/{,_snum} - Information about process namespaces
 
   4Configuring procfs
   4.1  Mount options
@@ -1744,6 +1745,21 @@ pair provide additional information particular to the 
objects they represent.
optional and may be omitted if no marks created yet.
 
 
+3.9/proc//ns/{,_snum} - Information about process namespaces
+--
+These files provides information about the namespaces within which the process
+is contained.  The files named only with the namespace type  contain a
+link that lists the containing namespace' inode number in its proc filesystem.
+The files with suffix _snum contain a link that lists the containing
+namespace' instance serial number, unique per kernel since boot.  The
+namespace types are self-describing.
+
+The output format of the inode links is:
+   :[]
+The output format of the serial number links is:
+   _snum:[]
+
+
 --
 Configuring procfs
 --
-- 
1.7.1



[PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc

2014-08-20 Thread Richard Guy Briggs
Expose the namespace instance serial numbers in the proc filesystem at
/proc//ns/_snum.  The link text gives the serial number in hex.

"snum" was chosen instead of "seq" for consistency with inum and because there
are already a number of other uses of "seq" in the namespace code.
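
From user space, the link text could be parsed along these lines. This is a
minimal sketch that assumes only the "%s_snum:[%llx]" format used in the diff
below; the function name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse link text of the form "<type>_snum:[<hex>]".
 * Returns 0 on success and -1 if the text does not match.
 */
static int parse_ns_snum(const char *link, char *type, size_t type_len,
			 unsigned long long *snum)
{
	const char *suffix = strstr(link, "_snum:[");
	char close;
	size_t n;

	if (!suffix || suffix == link)
		return -1;
	n = (size_t)(suffix - link);
	if (n >= type_len)
		return -1;
	/* The serial number is printed in hex and closed by ']'. */
	if (sscanf(suffix, "_snum:[%llx%c", snum, &close) != 2 || close != ']')
		return -1;
	memcpy(type, link, n);
	type[n] = '\0';
	return 0;
}
```

On a kernel with this series applied, the input string would come from
readlink(2) on the new _snum entry; plain inode links such as "net:[...]" are
rejected by this parser.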

Suggested-by: Serge E. Hallyn 
Signed-off-by: Richard Guy Briggs 
---
 fs/proc/namespaces.c |   33 +
 1 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c
index 8902609..e953e0a 100644
--- a/fs/proc/namespaces.c
+++ b/fs/proc/namespaces.c
@@ -47,12 +47,15 @@ static char *ns_dname(struct dentry *dentry, char *buffer, 
int buflen)
struct inode *inode = dentry->d_inode;
const struct proc_ns_operations *ns_ops = PROC_I(inode)->ns.ns_ops;
 
-   return dynamic_dname(dentry, buffer, buflen, "%s:[%lu]",
-   ns_ops->name, inode->i_ino);
+   if (strstr(dentry->d_iname, "_snum"))
+   return dynamic_dname(dentry, buffer, buflen, "%s_snum:[%llx]",
+   ns_ops->name, ns_ops->snum(PROC_I(inode)->ns.ns));
+   else
+   return dynamic_dname(dentry, buffer, buflen, "%s:[%lu]",
+   ns_ops->name, inode->i_ino);
 }
 
-const struct dentry_operations ns_dentry_operations =
-{
+const struct dentry_operations ns_dentry_operations = {
.d_delete   = always_delete_dentry,
.d_dname= ns_dname,
 };
@@ -160,7 +163,10 @@ static int proc_ns_readlink(struct dentry *dentry, char 
__user *buffer, int bufl
if (!ns)
goto out_put_task;
 
-   snprintf(name, sizeof(name), "%s:[%u]", ns_ops->name, ns_ops->inum(ns));
+   if (strstr(dentry->d_iname, "_snum"))
+   snprintf(name, sizeof(name), "%s_snum:[%llx]", ns_ops->name, 
ns_ops->snum(ns));
+   else
+   snprintf(name, sizeof(name), "%s:[%u]", ns_ops->name, 
ns_ops->inum(ns));
res = readlink_copy(buffer, buflen, name);
ns_ops->put(ns);
 out_put_task:
@@ -210,16 +216,23 @@ static int proc_ns_dir_readdir(struct file *file, struct 
dir_context *ctx)
 
if (!dir_emit_dots(file, ctx))
goto out;
-   if (ctx->pos >= 2 + ARRAY_SIZE(ns_entries))
+   if (ctx->pos >= 2 + 2 * ARRAY_SIZE(ns_entries))
goto out;
entry = ns_entries + (ctx->pos - 2);
last = &ns_entries[ARRAY_SIZE(ns_entries) - 1];
while (entry <= last) {
const struct proc_ns_operations *ops = *entry;
+   char name[50];
+
if (!proc_fill_cache(file, ctx, ops->name, strlen(ops->name),
 proc_ns_instantiate, task, ops))
break;
ctx->pos++;
+   snprintf(name, sizeof(name), "%s_snum", ops->name);
+   if (!proc_fill_cache(file, ctx, name, strlen(name),
+proc_ns_instantiate, task, ops))
+   break;
+   ctx->pos++;
entry++;
}
 out:
@@ -247,9 +260,13 @@ static struct dentry *proc_ns_dir_lookup(struct inode *dir,
 
last = &ns_entries[ARRAY_SIZE(ns_entries)];
for (entry = ns_entries; entry < last; entry++) {
-   if (strlen((*entry)->name) != len)
+   char name[50];
+
+   snprintf(name, sizeof(name), "%s_snum", (*entry)->name);
+   if (strlen((*entry)->name) != len && strlen(name) != len)
continue;
-   if (!memcmp(dentry->d_name.name, (*entry)->name, len))
+   if (!memcmp(dentry->d_name.name, (*entry)->name, len)
+   || !memcmp(dentry->d_name.name, name, len))
break;
}
if (entry == last)
-- 
1.7.1



[PATCH V4 2/8] namespaces: expose namespace instance serial number in proc_ns_operations

2014-08-20 Thread Richard Guy Briggs
Expose the namespace instance serial number for each namespace type in the proc
namespace operations structure to make it available for the proc filesystem.

Signed-off-by: Richard Guy Briggs 
---
 fs/namespace.c   |7 +++
 include/linux/proc_ns.h  |1 +
 ipc/namespace.c  |8 
 kernel/pid_namespace.c   |7 +++
 kernel/user_namespace.c  |7 +++
 kernel/utsname.c |8 
 net/core/net_namespace.c |7 +++
 7 files changed, 45 insertions(+), 0 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 9af49ff..f433f21 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3028,6 +3028,12 @@ static unsigned int mntns_inum(void *ns)
return mnt_ns->proc_inum;
 }
 
+static long long mntns_snum(void *ns)
+{
+   struct mnt_namespace *mnt_ns = ns;
+   return mnt_ns->serial_num;
+}
+
 const struct proc_ns_operations mntns_operations = {
.name   = "mnt",
.type   = CLONE_NEWNS,
@@ -3035,4 +3041,5 @@ const struct proc_ns_operations mntns_operations = {
.put= mntns_put,
.install= mntns_install,
.inum   = mntns_inum,
+   .snum   = mntns_snum,
 };
diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
index 34a1e10..aaafe3e 100644
--- a/include/linux/proc_ns.h
+++ b/include/linux/proc_ns.h
@@ -14,6 +14,7 @@ struct proc_ns_operations {
void (*put)(void *ns);
int (*install)(struct nsproxy *nsproxy, void *ns);
unsigned int (*inum)(void *ns);
+   long long (*snum)(void *ns);
 };
 
 struct proc_ns {
diff --git a/ipc/namespace.c b/ipc/namespace.c
index 76dac5c..36ce7ff 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -191,6 +191,13 @@ static unsigned int ipcns_inum(void *vp)
return ns->proc_inum;
 }
 
+static long long ipcns_snum(void *vp)
+{
+   struct ipc_namespace *ns = vp;
+
+   return ns->serial_num;
+}
+
 const struct proc_ns_operations ipcns_operations = {
.name   = "ipc",
.type   = CLONE_NEWIPC,
@@ -198,4 +205,5 @@ const struct proc_ns_operations ipcns_operations = {
.put= ipcns_put,
.install= ipcns_install,
.inum   = ipcns_inum,
+   .snum   = ipcns_snum,
 };
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index 40a8b36..059b330 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -370,6 +370,12 @@ static unsigned int pidns_inum(void *ns)
return pid_ns->proc_inum;
 }
 
+static long long pidns_snum(void *ns)
+{
+   struct pid_namespace *pid_ns = ns;
+   return pid_ns->serial_num;
+}
+
 const struct proc_ns_operations pidns_operations = {
.name   = "pid",
.type   = CLONE_NEWPID,
@@ -377,6 +383,7 @@ const struct proc_ns_operations pidns_operations = {
.put= pidns_put,
.install= pidns_install,
.inum   = pidns_inum,
+   .snum   = pidns_snum,
 };
 
 static __init int pid_namespaces_init(void)
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 5c5c399..3f04df5 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -896,6 +896,12 @@ static unsigned int userns_inum(void *ns)
return user_ns->proc_inum;
 }
 
+static long long userns_snum(void *ns)
+{
+   struct user_namespace *user_ns = ns;
+   return user_ns->serial_num;
+}
+
 const struct proc_ns_operations userns_operations = {
.name   = "user",
.type   = CLONE_NEWUSER,
@@ -903,6 +909,7 @@ const struct proc_ns_operations userns_operations = {
.put= userns_put,
.install= userns_install,
.inum   = userns_inum,
+   .snum   = userns_snum,
 };
 
 static __init int user_namespaces_init(void)
diff --git a/kernel/utsname.c b/kernel/utsname.c
index d0cf7b5..ffeac1b 100644
--- a/kernel/utsname.c
+++ b/kernel/utsname.c
@@ -132,6 +132,13 @@ static unsigned int utsns_inum(void *vp)
return ns->proc_inum;
 }
 
+static long long utsns_snum(void *vp)
+{
+   struct uts_namespace *ns = vp;
+
+   return ns->serial_num;
+}
+
 const struct proc_ns_operations utsns_operations = {
.name   = "uts",
.type   = CLONE_NEWUTS,
@@ -139,4 +146,5 @@ const struct proc_ns_operations utsns_operations = {
.put= utsns_put,
.install= utsns_install,
.inum   = utsns_inum,
+   .snum   = utsns_snum,
 };
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 3b5cfdb..c402eea 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -671,6 +671,12 @@ static unsigned int netns_inum(void *ns)
return net->proc_inum;
 }
 
+static long long netns_snum(void *ns)
+{
+   struct net *net = ns;
+   return net->serial_num;
+}
+
 const struct proc_ns_operations 

[PATCH V4 8/8] audit: initialize at subsystem time rather than device time

2014-08-20 Thread Richard Guy Briggs
The audit subsystem should be initialized a bit earlier so that it is in place
in time for initial namespace serial number logging.
---
 kernel/audit.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 6d95d1c..aa99518 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1186,7 +1186,7 @@ static int __init audit_init(void)
 
return 0;
 }
-__initcall(audit_init);
+subsys_initcall(audit_init);
 
 /* Process kernel command-line parameter at boot time.  audit=0 or audit=1. */
 static int __init audit_enable(char *str)
-- 
1.7.1



[PATCH V4 1/8] namespaces: assign each namespace instance a serial number

2014-08-20 Thread Richard Guy Briggs
Generate and assign a serial number per namespace instance since boot.

Identify each namespace by a serial number that is unique across one boot of
one kernel, rather than by its inode number.  The right to change inode
numbers is claimed to have been reserved, and an inode number is not
necessarily unique if there is more than one proc fs.

Signed-off-by: Richard Guy Briggs 
---
 fs/mount.h |1 +
 fs/namespace.c |1 +
 include/linux/ipc_namespace.h  |1 +
 include/linux/nsproxy.h|8 
 include/linux/pid_namespace.h  |1 +
 include/linux/user_namespace.h |1 +
 include/linux/utsname.h|1 +
 include/net/net_namespace.h|1 +
 init/version.c |1 +
 ipc/msgutil.c  |1 +
 ipc/namespace.c|2 ++
 kernel/nsproxy.c   |   17 +
 kernel/pid.c   |1 +
 kernel/pid_namespace.c |2 ++
 kernel/user.c  |1 +
 kernel/user_namespace.c|2 ++
 kernel/utsname.c   |2 ++
 net/core/net_namespace.c   |8 +++-
 18 files changed, 51 insertions(+), 1 deletions(-)

diff --git a/fs/mount.h b/fs/mount.h
index d55297f..c076f99 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -5,6 +5,7 @@
 struct mnt_namespace {
atomic_t count;
unsigned int proc_inum;
+   long long   serial_num;
struct mount *  root;
struct list_head list;
struct user_namespace   *user_ns;
diff --git a/fs/namespace.c b/fs/namespace.c
index 182bc41..9af49ff 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2486,6 +2486,7 @@ static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns)
kfree(new_ns);
return ERR_PTR(ret);
}
+   new_ns->serial_num = ns_serial();
new_ns->seq = atomic64_add_return(1, &mnt_ns_seq);
atomic_set(&new_ns->count, 1);
new_ns->root = NULL;
diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index 35e7eca..8ccfb2d 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -69,6 +69,7 @@ struct ipc_namespace {
struct user_namespace *user_ns;
 
unsigned int proc_inum;
+   long long   serial_num;
 };
 
 extern struct ipc_namespace init_ipc_ns;
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index b4ec59d..8e5fe0d 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -66,6 +66,14 @@ static inline struct nsproxy *task_nsproxy(struct task_struct *tsk)
return rcu_dereference(tsk->nsproxy);
 }
 
+long long ns_serial(void);
+enum {
+   NS_IPC_INIT_SN  = 1,
+   NS_UTS_INIT_SN  = 2,
+   NS_USER_INIT_SN = 3,
+   NS_PID_INIT_SN  = 4,
+};
+
 int copy_namespaces(unsigned long flags, struct task_struct *tsk);
 void exit_task_namespaces(struct task_struct *tsk);
 void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 7246ef3..4d8023e 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -43,6 +43,7 @@ struct pid_namespace {
int hide_pid;
int reboot; /* group exit code if this pidns was rebooted */
unsigned int proc_inum;
+   long long   serial_num;
 };
 
 extern struct pid_namespace init_pid_ns;
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 4836ba3..159ac26 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -27,6 +27,7 @@ struct user_namespace {
kuid_t  owner;
kgid_t  group;
unsigned int proc_inum;
+   long long   serial_num;
 
/* Register of per-UID persistent keyrings for this namespace */
 #ifdef CONFIG_PERSISTENT_KEYRINGS
diff --git a/include/linux/utsname.h b/include/linux/utsname.h
index 239e277..8490197 100644
--- a/include/linux/utsname.h
+++ b/include/linux/utsname.h
@@ -24,6 +24,7 @@ struct uts_namespace {
struct new_utsname name;
struct user_namespace *user_ns;
unsigned int proc_inum;
+   long long   serial_num;
 };
 extern struct uts_namespace init_uts_ns;
 
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 361d260..5238a06 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -61,6 +61,7 @@ struct net {
struct user_namespace   *user_ns;   /* Owning user namespace */
 
unsigned int proc_inum;
+   long long   serial_num;
 
struct proc_dir_entry   *proc_net;
struct proc_dir_entry   *proc_net_stat;
diff --git a/init/version.c b/init/version.c
index 1a4718e..cfdcb85 100644
--- a/init/version.c
+++ b/init/version.c
@@ -36,6 +36,7 @@ struct uts_namespace init_uts_ns = {
},

[PATCH V4 6/8] audit: log namespace serial numbers

2014-08-20 Thread Richard Guy Briggs
Log the namespace serial numbers of a task in a new record type (1329) (usually
accompanies audit_log_task_info() type=SYSCALL record) which is used by syscall
audits, among others.

Idea first presented:
https://www.redhat.com/archives/linux-audit/2013-March/msg00020.html

Typical output format would look something like:
type=NS_INFO msg=audit(1408577535.306:82):  netns=8 utsns=2 ipcns=1 
pidns=4 userns=3 mntns=5

The serial numbers are printed in hex.

Suggested-by: Aristeu Rozanski 
Signed-off-by: Richard Guy Briggs 
Acked-by: Serge Hallyn 
---
 include/linux/audit.h|7 +++
 include/uapi/linux/audit.h   |1 +
 kernel/audit.c   |   29 +
 kernel/auditsc.c |2 ++
 security/integrity/ima/ima_api.c |2 ++
 5 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 22cfddb..5ea3609 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -101,6 +101,13 @@ extern int __weak audit_classify_compat_syscall(int abi, unsigned syscall);
 struct filename;
 
 extern void audit_log_session_info(struct audit_buffer *ab);
+#ifdef CONFIG_NAMESPACES
+extern void audit_log_namespace_info(struct task_struct *tsk);
+#else
+void audit_log_namespace_info(struct task_struct *tsk)
+{
+}
+#endif
 
 #ifdef CONFIG_AUDIT_COMPAT_GENERIC
 #define audit_is_compat(arch)  (!((arch) & __AUDIT_ARCH_64BIT))
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index cf67147..84bbcdb 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -110,6 +110,7 @@
 #define AUDIT_SECCOMP  1326/* Secure Computing event */
 #define AUDIT_PROCTITLE1327/* Proctitle emit event */
 #define AUDIT_FEATURE_CHANGE   1328/* audit log listing feature changes */
+#define AUDIT_NS_INFO  1329/* Record process namespace IDs */
 
 #define AUDIT_AVC  1400/* SE Linux avc denial or grant */
 #define AUDIT_SELINUX_ERR  1401/* Internal SE Linux Errors */
diff --git a/kernel/audit.c b/kernel/audit.c
index 3ef2e0e..a4c39a0 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "audit.h"
@@ -743,6 +744,8 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature
 audit_feature_names[which], !!old_feature, !!new_feature,
 !!old_lock, !!new_lock, res);
audit_log_end(ab);
+
+   audit_log_namespace_info(current);
 }
 
 static int audit_set_feature(struct sk_buff *skb)
@@ -1661,6 +1664,30 @@ void audit_log_session_info(struct audit_buffer *ab)
audit_log_format(ab, " auid=%u ses=%u", auid, sessionid);
 }
 
+#ifdef CONFIG_NAMESPACES
+void audit_log_namespace_info(struct task_struct *tsk)
+{
+   const struct proc_ns_operations **entry;
+   bool end = false;
+   struct audit_buffer *ab;
+
+   if (!tsk)
+   return;
+   ab = audit_log_start(tsk->audit_context, GFP_KERNEL,
+AUDIT_NS_INFO);
+   if (!ab)
+   return;
+   for (entry = ns_entries; !end; entry++) {
+   void *ns = (*entry)->get(tsk);
+   audit_log_format(ab, " %sns=%llx", (*entry)->name,
+(*entry)->snum(ns));
+   (*entry)->put(ns);
+   end = (*entry)->type == CLONE_NEWNS;
+   }
+   audit_log_end(ab);
+}
+#endif /* CONFIG_NAMESPACES */
+
 void audit_log_key(struct audit_buffer *ab, char *key)
 {
audit_log_format(ab, " key=");
@@ -1933,6 +1960,8 @@ void audit_log_link_denied(const char *operation, struct path *link)
audit_log_format(ab, " res=0");
audit_log_end(ab);
 
+   audit_log_namespace_info(current);
+
/* Generate AUDIT_PATH record with object. */
name->type = AUDIT_TYPE_NORMAL;
audit_copy_inode(name, link->dentry, link->dentry->d_inode);
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 21eae3c..08b9af9 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1383,6 +1383,8 @@ static void audit_log_exit(struct audit_context *context, struct task_struct *ts
audit_log_key(ab, context->filterkey);
audit_log_end(ab);
 
+   audit_log_namespace_info(tsk);
+
for (aux = context->aux; aux; aux = aux->next) {
 
ab = audit_log_start(context, GFP_KERNEL, aux->type);
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index d9cd5ce..06d6897 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -323,6 +323,8 @@ void ima_audit_measurement(struct integrity_iint_cache *iint,
audit_log_task_info(ab, current);
audit_log_end(ab);
 
+   audit_log_namespace_info(current);
+
iint->flags |= IMA_AUDITED;
 }
 
-- 
1.7.1


[PATCH V4 5/8] namespaces: expose ns_entries

2014-08-20 Thread Richard Guy Briggs
Expose ns_entries so subsystems other than proc can use this set of namespace
operations.

Signed-off-by: Richard Guy Briggs 
---
 fs/proc/namespaces.c|2 +-
 include/linux/proc_ns.h |1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c
index e953e0a..29c3909 100644
--- a/fs/proc/namespaces.c
+++ b/fs/proc/namespaces.c
@@ -15,7 +15,7 @@
 #include "internal.h"
 
 
-static const struct proc_ns_operations *ns_entries[] = {
+const struct proc_ns_operations *ns_entries[] = {
 #ifdef CONFIG_NET_NS
&netns_operations,
 #endif
diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
index aaafe3e..f4563db 100644
--- a/include/linux/proc_ns.h
+++ b/include/linux/proc_ns.h
@@ -28,6 +28,7 @@ extern const struct proc_ns_operations ipcns_operations;
 extern const struct proc_ns_operations pidns_operations;
 extern const struct proc_ns_operations userns_operations;
 extern const struct proc_ns_operations mntns_operations;
+extern const struct proc_ns_operations *ns_entries[];
 
 /*
  * We always define these enumerators
-- 
1.7.1



[PATCH V4 7/8] audit: log creation and deletion of namespace instances

2014-08-20 Thread Richard Guy Briggs
Log the creation and deletion of namespace instances in all 6 types of
namespaces.

Twelve new audit message types have been introduced:
AUDIT_NS_INIT_MNT   1330  /* Record mount namespace instance creation */
AUDIT_NS_INIT_UTS   1331  /* Record UTS namespace instance creation */
AUDIT_NS_INIT_IPC   1332  /* Record IPC namespace instance creation */
AUDIT_NS_INIT_USER  1333  /* Record USER namespace instance creation */
AUDIT_NS_INIT_PID   1334  /* Record PID namespace instance creation */
AUDIT_NS_INIT_NET   1335  /* Record NET namespace instance creation */
AUDIT_NS_DEL_MNT    1336  /* Record mount namespace instance deletion */
AUDIT_NS_DEL_UTS    1337  /* Record UTS namespace instance deletion */
AUDIT_NS_DEL_IPC    1338  /* Record IPC namespace instance deletion */
AUDIT_NS_DEL_USER   1339  /* Record USER namespace instance deletion */
AUDIT_NS_DEL_PID    1340  /* Record PID namespace instance deletion */
AUDIT_NS_DEL_NET    1341  /* Record NET namespace instance deletion */

As suggested by Eric Paris, there are 12 message types: one for creation and
one for deletion of each type of namespace.  This makes text searches easier
in conjunction with the AUDIT_NS_INFO message type, since one can search for
all records containing, e.g., "netns=7 ".  It also avoids fields disappearing
per message type, which makes ausearch more efficient.

A typical startup would look roughly like:

type=AUDIT_NS_INIT_UTS msg=audit(1408577534.868:5): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_utsns=0 utsns=2 res=1
type=AUDIT_NS_INIT_USER msg=audit(1408577534.868:6): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_userns=0 userns=3 res=1
type=AUDIT_NS_INIT_PID msg=audit(1408577534.868:7): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_pidns=0 pidns=4 res=1
type=AUDIT_NS_INIT_MNT msg=audit(1408577534.868:8): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_mntns=0 mntns=5 res=1
type=AUDIT_NS_INIT_IPC msg=audit(1408577534.868:9): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_ipcns=0 ipcns=1 res=1
type=AUDIT_NS_INIT_NET msg=audit(1408577533.500:10): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_netns=0 netns=7 res=1

And a CLONE action would result in:
type=AUDIT_NS_INIT_NET msg=audit(1408577535.306:81): pid=481 uid=0 
auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 old_netns=7 
netns=8 res=1
type=AUDIT_NS_INIT_MNT msg=audit(1408577535.307:83): pid=481 uid=0 
auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 old_mntns=5 
mntns=9 res=1

While deleting a namespace would result in:
type=AUDIT_NS_DEL_MNT msg=audit(1408577552.221:85): pid=481 uid=0 
auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 mntns=9 res=1

If non-zero, old_snum lists the namespace from which it was cloned.

Signed-off-by: Richard Guy Briggs 
---
 fs/namespace.c |   12 +++
 include/linux/audit.h  |8 +++
 include/uapi/linux/audit.h |   12 +++
 ipc/namespace.c|   10 +
 kernel/audit.c |   47 
 kernel/pid_namespace.c |   10 +
 kernel/user_namespace.c|   11 ++
 kernel/utsname.c   |   11 ++
 net/core/net_namespace.c   |   12 +++
 9 files changed, 133 insertions(+), 0 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index f433f21..cb05b3d 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "pnode.h"
 #include "internal.h"
 
@@ -2459,6 +2460,7 @@ dput_out:
 
 static void free_mnt_ns(struct mnt_namespace *ns)
 {
+   audit_log_ns_del(AUDIT_NS_DEL_MNT, ns->serial_num);
proc_free_inum(ns->proc_inum);
put_user_ns(ns->user_ns);
kfree(ns);
@@ -2519,6 +2521,7 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
new_ns = alloc_mnt_ns(user_ns);
if (IS_ERR(new_ns))
return new_ns;
+   audit_log_ns_init(AUDIT_NS_INIT_MNT, ns->serial_num, new_ns->serial_num);
 
namespace_lock();
/* First pass: copy the tree topology */
@@ -2831,6 +2834,15 @@ static void __init init_mount_tree(void)
set_fs_root(current->fs, &root);
 }
 
+/* log the serial number of init mnt namespace after audit service starts */
+static int __init mnt_ns_init_log(void)
+{
+   struct mnt_namespace *init_mnt_ns = init_task.nsproxy->mnt_ns;
+   audit_log_ns_init(AUDIT_NS_INIT_MNT, 0, init_mnt_ns->serial_num);
+   return 0;
+}
+late_initcall(mnt_ns_init_log);
+
 void __init mnt_init(void)
 {
unsigned u;
diff --git a/include/linux/audit.h b/include/linux/audit.h
index 5ea3609..c245837 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -466,6 +466,9 @@ extern void  

[PATCH V4 0/8] namespaces: log namespaces per task

2014-08-20 Thread Richard Guy Briggs
The purpose is to track namespace instances in use by logged processes from the
perspective of init_*_ns by assigning each a per-kernel, per-boot serial
number.

1/8 defines a function to generate them and assigns them.

Identify each namespace by a serial number that is unique across one boot of
one kernel, rather than by its inode number.  The right to change inode
numbers is claimed to have been reserved, and an inode number is not
necessarily unique if there is more than one proc fs.  It could be argued
that the inode numbers have now become a de facto interface and can't change
now, but I'm proposing this approach to see if this helps address some of the
objections to the earlier patchset.
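
To make the numbering concrete, here is a minimal userspace sketch of a
per-boot serial allocator.  It is not the code from patch 1/8: the counter
name, the starting value of 5 (assuming 1 through 4 are reserved for the
initial IPC, UTS, USER and PID namespaces) and the use of abort() in place of
a kernel BUG are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the kernel-side counter.  It holds the next
 * serial number to hand out; 5 is an assumed first dynamic value.
 */
static atomic_llong ns_serial_next = 5;

/* Return a serial number that is unique for the lifetime of this process. */
long long ns_serial(void)
{
	/* fetch_add returns the value held before the increment. */
	long long serial = atomic_fetch_add(&ns_serial_next, 1);

	/* After a wrap the counter goes negative: treat rollover as fatal. */
	if (serial <= 0)
		abort();
	return serial;
}
```

Because every caller gets the pre-increment value of a single atomic counter,
two namespaces can never share a number within one boot, and zero stays free
to mean "no namespace".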

2/8 adds access functions to get to the serial numbers in a similar way to
inode access for namespace proc operations.

3/8 implements, as suggested by Serge Hallyn, making these serial numbers
available in /proc/self/ns/{ipc,mnt,net,pid,user,uts}_snum.  I chose "snum"
instead of "seq" for consistency with inum and there are a number of other uses
of "seq" in the namespace code.

4/8 documents proc's ns entries structure in Documentation/filesystems/proc.txt

5/8 exposes proc's ns entries structure which lists a number of useful
operations per namespace type for other subsystems to use.

6/8 provides an example of usage for audit_log_task_info() which is used by
syscall audits, among others.  audit_log_task() and audit_common_recv_message()
would be other potential use cases.

Proposed output format:
This differs slightly from Aristeu's patch because of the label conflict with
"pid=" due to including it in existing records rather than it being a separate
record.  It has now returned to being a separate record.  The serial numbers
are printed in hex.
type=NS_INFO msg=audit(1408577535.306:82):  netns=8 utsns=2 ipcns=1 
pidns=4 userns=3 mntns=5
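
As a rough illustration of consuming this record from userspace, the sketch
below pulls one namespace serial number out of an NS_INFO line and parses it
as hex.  The helper name ns_info_field() is hypothetical and not part of any
audit tool; it only assumes the space-separated key=value layout shown above.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Find "<key>=<hex>" in an audit record and store the value in *out.
 * Returns 0 on success, -1 if the key is absent or has no hex digits.
 */
int ns_info_field(const char *record, const char *key,
		  unsigned long long *out)
{
	size_t klen = strlen(key);
	const char *p = record;

	while ((p = strstr(p, key)) != NULL) {
		/* Whole-key match only, so "netns" does not hit "old_netns". */
		int at_boundary = (p == record) || (p[-1] == ' ');

		if (at_boundary && p[klen] == '=') {
			const char *val = p + klen + 1;
			char *end;
			unsigned long long v = strtoull(val, &end, 16);

			if (end == val)
				return -1;	/* nothing parsed */
			*out = v;
			return 0;
		}
		p += klen;
	}
	return -1;
}
```

Parsing with base 16 matters once serial numbers pass 9: a field such as
"pidns=1f" denotes serial number 31, not a malformed decimal.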

7/8 tracks the creation and deletion of namespaces, listing the type of
namespace instance, related namespace id if there is one and the newly minted
serial number.

Proposed output format for initial namespace creation:
type=AUDIT_NS_INIT_UTS msg=audit(1408577534.868:5): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_utsns=0 utsns=2 res=1
type=AUDIT_NS_INIT_USER msg=audit(1408577534.868:6): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_userns=0 userns=3 res=1
type=AUDIT_NS_INIT_PID msg=audit(1408577534.868:7): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_pidns=0 pidns=4 res=1
type=AUDIT_NS_INIT_MNT msg=audit(1408577534.868:8): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_mntns=0 mntns=5 res=1
type=AUDIT_NS_INIT_IPC msg=audit(1408577534.868:9): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_ipcns=0 ipcns=1 res=1
type=AUDIT_NS_INIT_NET msg=audit(1408577533.500:10): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel old_netns=0 netns=7 res=1

And a CLONE action would result in:
type=AUDIT_NS_INIT_NET msg=audit(1408577535.306:81): pid=481 uid=0 
auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 old_netns=7 
netns=8 res=1
type=AUDIT_NS_INIT_MNT msg=audit(1408577535.307:83): pid=481 uid=0 
auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 old_mntns=5 
mntns=9 res=1

While deleting a namespace would result in:
type=AUDIT_NS_DEL_MNT msg=audit(1408577552.221:85): pid=481 uid=0 
auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 mntns=9 res=1

8/8 changes audit startup from __initcall to subsys_initcall to get it started
earlier to be able to receive initial namespace log messages.


v3 -> v4:
Separate out the NS_INFO message from the SYSCALL message.
Moved audit_log_namespace_info() out of audit_log_task_info().
Use a separate message type per namespace type for each of INIT/DEL.
Make ns= easier to search across NS_INFO and NS_INIT/DEL_XXX msg types.
Add /proc/<pid>/ns/ documentation.
Fix dynamic initial ns logging.

v2 -> v3:
Use atomic64_t in ns_serial to simplify it.
Avoid function duplication in proc, keying on dentry.
Squash down audit patch to avoid rcu sleep issues.
Add tracking for creation and deletion of namespace instances.

v1 -> v2:
Avoid rollover by switching from an int to a long long.
Change rollover behaviour from simply avoiding zero to raising a BUG.
Expose serial numbers in /proc/<pid>/ns/*_snum.
Expose ns_entries and use it in audit.


Notes:
As for CAP_AUDIT_READ, a patchset has been accepted upstream to check
capabilities of userspace processes that try to join netlink broadcast groups.

This set does not try to solve the non-init namespace audit messages and
auditd problem yet.  That will come later, likely with additional auditd
instances running in another namespace with a limited ability to influence the
master auditd.  I echo Eric B's idea that messages destined for different
namespaces would 

Re: [PATCHv2 1/3] fs/buffer.c: allocate buffer cache with user specific flag

2014-08-20 Thread Gioh Kim



On 2014-08-21 at 7:02 AM, Jan Kara wrote:

On Wed 20-08-14 11:38:10, Gioh Kim wrote:



@@ -1381,12 +1383,7 @@ EXPORT_SYMBOL(__find_get_block);
  struct buffer_head *
  __getblk(struct block_device *bdev, sector_t block, unsigned size)
  {
-   struct buffer_head *bh = __find_get_block(bdev, block, size);
-
-   might_sleep();
-   if (bh == NULL)
-   bh = __getblk_slow(bdev, block, size);
-   return bh;
+   return __getblk_gfp(bdev, block, size, __GFP_MOVABLE);
  }
  EXPORT_SYMBOL(__getblk);

   Why did you remove the __find_get_block() call? That looks like a bug.

   I'm not sure if you didn't miss this comment


I'm sorry I missed it.
I think calling __find_get_block() in __getblk_gfp() can replace it.
I'm not sure about it.

If anybody disagrees, I'll change it back to the original code.

   OK, I see. Thanks for the explanation. I agree we can remove
__find_get_block() from __getblk() but please make this change a separate
patch and also please put the might_sleep() check into __getblk_gfp().

Honza



I got it.
I'm going to post a v3 patch that reverts __getblk() and adds
sb_bread_unmovable(),
if Andrew gives me feedback on the rest of the code.
I can wait ;-)



[PATCH v3 4/4] zram: report maximum used memory

2014-08-20 Thread Minchan Kim
Normally, a zram user can learn the maximum memory usage zram has consumed
by polling mem_used_total via sysfs from userspace.

But this has a critical problem: the user can miss the peak memory
usage between polling intervals. To avoid that, the user would have to
poll at a shorter interval (e.g., 0.01s) with mlock to avoid page
fault delay when memory pressure is heavy. That would be troublesome.

This patch adds a new knob, "mem_used_max", so the user can see
the maximum memory usage simply by reading the knob, and reset
it via "echo 0 > /sys/block/zram0/mem_used_max".
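
For readers who want the idea without the driver context, below is a small
userspace sketch of lock-free peak tracking with a compare-and-exchange loop.
It is only an approximation of what the patch does: it uses C11 atomics
rather than the kernel's atomic64 helpers, and the global variable and the
reset_used_max()/used_max_bytes() helpers are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's statistics field, in pages. */
static _Atomic uint64_t max_used_pages;

/* Raise the recorded peak to `pages` if it is larger; never lower it. */
void update_used_max(uint64_t pages)
{
	uint64_t cur = atomic_load(&max_used_pages);

	/*
	 * On failure compare_exchange reloads `cur` with the fresh value,
	 * so the loop re-checks and stops once another writer has already
	 * stored something at least as large.
	 */
	while (pages > cur &&
	       !atomic_compare_exchange_weak(&max_used_pages, &cur, pages))
		;
}

/* Reset: restart tracking from the current usage rather than from zero. */
void reset_used_max(uint64_t current_pages)
{
	atomic_store(&max_used_pages, current_pages);
}

/* Report the peak in bytes, given the page shift (12 for 4 KiB pages). */
uint64_t used_max_bytes(unsigned int page_shift)
{
	return atomic_load(&max_used_pages) << page_shift;
}
```

Resetting to the current usage instead of 0 keeps the reported peak from ever
being lower than what is allocated at the moment of the reset.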

Signed-off-by: Minchan Kim 
---
 Documentation/ABI/testing/sysfs-block-zram | 10 ++
 Documentation/blockdev/zram.txt|  1 +
 drivers/block/zram/zram_drv.c  | 57 --
 drivers/block/zram/zram_drv.h  |  1 +
 4 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
index 025331c19045..ffd1ea7443dd 100644
--- a/Documentation/ABI/testing/sysfs-block-zram
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -120,6 +120,16 @@ Description:
statistic.
Unit: bytes
 
+What:  /sys/block/zram/mem_used_max
+Date:  August 2014
+Contact:   Minchan Kim 
+Description:
+   The mem_used_max file is read/write and specifies the amount
+   of maximum memory zram have consumed to store compressed data.
+   For resetting the value, you should do "echo 0". Otherwise,
+   you could see -EINVAL.
+   Unit: bytes
+
 What:  /sys/block/zram/mem_limit
 Date:  August 2014
 Contact:   Minchan Kim 
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 9f239ff8c444..3b2247c2d4cf 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -107,6 +107,7 @@ size of the disk when not in use so a huge zram is wasteful.
orig_data_size
compr_data_size
mem_used_total
+   mem_used_max
 
 8) Deactivate:
swapoff /dev/zram0
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index adc91c7ecaef..138787579478 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -149,6 +149,41 @@ static ssize_t mem_limit_store(struct device *dev,
return len;
 }
 
+static ssize_t mem_used_max_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   u64 val = 0;
+   struct zram *zram = dev_to_zram(dev);
+
+   down_read(&zram->init_lock);
+   if (init_done(zram))
+   val = atomic64_read(&zram->stats.max_used_pages);
+   up_read(&zram->init_lock);
+
+   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
+}
+
+static ssize_t mem_used_max_store(struct device *dev,
+   struct device_attribute *attr, const char *buf, size_t len)
+{
+   int err;
+   unsigned long val;
+   struct zram *zram = dev_to_zram(dev);
+   struct zram_meta *meta = zram->meta;
+
+   err = kstrtoul(buf, 10, &val);
+   if (err || val != 0)
+   return -EINVAL;
+
+   down_read(&zram->init_lock);
+   if (init_done(zram))
+   atomic64_set(&zram->stats.max_used_pages,
+   zs_get_total_size(meta->mem_pool));
+   up_read(&zram->init_lock);
+
+   return len;
+}
+
 static ssize_t max_comp_streams_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t len)
 {
@@ -461,6 +496,18 @@ out_cleanup:
return ret;
 }
 
+static inline void update_used_max(struct zram *zram, const unsigned long pages)
+{
+   u64 old_max, cur_max;
+
+   do {
+   old_max = cur_max = atomic64_read(&zram->stats.max_used_pages);
+   if (pages > cur_max)
+   old_max = atomic64_cmpxchg(&zram->stats.max_used_pages,
+   cur_max, pages);
+   } while (old_max != cur_max);
+}
+
 static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
   int offset)
 {
@@ -472,6 +519,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
struct zram_meta *meta = zram->meta;
struct zcomp_strm *zstrm;
bool locked = false;
+   unsigned long alloced_pages;
 
page = bvec->bv_page;
if (is_partial_io(bvec)) {
@@ -541,13 +589,15 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
goto out;
}
 
-   if (zram->limit_pages &&
-   zs_get_total_size(meta->mem_pool) > zram->limit_pages) {
+   alloced_pages = zs_get_total_size(meta->mem_pool);
+   if (zram->limit_pages && alloced_pages > zram->limit_pages) {
zs_free(meta->mem_pool, handle);
ret = -ENOMEM;
 

[PATCH v3 0/4] zram memory control enhance

2014-08-20 Thread Minchan Kim
Currently, zram has no feature to limit memory so theoretically
zram can deplete system memory.
Users have asked for a limit several times as even without exhaustion
zram makes it hard to control memory usage of the platform.
This patchset adds the feature.

Patch 1 makes zs_get_total_size_bytes faster because it will be
called frequently by later patches for the new feature.

Patch 2 changes zs_get_total_size_bytes's return unit from bytes
to pages so that zsmalloc doesn't need an unnecessary operation (i.e.,
<< PAGE_SHIFT).

Patch 3 adds the new feature. I added it to the zram layer,
not zsmalloc, because the limitation is zram's requirement, not
zsmalloc's, so any other user of zsmalloc (e.g., zpool) shouldn't be
affected by an unnecessary branch in zsmalloc. In the future, if every
user of zsmalloc wants the feature, we could easily move it
from the client side into zsmalloc, but the reverse would be
painful.

Patch 4 adds a new facility to report the maximum memory usage of zram
so that the user can easily find out how much memory zram used at its
peak, without polling /sys/block/zram0/mem_used_total at an extremely
short interval.

* From v2
 * Introduce helper function to update max_used_pages
   for readability - David
 * Avoid unnecessary zs_get_total_size call in updating loop
   for max_used_pages - David

* From v1
 * rebased on next-20140815
 * fix up race problem - David, Dan
 * reset mem_used_max as current total_bytes, rather than 0 - David
 * resetting works with only "0" write for extensibility - David, Dan

Minchan Kim (4):
  zsmalloc: move pages_allocated to zs_pool
  zsmalloc: change return value unit of zs_get_total_size_bytes
  zram: zram memory size limitation
  zram: report maximum used memory

 Documentation/ABI/testing/sysfs-block-zram | 19 ++
 Documentation/blockdev/zram.txt| 21 +--
 drivers/block/zram/zram_drv.c  | 98 +-
 drivers/block/zram/zram_drv.h  |  6 ++
 include/linux/zsmalloc.h   |  2 +-
 mm/zsmalloc.c  | 38 ++--
 6 files changed, 159 insertions(+), 25 deletions(-)

-- 
2.0.0



[PATCH v3 1/4] zsmalloc: move pages_allocated to zs_pool

2014-08-20 Thread Minchan Kim
pages_allocated has been counted in the size_class structure, and when the
user wants to see total_size_bytes, the value is gathered from each
size_class to report the sum.

That's not bad if the user doesn't read the value often, but if the user
starts to read it frequently, it is not good from a performance point
of view.

This patch moves the variable from size_class to zs_pool, which
reduces the memory footprint (from [255 * 8byte] to [sizeof(atomic_t)]).
It adds new locking overhead, but that shouldn't be severe because
it's not on a hot path in zs_malloc (i.e., it is taken only when a new
zspage is created, not for every object).
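
The trade-off can be sketched in userspace as follows.  This is only an
illustration under assumptions: a C11 atomic_flag spin loop stands in for the
kernel spinlock, and struct pool and the pool_*() helpers are hypothetical
names, not the zsmalloc API.

```c
#include <stdatomic.h>

/* Simplified stand-in for zs_pool: one lock, one pool-wide counter. */
struct pool {
	atomic_flag stat_lock;		/* substitute for spinlock_t */
	unsigned long pages_allocated;	/* replaces the per-class counters */
};

static void pool_lock(struct pool *p)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&p->stat_lock,
						 memory_order_acquire))
		;
}

static void pool_unlock(struct pool *p)
{
	atomic_flag_clear_explicit(&p->stat_lock, memory_order_release);
}

void pool_init(struct pool *p)
{
	atomic_flag_clear(&p->stat_lock);
	p->pages_allocated = 0;
}

/* Called only when a whole zspage is created, not per object. */
void pool_add_pages(struct pool *p, unsigned long pages)
{
	pool_lock(p);
	p->pages_allocated += pages;
	pool_unlock(p);
}

/* Called only when a zspage becomes empty and is freed. */
void pool_sub_pages(struct pool *p, unsigned long pages)
{
	pool_lock(p);
	p->pages_allocated -= pages;
	pool_unlock(p);
}

/* Constant-time read instead of summing one counter per size class. */
unsigned long pool_total_pages(struct pool *p)
{
	unsigned long n;

	pool_lock(p);
	n = p->pages_allocated;
	pool_unlock(p);
	return n;
}
```

The read cost no longer depends on the number of size classes, at the price
of a short critical section on the comparatively rare zspage create and free
paths.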

Signed-off-by: Minchan Kim 
---
 mm/zsmalloc.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 94f38fac5e81..a65924255763 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -199,9 +199,6 @@ struct size_class {
 
spinlock_t lock;
 
-   /* stats */
-   u64 pages_allocated;
-
struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
 };
 
@@ -217,9 +214,12 @@ struct link_free {
 };
 
 struct zs_pool {
+   spinlock_t stat_lock;
+
struct size_class size_class[ZS_SIZE_CLASSES];
 
gfp_t flags;/* allocation flags used when growing pool */
+   unsigned long pages_allocated;
 };
 
 /*
@@ -967,6 +967,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
 
}
 
+   spin_lock_init(&pool->stat_lock);
pool->flags = flags;
 
return pool;
@@ -1028,8 +1029,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
return 0;
 
set_zspage_mapping(first_page, class->index, ZS_EMPTY);
+   spin_lock(&pool->stat_lock);
+   pool->pages_allocated += class->pages_per_zspage;
+   spin_unlock(&pool->stat_lock);
spin_lock(&class->lock);
-   class->pages_allocated += class->pages_per_zspage;
}
 
obj = (unsigned long)first_page->freelist;
@@ -1082,14 +1085,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
 
first_page->inuse--;
fullness = fix_fullness_group(pool, first_page);
-
-   if (fullness == ZS_EMPTY)
-   class->pages_allocated -= class->pages_per_zspage;
-
spin_unlock(&class->lock);
 
-   if (fullness == ZS_EMPTY)
+   if (fullness == ZS_EMPTY) {
+   spin_lock(&pool->stat_lock);
+   pool->pages_allocated -= class->pages_per_zspage;
+   spin_unlock(&pool->stat_lock);
free_zspage(first_page);
+   }
 }
 EXPORT_SYMBOL_GPL(zs_free);
 
@@ -1185,12 +1188,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
 
 u64 zs_get_total_size_bytes(struct zs_pool *pool)
 {
-   int i;
-   u64 npages = 0;
-
-   for (i = 0; i < ZS_SIZE_CLASSES; i++)
-   npages += pool->size_class[i].pages_allocated;
+   u64 npages;
 
+   spin_lock(&pool->stat_lock);
+   npages = pool->pages_allocated;
+   spin_unlock(&pool->stat_lock);
return npages << PAGE_SHIFT;
 }
 EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
-- 
2.0.0


