Re: [PATCH v2] x86, kaslr: export offset in VMCOREINFO ELF notes

2014-01-24 Thread Ingo Molnar

* Kees Cook  wrote:

> From: Eugene Surovegin 
> 
> Include kASLR offset in VMCOREINFO ELF notes to assist in debugging.
> 
> Signed-off-by: Eugene Surovegin 
> Signed-off-by: Kees Cook 
> ---
> v2:
>  - make sure "From:" got sent correctly
> ---
>  arch/x86/kernel/machine_kexec_64.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kernel/machine_kexec_64.c 
> b/arch/x86/kernel/machine_kexec_64.c
> index 4eabc160696f..679cef0791cd 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -279,5 +279,7 @@ void arch_crash_save_vmcoreinfo(void)
>   VMCOREINFO_SYMBOL(node_data);
>   VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
>  #endif
> + vmcoreinfo_append_str("KERNELOFFSET=%lx\n",
> +   (unsigned long)&_text - __START_KERNEL);
>  }
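
For reference, a minimal userspace sketch of how a debugging tool might consume the new note. It assumes the VMCOREINFO payload has already been extracted as newline-separated KEY=VALUE text; parse_kernel_offset() and relocate_symbol() are hypothetical helper names, not part of any existing tool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look for a "KERNELOFFSET=<hex>" line in a vmcoreinfo-style text blob.
 * Returns 0 and stores the value in *offset on success, -1 if the key
 * is missing or carries no hex digits.
 */
static int parse_kernel_offset(const char *vmcoreinfo, unsigned long *offset)
{
	static const char key[] = "KERNELOFFSET=";
	const size_t key_len = sizeof(key) - 1;
	const char *line = vmcoreinfo;

	assert(vmcoreinfo && offset);

	while (line && *line) {
		if (strncmp(line, key, key_len) == 0) {
			char *end;
			unsigned long val = strtoul(line + key_len, &end, 16);

			if (end == line + key_len)
				return -1;	/* key present, value missing */
			*offset = val;
			return 0;
		}
		/* advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Turn a link-time symbol address into its randomized runtime address. */
static unsigned long relocate_symbol(unsigned long link_addr,
				     unsigned long offset)
{
	return link_addr + offset;
}
```

A kernel booted without randomization would report an offset of 0, so relocate_symbol() then returns the link-time address unchanged.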

I've Cc:-ed Adrian Hunter, who sent the following kaslr fixes for 
perf yesterday:

  http://lkml.org/lkml/2014/1/24/220

Adrian, is this patch the right solution from the perf tooling 
perspective?

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: disabled APICs being counted as processors ?

2014-01-24 Thread Ingo Molnar

* Dave Jones  wrote:

> I have a system with 4 cores (configured with CONFIG_NR_CPUS=4) that shows 
> during boot..
> 
> [0.00] smpboot: 8 Processors exceeds NR_CPUS limit of 4
> 
> it looks like this is because..
> 
> [0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x05] lapic_id[0xff] disabled)
> [0.00] ACPI: LAPIC (acpi_id[0x06] lapic_id[0xff] disabled)
> [0.00] ACPI: LAPIC (acpi_id[0x07] lapic_id[0xff] disabled)
> [0.00] ACPI: LAPIC (acpi_id[0x08] lapic_id[0xff] disabled)
> 
> Should the CPU counting code be ignoring those disabled APICs ?

Hm, so to the kernel it looks as if those were 'possible CPUs', 
in theory hotpluggable. Not sure what they are - disabled cores in an 
8-core system? Or the BIOS reporting crap?

But perhaps the boot message could be improved to say something like:

> [0.00] smpboot: 8 possible processors exceeds NR_CPUS limit of 4

?
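
To make the arithmetic behind the message concrete, here is a simplified sketch (not the kernel's actual accounting code) that counts the LAPIC entries from the log above. struct lapic_entry, count_cpus() and exceeds_nr_cpus() are hypothetical names used only for illustration; the assumption is that disabled entries are treated as hotpluggable and therefore counted as possible CPUs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one ACPI LAPIC entry. */
struct lapic_entry {
	unsigned int acpi_id;
	unsigned int lapic_id;
	int enabled;
};

struct cpu_counts {
	unsigned int enabled;	/* usable right now */
	unsigned int disabled;	/* reported but disabled (maybe hotpluggable) */
	unsigned int possible;	/* enabled + disabled */
};

/* The eight entries from the boot log quoted above. */
static const struct lapic_entry boot_log_entries[] = {
	{ 0x01, 0x00, 1 }, { 0x02, 0x02, 1 }, { 0x03, 0x04, 1 }, { 0x04, 0x06, 1 },
	{ 0x05, 0xff, 0 }, { 0x06, 0xff, 0 }, { 0x07, 0xff, 0 }, { 0x08, 0xff, 0 },
};

static struct cpu_counts count_cpus(const struct lapic_entry *e, size_t n)
{
	struct cpu_counts c = { 0, 0, 0 };
	size_t i;

	assert(e || n == 0);

	for (i = 0; i < n; i++) {
		if (e[i].enabled)
			c.enabled++;
		else
			c.disabled++;
	}
	c.possible = c.enabled + c.disabled;
	return c;
}

/* Non-zero if the possible-CPU count would trip the NR_CPUS warning. */
static int exceeds_nr_cpus(struct cpu_counts c, unsigned int nr_cpus)
{
	return c.possible > nr_cpus;
}
```

With these entries the sketch yields 4 enabled, 4 disabled and 8 possible, which is why a limit of 4 is reported as exceeded even though only 4 cores are usable.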

Thanks,

Ingo


[GIT PULL] perf fixes

2014-01-24 Thread Ingo Molnar
Linus,

Please pull the latest perf-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
perf-urgent-for-linus

   # HEAD: 993e5ee67a90c7b6a5dbb61b9c31df2955afff46 Merge tag 
'perf-urgent-for-mingo' of 
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

A handful of tooling fixes.

 Thanks,

Ingo

-->
Alan Cox (1):
  perf tools: Ensure sscanf does not overrun the "mem" field

Baruch Siach (1):
  perf tools: Add support for the xtensa architecture

Josh Boyer (1):
  perf tools: Fix traceevent plugin path definitions

Masami Hiramatsu (1):
  perf symbols: Load map before using map->map_ip()

Namhyung Kim (1):
  perf symbols: Fix JIT symbol resolution on heap

Stanislav Fomichev (2):
  perf timechart: Fix wrong SVG height
  perf session: Free cpu_map in perf_session__cpu_bitmap

Stephane Eranian (3):
  perf stat: fix NULL pointer reference bug with event unit
  perf evsel: Remove duplicate member zeroing after free
  perf stat: Fix memory corruption of xyarray when cpumask is used


 tools/lib/traceevent/Makefile  |  2 +-
 tools/perf/builtin-timechart.c |  3 +++
 tools/perf/config/Makefile |  2 +-
 tools/perf/perf.h  |  7 +++
 tools/perf/util/evlist.c   |  7 +--
 tools/perf/util/evsel.c|  1 -
 tools/perf/util/header.c   |  2 +-
 tools/perf/util/map.c  |  7 ---
 tools/perf/util/parse-events.c |  2 +-
 tools/perf/util/pmu.c  | 24 
 tools/perf/util/pmu.h  |  2 +-
 tools/perf/util/session.c  | 10 +++---
 12 files changed, 51 insertions(+), 18 deletions(-)

diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
index 56d52a3..005c9cc 100644
--- a/tools/lib/traceevent/Makefile
+++ b/tools/lib/traceevent/Makefile
@@ -63,7 +63,7 @@ endif
 endif
 
 ifeq ($(set_plugin_dir),1)
-PLUGIN_DIR = -DPLUGIN_DIR="$(DESTDIR)/$(plugin_dir)"
+PLUGIN_DIR = -DPLUGIN_DIR="$(plugin_dir)"
 PLUGIN_DIR_SQ = '$(subst ','\'',$(PLUGIN_DIR))'
 endif
 
diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index 652af0b..25526d6 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -1045,6 +1045,9 @@ static void write_svg_file(struct timechart *tchart, 
const char *filename)
thresh /= 10;
} while (!process_filter && thresh && count < tchart->proc_num);
 
+   if (!tchart->proc_num)
+   count = 0;
+
open_svg(filename, tchart->numcpus, count, tchart->first_time, 
tchart->last_time);
 
svg_time_grid();
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index d604e50..c48d449 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -600,5 +600,5 @@ perfexec_instdir_SQ = $(subst ','\'',$(perfexec_instdir))
 # Otherwise we install plugins into the global $(libdir).
 ifdef DESTDIR
 plugindir=$(libdir)/traceevent/plugins
-plugindir_SQ= $(subst ','\'',$(prefix)/$(plugindir))
+plugindir_SQ= $(subst ','\'',$(plugindir))
 endif
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 3c2f213..7daa806 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -132,6 +132,13 @@
 #define CPUINFO_PROC   "CPU"
 #endif
 
+#ifdef __xtensa__
+#define mb()   asm volatile("memw" ::: "memory")
+#define wmb()  asm volatile("memw" ::: "memory")
+#define rmb()  asm volatile("" ::: "memory")
+#define CPUINFO_PROC   "core ID"
+#endif
+
 #define barrier() asm volatile ("" ::: "memory")
 
 #ifndef cpu_relax
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 40bd2c0..59ef280 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1003,9 +1003,12 @@ void perf_evlist__close(struct perf_evlist *evlist)
struct perf_evsel *evsel;
int ncpus = cpu_map__nr(evlist->cpus);
int nthreads = thread_map__nr(evlist->threads);
+   int n;
 
-   evlist__for_each_reverse(evlist, evsel)
-   perf_evsel__close(evsel, ncpus, nthreads);
+   evlist__for_each_reverse(evlist, evsel) {
+   n = evsel->cpus ? evsel->cpus->nr : ncpus;
+   perf_evsel__close(evsel, n, nthreads);
+   }
 }
 
 int perf_evlist__open(struct perf_evlist *evlist)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 22e18a2..55407c5 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1081,7 +1081,6 @@ void perf_evsel__close(struct perf_evsel *evsel, int 
ncpus, int nthreads)
 
perf_evsel__close_fd(evsel, ncpus, nthreads);
perf_evsel__free_fd(evsel);
-   evsel->fd = NULL;
 }
 
 static struct {
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index bb3e0ed..893f8e2 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -930,7 +930,7 @@ static int write_topo_node(int fd, int node)
/* skip over invalid lines */
  

Re: [GIT PULL] tick: A few more cleanups

2014-01-24 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> Ingo,
> 
> Please pull the timers/core branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
>   timers/core
> 
> HEAD: 8fe8ff09ce3b5750e1f3e45a1f4a81d59c7ff1f1
> 
> 
> Nothing very exciting, just a bunch of non-critical cleanups for the next 
> merge window:
> 
> 1) Make the IRQ tick APIs naming more symmetric
> 
> 2) Optimize jiffies_lock code coverage a bit
> 
> 3) Whitespace fixes from Alex Shi
> 
> 4) Fix overflow in scheduler tick max deferment calculation. Given the
> current 1 second max limitation, this bug shouldn't happen in mainline.
> It's rather to prepare for making this value tunable. Or simply in case
> we change the current constant.
> 
> Thanks,
>   Frederic
> ---
> 
> Frederic Weisbecker (2):
>   tick: Rename tick_check_idle() to tick_irq_enter()
>   nohz: Get timekeeping max deferment outside jiffies_lock
> 
> Alex Shi (1):
>   nohz_full: fix code style issue of tick_nohz_full_stop_tick
> 
> Kevin Hilman (1):
>   sched/nohz: Fix overflow error in scheduler_tick_max_deferment()
> 
> 
>  include/linux/jiffies.h  |  6 ++
>  include/linux/tick.h |  6 +++---
>  kernel/sched/core.c  |  2 +-
>  kernel/softirq.c |  2 +-
>  kernel/time/tick-sched.c | 27 ++-
>  5 files changed, 25 insertions(+), 18 deletions(-)

Pulled, thanks Frederic!

Ingo


[GIT PULL] scheduler fixes

2014-01-24 Thread Ingo Molnar
Linus,

Please pull the latest sched-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
sched-urgent-for-linus

   # HEAD: 5e3c1afd4587e70c201bf7224b51f747c9a3dfa8 sched/x86/tsc: Initialize 
multiplier to 0

A couple of regression fixes mostly hitting virtualized setups, but 
also some bare metal systems.

 Thanks,

Ingo

-->
Peter Zijlstra (3):
  sched/preempt/x86: Fix voluntary preempt for x86
  sched/clock: Fixup early initialization
  sched/x86/tsc: Initialize multiplier to 0

Vincent Guittot (1):
  Revert "sched: Fix sleep time double accounting in enqueue entity"


 arch/x86/kernel/tsc.c   |  2 +-
 include/linux/preempt.h |  5 -
 kernel/sched/clock.c| 53 ++---
 kernel/sched/fair.c |  8 +---
 4 files changed, 43 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a3acbac..19e5adb 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -180,7 +180,7 @@ static void cyc2ns_write_end(int cpu, struct cyc2ns_data 
*data)
 
 static void cyc2ns_data_init(struct cyc2ns_data *data)
 {
-   data->cyc2ns_mul = 1U << CYC2NS_SCALE_FACTOR;
+   data->cyc2ns_mul = 0;
data->cyc2ns_shift = CYC2NS_SCALE_FACTOR;
data->cyc2ns_offset = 0;
data->__count = 0;
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 59749fc..de83b4e 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -134,7 +134,6 @@ do { \
 #undef preempt_check_resched
 #endif
 
-#ifdef CONFIG_PREEMPT
 #define preempt_set_need_resched() \
 do { \
set_preempt_need_resched(); \
@@ -144,10 +143,6 @@ do { \
if (tif_need_resched()) \
set_preempt_need_resched(); \
 } while (0)
-#else
-#define preempt_set_need_resched() do { } while (0)
-#define preempt_fold_need_resched() do { } while (0)
-#endif
 
 #ifdef CONFIG_PREEMPT_NOTIFIERS
 
diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index 6bd6a67..43c2bcc 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -77,35 +77,50 @@ __read_mostly int sched_clock_running;
 
 #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
 static struct static_key __sched_clock_stable = STATIC_KEY_INIT;
+static int __sched_clock_stable_early;
 
 int sched_clock_stable(void)
 {
-   if (static_key_false(&__sched_clock_stable))
-   return false;
-   return true;
+   return static_key_false(&__sched_clock_stable);
 }
 
-void set_sched_clock_stable(void)
+static void __set_sched_clock_stable(void)
 {
if (!sched_clock_stable())
-   static_key_slow_dec(&__sched_clock_stable);
+   static_key_slow_inc(&__sched_clock_stable);
+}
+
+void set_sched_clock_stable(void)
+{
+   __sched_clock_stable_early = 1;
+
+   smp_mb(); /* matches sched_clock_init() */
+
+   if (!sched_clock_running)
+   return;
+
+   __set_sched_clock_stable();
 }
 
 static void __clear_sched_clock_stable(struct work_struct *work)
 {
/* XXX worry about clock continuity */
if (sched_clock_stable())
-   static_key_slow_inc(&__sched_clock_stable);
+   static_key_slow_dec(&__sched_clock_stable);
 }
 
 static DECLARE_WORK(sched_clock_work, __clear_sched_clock_stable);
 
 void clear_sched_clock_stable(void)
 {
-   if (keventd_up())
-   schedule_work(&sched_clock_work);
-   else
-   __clear_sched_clock_stable(&sched_clock_work);
+   __sched_clock_stable_early = 0;
+
+   smp_mb(); /* matches sched_clock_init() */
+
+   if (!sched_clock_running)
+   return;
+
+   schedule_work(&sched_clock_work);
 }
 
 struct sched_clock_data {
@@ -140,6 +155,20 @@ void sched_clock_init(void)
}
 
sched_clock_running = 1;
+
+   /*
+* Ensure that it is impossible to not do a static_key update.
+*
+* Either {set,clear}_sched_clock_stable() must see sched_clock_running
+* and do the update, or we must see their __sched_clock_stable_early
+* and do the update, or both.
+*/
+   smp_mb(); /* matches {set,clear}_sched_clock_stable() */
+
+   if (__sched_clock_stable_early)
+   __set_sched_clock_stable();
+   else
+   __clear_sched_clock_stable(NULL);
 }
 
 /*
@@ -340,7 +369,7 @@ EXPORT_SYMBOL_GPL(sched_clock_idle_wakeup_event);
  */
 u64 cpu_clock(int cpu)
 {
-   if (static_key_false(&__sched_clock_stable))
+   if (!sched_clock_stable())
return sched_clock_cpu(cpu);
 
return sched_clock();
@@ -355,7 +384,7 @@ u64 cpu_clock(int cpu)
  */
 u64 local_clock(void)
 {
-   if (static_key_false(&__sched_clock_stable))
+   if (!sched_clock_stable())
return sched_clock_cpu(raw_smp_processor_id());
 
return sched_clock();
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b24b6cf..efe6457 100644
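
The "either side must see the other's flag" handshake described in the sched_clock_init() comment above can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the kernel code: a plain atomic int stands in for the static key, atomic_thread_fence() stands in for smp_mb(), and the clear path updates the key directly instead of deferring to a workqueue. All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int stable_early;		/* last state requested by a caller */
static atomic_int clock_running;	/* set once by demo_clock_init() */
static atomic_int stable_key;		/* stand-in for the static key */

static void demo_request(int stable)
{
	assert(stable == 0 || stable == 1);

	atomic_store_explicit(&stable_early, stable, memory_order_relaxed);

	/* pairs with the fence in demo_clock_init() */
	atomic_thread_fence(memory_order_seq_cst);

	/* Too early: demo_clock_init() will pick up stable_early later. */
	if (!atomic_load_explicit(&clock_running, memory_order_relaxed))
		return;

	atomic_store(&stable_key, stable);
}

static void demo_set_stable(void)   { demo_request(1); }
static void demo_clear_stable(void) { demo_request(0); }

static void demo_clock_init(void)
{
	atomic_store_explicit(&clock_running, 1, memory_order_relaxed);

	/*
	 * Store to clock_running, full fence, load of stable_early; the
	 * callers do the mirror image. At least one side observes the
	 * other's store, so a requested update cannot be lost.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store(&stable_key,
		     atomic_load_explicit(&stable_early, memory_order_relaxed));
}
```

A request made before initialization is only recorded; the key itself changes once demo_clock_init() runs, and later requests take effect immediately.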

Re: [GIT PULL 0/2] perf/urgent fixes

2014-01-24 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> From: Arnaldo Carvalho de Melo 
> 
> Hi Ingo,
> 
>   Please consider pulling,
> 
> Regards,
> 
> - Arnaldo
> 
> The following changes since commit bb236de5d9509c1c6ea5ce0680f02e731ee2:
> 
>   Merge tag 'perf-core-for-mingo' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent 
> (2014-01-23 17:43:35 +0100)
> 
> are available in the git repository at:
> 
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux 
> tags/perf-urgent-for-mingo
> 
> for you to fetch changes up to 4afc81cd1caa93daa50c1c29a3ab747c978abc13:
> 
>   perf symbols: Load map before using map->map_ip() (2014-01-23 15:48:12 
> -0300)
> 
> 
> perf/urgent fixes:
> 
> . Fix traceevent plugin path definitions (Josh Boyer)
> 
> . Load map before using map->map_ip() (Masami Hiramatsu)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Josh Boyer (1):
>   perf tools: Fix traceevent plugin path definitions
> 
> Masami Hiramatsu (1):
>   perf symbols: Load map before using map->map_ip()
> 
>  tools/lib/traceevent/Makefile | 2 +-
>  tools/perf/config/Makefile| 2 +-
>  tools/perf/util/map.c | 3 ++-
>  3 files changed, 4 insertions(+), 3 deletions(-)

Pulled, thanks Arnaldo!

Ingo


Re: [PATCH v4 0/3] ext4: increase mbcache scalability

2014-01-24 Thread Thavatchai Makphaibulchoke
On 01/24/2014 02:38 PM, Andi Kleen wrote:
> T Makphaibulchoke  writes:
> 
>> The patch consists of three parts.
>>
>> The first part changes the implementation of both the block and hash chains 
>> of
>> an mb_cache from list_head to hlist_bl_head and also introduces new members,
>> including a spinlock to mb_cache_entry, as required by the second part.
> 
> spinlock per entry is usually overkill for larger hash tables.
> 
> Can you use a second smaller lock table that just has locks and is 
> indexed by a subset of the hash key. Most likely a very small 
> table is good enough.
> 
> Also it would be good to have some data on the additional memory consumption.
> 
> -Andi
> 

Thanks Andi for the comments.  Will look into that.

Thanks,
Mak.
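
For illustration, Andi's suggestion of a small separate lock table indexed by a subset of the hash key could look roughly like the sketch below. It is a userspace approximation using C11 atomics as trivial spinlocks; the table size, the names and the choice of the low hash bits are all assumptions, not taken from the mbcache patches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_TABLE_BITS	4			/* assumed: 16 locks in total */
#define LOCK_TABLE_SIZE	(1u << LOCK_TABLE_BITS)

/* One lock per stripe instead of one lock per cache entry. */
static atomic_int lock_table[LOCK_TABLE_SIZE];

/* Pick a lock using only the low bits of the entry's hash. */
static unsigned int lock_index(uint32_t hash)
{
	return hash & (LOCK_TABLE_SIZE - 1);
}

/* Returns 1 if the stripe lock was taken, 0 if it was already held. */
static int stripe_trylock(uint32_t hash)
{
	return atomic_exchange_explicit(&lock_table[lock_index(hash)], 1,
					memory_order_acquire) == 0;
}

static void stripe_lock(uint32_t hash)
{
	while (!stripe_trylock(hash))
		;	/* spin; a real implementation would relax the CPU here */
}

static void stripe_unlock(uint32_t hash)
{
	atomic_store_explicit(&lock_table[lock_index(hash)], 0,
			      memory_order_release);
}

/* Memory a per-entry lock of the same type would cost for n entries. */
static size_t per_entry_lock_bytes(size_t n_entries)
{
	return n_entries * sizeof(atomic_int);
}
```

The trade-off is that two entries whose hashes share the low bits contend for the same lock, while the lock memory stays constant (sizeof(lock_table)) rather than growing with the number of entries as per_entry_lock_bytes() does.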



Re: [PATCH-v2 1/3] percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask

2014-01-24 Thread Nicholas A. Bellinger
On Fri, 2014-01-24 at 16:14 +0100, Peter Zijlstra wrote:
> On Thu, Jan 23, 2014 at 11:38:24AM -0800, Nicholas A. Bellinger wrote:
> > > AFAICT, those changes don't address the original bug that the series was
> > > trying to address, allowing the percpu_ida_alloc() tag stealing slow
> > > path to be interrupted by a signal..
> > > 
> > > Also, keep in mind this change needs to be backported to >= v3.12, which
> > > is why the percpu_ida changes have been kept to a minimum.
> 
> Well, the other option is to revert whatever caused the issue in the
> first place :-)
> 
> I'm not much for making ugly fixes just because it's easier to backport.
> 

Me neither, but given Kent's comment on subtle issues for larger
improvements in the tag stealing slow path, it's the right approach for
addressing a bug in stable code.

I'd also prefer to avoid introducing new issues for blk-mq in the
first round of stable code, given the amount of testing it's already
undergone with percpu_ida for v3.13.

> > 
> > 
> > So would you prefer the following addition to the original bugfix
> > instead..?
> 
> I'll make a right old mess out of percpu_ida.c, but yeah.
> 

Kent and I would both like to see your improvements merged in upstream,
just not in a manner that would cause Greg-KH to start cursing loudly
for a stable backport.

That said, unless Jens throws a last minute NACK, I'll be queuing patch
#1 + #3 into target-pending/for-next for Sunday night's build.

Jens, please feel free to pick up Patch #2 for a post v3.15 merge at your
earliest convenience.

Thanks,

--nab

> > diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
> > index a48ce2e..58b6714 100644
> > --- a/lib/percpu_ida.c
> > +++ b/lib/percpu_ida.c
> > @@ -174,7 +174,8 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state)
> >  *
> >  * global lock held and irqs disabled, don't need percpu 
> > lock
> >  */
> > -   prepare_to_wait(&pool->wait, &wait, state);
> > +   if (state != TASK_RUNNING)
> > +   prepare_to_wait(&pool->wait, &wait, state);
> >  
> > if (!tags->nr_free)
> > alloc_global_tags(pool, tags);
> > @@ -199,8 +200,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state)
> > local_irq_save(flags);
> > tags = this_cpu_ptr(pool->tag_cpu);
> > }
> > +   if (state != TASK_RUNNING)
> > +   finish_wait(&pool->wait, &wait);
> >  
> > -   finish_wait(&pool->wait, &wait);
> > return tag;
> >  }
> >  EXPORT_SYMBOL_GPL(percpu_ida_alloc);
> > 
> > 


Re: [PATCH v4 0/3] ext4: increase mbcache scalability

2014-01-24 Thread Andreas Dilger
I think the ext4 block groups are locked with the blockgroup_lock that has 
about the same number of locks as the number of cores, with a max of 128, IIRC. 
 See blockgroup_lock.h. 

While there is some chance of contention, it is also unlikely that all of the 
cores are locking this area at the same time.  

Cheers, Andreas

> On Jan 24, 2014, at 14:38, Andi Kleen  wrote:
> 
> T Makphaibulchoke  writes:
> 
>> The patch consists of three parts.
>> 
>> The first part changes the implementation of both the block and hash chains 
>> of
>> an mb_cache from list_head to hlist_bl_head and also introduces new members,
>> including a spinlock to mb_cache_entry, as required by the second part.
> 
> spinlock per entry is usually overkill for larger hash tables.
> 
> Can you use a second smaller lock table that just has locks and is 
> indexed by a subset of the hash key. Most likely a very small 
> table is good enough.
> 
> Also it would be good to have some data on the additional memory consumption.
> 
> -Andi
> 
> -- 
> a...@linux.intel.com -- Speaking for myself only


Re: [PATCH 2/4] mtd: block2mtd: Add mutex_destroy

2014-01-24 Thread Ezequiel Garcia
On Thu, Jan 23, 2014 at 08:51:47PM +0100, Fabian Frederick wrote:
> mutex_destroy added on each device in block2mtd_exit and add_device failure
> 
> Signed-off-by: Fabian Frederick 
> ---
>  drivers/mtd/devices/block2mtd.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
> index 0efee5b..8071596 100644
> --- a/drivers/mtd/devices/block2mtd.c
> +++ b/drivers/mtd/devices/block2mtd.c
> @@ -240,14 +240,14 @@ static struct block2mtd_dev *add_device(char *devname, 
> int erase_size)
>  
>   if (IS_ERR(bdev)) {
>   pr_err("error: cannot open device %s\n", devname);
> - goto devinit_err;
> + goto devinit_err1;

Ah, this one commit looks good, but the naming of the labels doesn't.

Instead, you should use a name describing what the error path does,
such as "err_free_block2mtd" and "err_destroy_mutex" or something like that.

We have a language full of words with meaning, so it's a shame to use
dumb numbers :-)
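
As a generic illustration of that naming style (a userspace sketch, not the block2mtd code; struct demo_dev and both functions are hypothetical), each label says what its error path undoes:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device structure, used only for this example. */
struct demo_dev {
	char *name;
	int *buf;
};

static struct demo_dev *demo_dev_create(const char *devname, size_t buf_len)
{
	struct demo_dev *dev;

	assert(devname);

	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return NULL;

	dev->buf = calloc(buf_len, sizeof(*dev->buf));
	if (!dev->buf)
		goto err_free_dev;

	dev->name = malloc(strlen(devname) + 1);
	if (!dev->name)
		goto err_free_buf;
	strcpy(dev->name, devname);

	return dev;

	/* Unwind in reverse order of setup; each label names its action. */
err_free_buf:
	free(dev->buf);
err_free_dev:
	free(dev);
	return NULL;
}

static void demo_dev_destroy(struct demo_dev *dev)
{
	if (!dev)
		return;
	free(dev->name);
	free(dev->buf);
	free(dev);
}
```

Adding a new setup step then only needs one new, self-describing label, rather than renumbering the existing ones.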
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com


Re: [PATCH 4/4] mtd: block2mtd: mutex_init moved

2014-01-24 Thread Ezequiel Garcia
On Thu, Jan 23, 2014 at 08:54:56PM +0100, Fabian Frederick wrote:
> mutex_init declared when mtd structure is available
> 
> Signed-off-by: Fabian Frederick 
> ---
>  drivers/mtd/devices/block2mtd.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
> index f0fd4fc..5b9b145 100644
> --- a/drivers/mtd/devices/block2mtd.c
> +++ b/drivers/mtd/devices/block2mtd.c
> @@ -254,16 +254,15 @@ static struct block2mtd_dev *add_device(char *devname, 
> int erase_size)
>   goto devinit_err1;
>   }
>  
> + name = kasprintf(GFP_KERNEL, "block2mtd: %s", devname);
> + if (!name)
> + goto devinit_err1;
> +
>   mutex_init(&dev->write_mutex);
>  
>   /* Setup the MTD structure */
>   /* make the name contain the block device in */
> - name = kasprintf(GFP_KERNEL, "block2mtd: %s", devname);
> - if (!name)
> - goto devinit_err2;
> -
>   dev->mtd.name = name;
> -
>   dev->mtd.size = dev->blkdev->bd_inode->i_size & PAGE_MASK;
>   dev->mtd.erasesize = erase_size;
>   dev->mtd.writesize = 1;
> -- 
> 1.8.1.4

Hm.. this change doesn't seem to make sense. You just moved the name
format to happen before the mutex_init. I wonder if I'm missing something.
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com


Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-24 Thread Mike Galbraith
On Fri, 2014-01-24 at 20:46 +0100, Sebastian Andrzej Siewior wrote: 
> * Mike Galbraith | 2013-12-23 06:12:39 [+0100]:
> 
> >P.S.
> >
> >virgin -rt7 doing tbench 64 + make -j64
> >
> >[   97.907960] perf samples too long (3138 > 2500), lowering 
> >kernel.perf_event_max_sample_rate to 5
> >[  103.047921] perf samples too long (5544 > 5000), lowering 
> >kernel.perf_event_max_sample_rate to 25000
> >[  181.561271] perf samples too long (10318 > 1), lowering 
> >kernel.perf_event_max_sample_rate to 13000
> >[  184.243750] INFO: NMI handler (perf_event_nmi_handler) took too long to 
> >run: 1.084 msecs
> >[  248.914422] perf samples too long (19719 > 19230), lowering 
> >kernel.perf_event_max_sample_rate to 7000
> >[  382.116674] NOHZ: local_softirq_pending 10
> This is block
> 
> >[  405.201593] perf samples too long (36824 > 35714), lowering 
> >kernel.perf_event_max_sample_rate to 4000
> >[  444.704185] NOHZ: local_softirq_pending 08
> >[  444.704208] NOHZ: local_softirq_pending 08
> >[  444.704579] NOHZ: local_softirq_pending 08
> >[  444.704678] NOHZ: local_softirq_pending 08
> >[  444.705100] NOHZ: local_softirq_pending 08
> >[  444.705980] NOHZ: local_softirq_pending 08
> >[  444.705994] NOHZ: local_softirq_pending 08
> >[  444.708315] NOHZ: local_softirq_pending 08
> >[  444.710348] NOHZ: local_softirq_pending 08
> 
> and this is RX. Is your testcase heavy disk-io or heavy disk-io +
> network?

Yeah.

-Mike



Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-24 Thread Mike Galbraith
On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote: 
> * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
> 
> >> ># timers-do-not-raise-softirq-unconditionally.patch
> >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> >> >
> >> >..those two out does seem to have stabilized the thing.
> >> 
> >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> >> 
> >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> >> Didn't you report once that your box deadlocks without this patch? Now
> >> your 64way box on the other hand does not work with it?
> >
> >If 'do not raise' is applied, 'use a trylock' won't save you.  If 'do
> is this just an observation or do you know why it won't save me?

It's an observation from beyond the grave from the 64 core box that it
repeatedly did NOT save :)  Autopsy photos below.

I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
irq_work" to see if it'll survive.

nohz_full_all:
PID: 508  TASK: 8802739ba340  CPU: 16  COMMAND: "ksoftirqd/16"
 #0 [880276806a40] machine_kexec at 8103bc07
 #1 [880276806aa0] crash_kexec at 810d56b3
 #2 [880276806b70] panic at 815bf8b0
 #3 [880276806bf0] watchdog_overflow_callback at 810fed3d
 #4 [880276806c10] __perf_event_overflow at 81131928
 #5 [880276806ca0] perf_event_overflow at 81132254
 #6 [880276806cb0] intel_pmu_handle_irq at 8102078f
 #7 [880276806de0] perf_event_nmi_handler at 815c5825
 #8 [880276806e10] nmi_handle at 815c4ed3
 #9 [880276806ea0] default_do_nmi at 815c5063
#10 [880276806ed0] do_nmi at 815c5388
#11 [880276806ef0] end_repeat_nmi at 815c4371
[exception RIP: _raw_spin_trylock+48]
RIP: 815c3790  RSP: 880276803e28  RFLAGS: 0002
RAX: 0010  RBX: 0010  RCX: 0002
RDX: 880276803e28  RSI: 0018  RDI: 0001
RBP: 815c3790   R8: 815c3790   R9: 0018
R10: 880276803e28  R11: 0002  R12: 
R13: 880273a0c000  R14: 8802739ba340  R15: 880273a03fd8
ORIG_RAX: 880273a03fd8  CS: 0010  SS: 0018
---  ---
#12 [880276803e28] _raw_spin_trylock at 815c3790
#13 [880276803e30] rt_spin_lock_slowunlock_hirq at 815c2cc8
#14 [880276803e50] rt_spin_unlock_after_trylock_in_irq at 815c3425
#15 [880276803e60] get_next_timer_interrupt at 810684a7
#16 [880276803ed0] tick_nohz_stop_sched_tick at 810c5f2e
#17 [880276803f50] tick_nohz_irq_exit at 810c6333
#18 [880276803f70] irq_exit at 81060065
#19 [880276803f90] smp_apic_timer_interrupt at 810358f5
#20 [880276803fb0] apic_timer_interrupt at 815cbf9d
---  ---
#21 [880273a03b28] apic_timer_interrupt at 815cbf9d
[exception RIP: _raw_spin_lock+50]
RIP: 815c3642  RSP: 880273a03bd8  RFLAGS: 0202
RAX: 8b49  RBX: 880272157290  RCX: 8802739ba340
RDX: 8b4a  RSI: 0010  RDI: 880273a0c000
RBP: 880273a03bd8   R8: 0001   R9: 
R10:   R11: 0001  R12: 810927b5
R13: 880273a03b68  R14: 0010  R15: 0010
ORIG_RAX: ff10  CS: 0010  SS: 0018
#22 [880273a03be0] rt_spin_lock_slowlock at 815c2591
#23 [880273a03cc0] rt_spin_lock at 815c3362
#24 [880273a03cd0] run_timer_softirq at 81069002
#25 [880273a03d70] handle_softirq at 81060d0f
#26 

Re: [PATCH v11 1/4] qrwlock: A queue read/write lock implementation

2014-01-24 Thread Waiman Long

On 01/24/2014 03:25 AM, Peter Zijlstra wrote:

On Thu, Jan 23, 2014 at 11:28:48PM -0500, Waiman Long wrote:

+/**
+ * queue_read_trylock - try to acquire read lock of a queue rwlock
+ * @lock : Pointer to queue rwlock structure
+ * Return: 1 if lock acquired, 0 if failed
+ */
+static inline int queue_read_trylock(struct qrwlock *lock)
+{
+   union qrwcnts cnts;
+
+   cnts.rwc = ACCESS_ONCE(lock->cnts.rwc);
+   if (likely(!cnts.writer)) {
+   cnts.rwc = (u32)atomic_add_return(_QR_BIAS, &lock->cnts.rwa);
+   if (likely(!cnts.writer)) {
+   smp_mb__after_atomic_inc();

That's superfluous, as atomic_add_return() is documented as being a full
barrier.


Yes, you are right. I have reviewed memory-barriers.txt again and 
atomic_add_return() is supposed to act as a full memory barrier, so no extra 
barrier is needed. I will correct that in the next version.



+   return 1;
+   }
+   atomic_sub(_QR_BIAS, &lock->cnts.rwa);
+   }
+   return 0;
+}
+
+/**
+ * queue_write_trylock - try to acquire write lock of a queue rwlock
+ * @lock : Pointer to queue rwlock structure
+ * Return: 1 if lock acquired, 0 if failed
+ */
+static inline int queue_write_trylock(struct qrwlock *lock)
+{
+   union qrwcnts old, new;
+
+   old.rwc = ACCESS_ONCE(lock->cnts.rwc);
+   if (likely(!old.rwc)) {
+   new.rwc = old.rwc;
+   new.writer = _QW_LOCKED;
+   if (likely(cmpxchg(&lock->cnts.rwc, old.rwc, new.rwc)
+   == old.rwc))

One could actually use atomic_cmpxchg() and avoid one (ab)use of that
union :-)


I think either one is fine. I would like to keep the original code if it 
is not really a problem.



+   return 1;
+   }
+   return 0;
+}
+/**
+ * queue_read_lock - acquire read lock of a queue rwlock
+ * @lock: Pointer to queue rwlock structure
+ */
+static inline void queue_read_lock(struct qrwlock *lock)
+{
+   union qrwcnts cnts;
+
+   cnts.rwc = atomic_add_return(_QR_BIAS, &lock->cnts.rwa);
+   if (likely(!cnts.writer)) {
+   smp_mb__after_atomic_inc();

Superfluous again.


Will remove that.


+   return;
+   queue_write_lock_slowpath(lock);
+}
+
+/**
+ * queue_read_unlock - release read lock of a queue rwlock
+ * @lock : Pointer to queue rwlock structure
+ */
+static inline void queue_read_unlock(struct qrwlock *lock)
+{
+   /*
+* Atomically decrement the reader count
+*/
+   smp_mb__before_atomic_dec();
+   atomic_sub(_QR_BIAS, &lock->cnts.rwa);
+}
+
+/**
+ * queue_write_unlock - release write lock of a queue rwlock
+ * @lock : Pointer to queue rwlock structure
+ */
+static inline void queue_write_unlock(struct qrwlock *lock)
+{
+   /*
+* If the writer field is atomic, it can be cleared directly.
+* Otherwise, an atomic subtraction will be used to clear it.
+*/
+   if (__native_word(lock->cnts.writer))
+   smp_store_release(&lock->cnts.writer, 0);
+   else {
+   smp_mb__before_atomic_dec();
+   atomic_sub(_QW_LOCKED, &lock->cnts.rwa);
+   }

Missing {}, Documentation/CodingStyle Chapter 3 near the very end.


Thanks for spotting that. Will fix it in the next version.
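
For readers following along outside the kernel tree, the two trylock paths discussed above can be approximated with C11 atomics. This is only a sketch under assumptions: a single 32-bit word replaces the qrwcnts union, the bit layout (writer in the low byte, readers above it) and the constant values are illustrative, and the default seq_cst read-modify-write plays the role of atomic_add_return(), so no separate barrier follows it. The write path uses a compare-and-exchange on the whole word, in the spirit of the atomic_cmpxchg() remark.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_QW_LOCKED	0xffu	/* assumed: writer state in the low byte */
#define DEMO_QW_MASK	0xffu
#define DEMO_QR_BIAS	0x100u	/* assumed: one reader = bit 8 and up */

struct demo_qrwlock {
	_Atomic uint32_t cnts;
};

/* Returns 1 if the read lock was acquired, 0 otherwise. */
static int demo_read_trylock(struct demo_qrwlock *lock)
{
	uint32_t cnts;

	assert(lock);

	cnts = atomic_load_explicit(&lock->cnts, memory_order_relaxed);
	if (!(cnts & DEMO_QW_MASK)) {
		/* seq_cst RMW: fully ordered, so no extra fence afterwards */
		cnts = atomic_fetch_add(&lock->cnts, DEMO_QR_BIAS) + DEMO_QR_BIAS;
		if (!(cnts & DEMO_QW_MASK))
			return 1;
		/* a writer slipped in: back out our reader count */
		atomic_fetch_sub(&lock->cnts, DEMO_QR_BIAS);
	}
	return 0;
}

/* Returns 1 if the write lock was acquired, 0 otherwise. */
static int demo_write_trylock(struct demo_qrwlock *lock)
{
	uint32_t expected = 0;	/* succeeds only with no readers and no writer */

	assert(lock);

	return atomic_compare_exchange_strong(&lock->cnts, &expected,
					      DEMO_QW_LOCKED);
}

static void demo_read_unlock(struct demo_qrwlock *lock)
{
	atomic_fetch_sub_explicit(&lock->cnts, DEMO_QR_BIAS,
				  memory_order_release);
}

static void demo_write_unlock(struct demo_qrwlock *lock)
{
	atomic_fetch_sub_explicit(&lock->cnts, DEMO_QW_LOCKED,
				  memory_order_release);
}
```

Both unlock paths use release ordering so that accesses inside the critical section are visible before the lock word changes.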

-Longman


Re: [PATCH v11 4/4] qrwlock: Use the mcs_spinlock helper functions for MCS queuing

2014-01-24 Thread Waiman Long

On 01/24/2014 03:26 AM, Peter Zijlstra wrote:

On Thu, Jan 23, 2014 at 11:28:51PM -0500, Waiman Long wrote:

There is a pending MCS lock patch series that adds generic MCS lock
helper functions to do MCS-style locking. This patch will enable
the queue rwlock to use that generic MCS lock/unlock primitives for
internal queuing. This patch should only be merged after the merging
of that generic MCS locking patch.

I would still very much like this patch to be merged into the first. It
saves having to review all the code removed again.


I will merge it into the first one once the MCS patches are in
the mainline or the tip branch.


-Longman


Re: [PATCH 5/5] lustre: add myself to list of people to CC on lustre patches

2014-01-24 Thread Oleg Drokin
Hello!

On Jan 24, 2014, at 3:55 AM, Geert Uytterhoeven wrote:

> On Fri, Jan 24, 2014 at 6:51 AM, Oleg Drokin  wrote:
>>> +STAGING - LUSTRE
>>> +M:   Andreas Dilger 
>>> +M:   Oleg Drokin 
>>> +M:   Peng Tao .
>>> +L:   hpdd-discuss 
>>> +S:   Odd Fixes
>> 
>> Actually we are at least Maintained here, if not outright Supported.
> 
> Good to hear that!
> So can we assume that https://lkml.org/lkml/2013/9/9/725 will be fixed
> shortly,

We are working on it.

> and http://kisskb.ellerman.id.au/kisskb/buildresult/10508264/
> will turn green again?

I certainly hope so.
BTW, do you by any chance also provide an interface where I can drop in
my patch and have your builder verify it for me, so that I don't need to
replicate the build environment for every supported architecture?

Thanks.

Bye,
Oleg


Re: [PATCH -tip 4/8] perf-probe: Use the actual address instead of the symbol name

2014-01-24 Thread Masami Hiramatsu
(2014/01/24 21:13), Arnaldo Carvalho de Melo wrote:
> On Fri, Jan 24, 2014 at 10:49:32AM +0900, Masami Hiramatsu wrote:
>> (2014/01/24 1:12), Steven Rostedt wrote:
>>> On Thu, 23 Jan 2014 11:52:11 -0300
>>> Arnaldo Carvalho de Melo  wrote:
>>>
>>>> On Thu, Jan 23, 2014 at 02:29:55AM +, Masami Hiramatsu wrote:
>>>>> Since several local symbols can have same name (e.g. t_show),
>>>>> we need to use the actual address instead of symbol name for
>>>>> those points. Note that this works only with debuginfo.
>>>>>
>>>>> E.g. without this change;
>>>>>
>>>>
>>>> Please use spaces after dashed lines, this is even as (or more)
>>>> important as prefixing # lines, as this makes everything after the ---
>>>> line and the patch itself to be ignored.
>>>>
>>
>> Oops, I thought that "" was safe...
>>
>>> You recommend after? I found adding a single space before the dashes
>>> better, as that way I know I added one and didn't forget to.
>>
>> OK, I'll add at least one space before dashes.
>>
>> BTW, should "#" have two spaces instead of one?
> 
> I suggest using two spaces before --- and #, that is what I do and edit
> patches from others when merging, so that would save me some time while
> processing patches.

OK, I'll do it.
I think it would also be better to change checkpatch.pl to handle it.

Thank you very much! :)

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com




Re: [PATCHv3 0/6] Crashdump Accepting Active IOMMU

2014-01-24 Thread Baoquan He
Tested this patchset on my local HP Z420 workstation, and it works very
well.

Hi Bill,

Thanks for your effort.

I have several concerns.

First, I think the patch logs need to be rearranged. The cover letter
can briefly explain why the series is needed and how it works. If you
think the detail is useful, split it up and move it into the individual
patch logs.

Each patch log should then accurately summarize why that patch is
needed and what it does. If several patches are closely related, they
may need to be merged.

Second, each patch should get its own subject that says what the patch
actually does. Even after they are merged into the kernel git tree, each
one is an independent commit, and people may pick up or depend on only
one of them. Frankly, I don't like the current patch subjects.

Third, this patchset will become part of intel-iommu, even though it
only affects the kdump kernel. Within a subsystem the style needs to be
consistent. I like the debug method that introduces a struct pr_debug,
but the maintainers may not, because a debug utility can bloat the code
and make review harder. Personally I prefer lean code: the less there
is, the easier it is to review. Alternatively, put the debug code in an
independent patch at the end of the series and let the maintainers
decide whether to pull it in.

Sorry to say so much. I do think this solution is the right approach.
As you know, an active intel-iommu in the first kernel is a big problem
for kdump. Because of this bug, many machines with intel-iommu have to
be booted with intel-iommu=off, which hurts performance badly.

Baoquan
Thanks

On 01/10/14 at 03:07pm, Bill Sumner wrote:
> v2->v3:
> 1. Commented-out "#define DEBUG 1" to eliminate debug messages
> 2. Updated the comments about changes in each version in all patches in the 
> set.
> 3. Fixed: one-line added to Copy-Translations" patch to initialize the iovad
>   struct as recommended by Baoquan He [b...@redhat.com]
>   init_iova_domain(>iovad, DMA_32BIT_PFN);
> 
> v1->v2:
> The following series implements a fix for:
> A kdump problem about DMA that has been discussed for a long time. That is,
> when a kernel panics and boots into the kdump kernel, DMA started by the 
> panicked kernel is not stopped before the kdump kernel is booted and the 
> kdump kernel disables the IOMMU while this DMA continues.  This causes the
> IOMMU to stop translating the DMA addresses as IOVAs and begin to treat them 
> as physical memory addresses -- which causes the DMA to either:
> (1) generate DMAR errors or (2) generate PCI SERR errors or (3) transfer  
> data to or from incorrect areas of memory. Often this causes the dump to fail.
> 
 


Re: [PATCH] x86: allocate cpumask during check irq vectors

2014-01-24 Thread Chen, Gong
On Fri, Jan 24, 2014 at 12:11:07PM -0800, Yinghai Lu wrote:
> Date: Fri, 24 Jan 2014 12:11:07 -0800
> From: Yinghai Lu 
> To: "H. Peter Anvin" 
> Cc: Thomas Gleixner , Ingo Molnar ,
>  linux-kernel@vger.kernel.org, Yinghai Lu , Prarit
>  Bhargava 
> Subject: [PATCH] x86: allocate cpumask during check irq vectors
> X-Mailer: git-send-email 1.8.4
> 
> Fix warning:
> arch/x86/kernel/irq.c: In function check_irq_vectors_for_cpu_disable:
> arch/x86/kernel/irq.c:337:1: warning: the frame size of 2052 bytes is larger 
> than 2048 bytes
> 
> when NR_CPUS=8192
> 
> We should use zalloc_cpumask_var() instead.
> 
> Signed-off-by: Yinghai Lu 
> Cc: Prarit Bhargava 
> 
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index dbb6087..b114ee4 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -277,11 +277,14 @@ int check_irq_vectors_for_cpu_disable(void)
>   unsigned int this_cpu, vector, this_count, count;
>   struct irq_desc *desc;
>   struct irq_data *data;
> - struct cpumask affinity_new, online_new;
> + cpumask_var_t affinity_new, online_new;
> +
> + zalloc_cpumask_var(&affinity_new, GFP_KERNEL);
> + zalloc_cpumask_var(&online_new, GFP_KERNEL);
>  
Hi, Yinghai

Prarit Bhargava, the original author, already posted a similar patch,
and I pointed out then that this function runs under stop_machine, so
GFP_KERNEL is not safe here. It should be GFP_ATOMIC.





Re: [PATCH RT v2] timer: Raise softirq if there's irq_work

2014-01-24 Thread Steven Rostedt
On Fri, 24 Jan 2014 16:19:36 -0800
"Paul E. McKenney"  wrote:
 
> Failing to invoke rsp_wakeup() when it was needed could potentially
> stop RCU grace periods from happening, so having rsp_wakeup() happen
> when it is needed is pretty important...
> 
> But I would guess that you knew that already.  ;-)

Yep, I did. But it's always nice to hear confirmation, which is why I
Cc'd you ;-)

-- Steve


Re: [GIT PULL] Thermal management updates for v3.14-rc1

2014-01-24 Thread Linus Torvalds
On Thu, Jan 23, 2014 at 7:36 PM, Zhang Rui  wrote:
>
> Please pull from the git repository at
>
>git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux.git next
>
> to receive Thermal management updates for v3.14-rc1

This annoys me _enormously_:

  CONFIG_ACPI_INT3403_THERMAL:

  This driver uses ACPI INT3403 device objects. If present, it will
  register each INT3403 thermal sensor as a thermal zone.

This is the "help" text for the new ACPI thermal driver.

WTF?

Really, whoever wrote that "help" text wasn't thinking of helping
users. What the f*ck is an INT3403 device object? Is it common? Where?
Is it a standard? Is it something normal people should expect?

  Linus


[PATCH -v2] x86: allocate cpumask during check irq vectors

2014-01-24 Thread Yinghai Lu
Fix warning:
arch/x86/kernel/irq.c: In function check_irq_vectors_for_cpu_disable:
arch/x86/kernel/irq.c:337:1: warning: the frame size of 2052 bytes is larger 
than 2048 bytes

when NR_CPUS=8192

We should use zalloc_cpumask_var() instead.

-v2: update to GFP_ATOMIC instead and free the allocated cpumask at last.

Signed-off-by: Yinghai Lu 
Cc: Prarit Bhargava 

---
 arch/x86/kernel/irq.c |   24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

Index: linux-2.6/arch/x86/kernel/irq.c
===
--- linux-2.6.orig/arch/x86/kernel/irq.c
+++ linux-2.6/arch/x86/kernel/irq.c
@@ -277,11 +277,18 @@ int check_irq_vectors_for_cpu_disable(vo
unsigned int this_cpu, vector, this_count, count;
struct irq_desc *desc;
struct irq_data *data;
-   struct cpumask affinity_new, online_new;
+   cpumask_var_t affinity_new, online_new;
+
+   if (!alloc_cpumask_var(&affinity_new, GFP_ATOMIC))
+   return -ENOMEM;
+   if (!alloc_cpumask_var(&online_new, GFP_ATOMIC)) {
+   free_cpumask_var(affinity_new);
+   return -ENOMEM;
+   }
 
this_cpu = smp_processor_id();
-   cpumask_copy(&online_new, cpu_online_mask);
-   cpu_clear(this_cpu, online_new);
+   cpumask_copy(online_new, cpu_online_mask);
+   cpumask_clear_cpu(this_cpu, online_new);
 
this_count = 0;
for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
@@ -289,8 +296,8 @@ int check_irq_vectors_for_cpu_disable(vo
if (irq >= 0) {
desc = irq_to_desc(irq);
data = irq_desc_get_irq_data(desc);
-   cpumask_copy(&affinity_new, data->affinity);
-   cpu_clear(this_cpu, affinity_new);
+   cpumask_copy(affinity_new, data->affinity);
+   cpumask_clear_cpu(this_cpu, affinity_new);
 
/* Do not count inactive or per-cpu irqs. */
if (!irq_has_action(irq) || irqd_is_per_cpu(data))
@@ -311,12 +318,15 @@ int check_irq_vectors_for_cpu_disable(vo
 * mask is not zero; that is the down'd cpu is the
 * last online cpu in a user set affinity mask.
 */
-   if (cpumask_empty(&affinity_new) ||
-   !cpumask_subset(&affinity_new, &online_new))
+   if (cpumask_empty(affinity_new) ||
+   !cpumask_subset(affinity_new, online_new))
this_count++;
}
}
 
+   free_cpumask_var(affinity_new);
+   free_cpumask_var(online_new);
+
count = 0;
for_each_online_cpu(cpu) {
if (cpu == this_cpu)


Re: [PATCH RT v2] timer: Raise softirq if there's irq_work

2014-01-24 Thread Paul E. McKenney
On Fri, Jan 24, 2014 at 03:35:42PM -0500, Steven Rostedt wrote:
> On Fri, 24 Jan 2014 21:20:39 +0100
> Sebastian Andrzej Siewior  wrote:
> 
> > * Steven Rostedt | 2014-01-24 15:09:33 [-0500]:
> > 
> > >[ Talking with Sebastian on IRC, it seems that doing the irq_work_run()
> > >  from the interrupt in -rt is a bad thing. Here we simply raise the
> > >  softirq if there's irq work to do. This too boots on my i7 ]
> > 
> > It is okay in general because most of the users should not run in bare
> > interrupt context. The only exception here is the nohz_full_kick_work
> > thing.
> 
> I know we discussed this on IRC, but I wanted to publicly state that
> the missing irq work callback was the RCU's rsp_wakeup() function.

Failing to invoke rsp_wakeup() when it was needed could potentially
stop RCU grace periods from happening, so having rsp_wakeup() happen
when it is needed is pretty important...

But I would guess that you knew that already.  ;-)

Thanx, Paul



Re: [PATCH v2 1/2] Create eeprom_dev hardware class for EEPROM devices

2014-01-24 Thread Andy Lutomirski
On 01/23/2014 11:16 AM, Curt Brune wrote:
> Create a new hardware class under /sys/class/eeprom_dev
> 
> EEPROM drivers can register their devices with the eeprom_dev class
> during instantiation.
> 
> The registered devices show up as:
> 
>   /sys/class/eeprom_dev/eeprom0
>   /sys/class/eeprom_dev/eeprom1
>   ...
>   /sys/class/eeprom_dev/eeprom[N]
> 
> Each member of the eeprom class exports a sysfs file called "label",
> containing the label property from the corresponding device tree node.
> 
> Example:
> 
>   /sys/class/eeprom_dev/eeprom0/label
> 
> If the device tree node property "label" does not exist the value
> "unknown" is used.
> 
> Note: The class cannot be called 'eeprom' as that is the name of the
> I/O file created by the driver.  The class name appears as a
> sub-directory within the main device directory.  Hence the class name
> 'eeprom_dev'.
> 
> Userspace can use the label to identify what the EEPROM is for.

Since my previous email [1] seems to have vanished into the void, I'll
try again more succinctly:

How will this work on non device tree / openfirmware systems?

Is there a better way to expose topology information (i.e. that the
eeprom belongs to another device that might not live on the i2c bus at all)?

Can we expose type information?  There's a big difference between SPD
EEPROMs, EDID EEPROMs, and nic mac-address-containing EEPROMs, for example.

--Andy

> 
> The real device is available from the class device via the "device"
> link:
> 
>   /sys/class/eeprom_dev/eeprom0/device
> 
> Signed-off-by: Curt Brune 
> 
> ---
>  Documentation/misc-devices/eeprom_hw_class.txt |   81 
>  drivers/misc/eeprom/Kconfig|   11 ++
>  drivers/misc/eeprom/Makefile   |1 +
>  drivers/misc/eeprom/eeprom_class.c |  159 
> 
>  include/linux/eeprom_class.h   |   35 ++
>  5 files changed, 287 insertions(+)
>  create mode 100644 Documentation/misc-devices/eeprom_hw_class.txt
>  create mode 100644 drivers/misc/eeprom/eeprom_class.c
>  create mode 100644 include/linux/eeprom_class.h
> 
> diff --git a/Documentation/misc-devices/eeprom_hw_class.txt 
> b/Documentation/misc-devices/eeprom_hw_class.txt
> new file mode 100644
> index 000..b5cbc35
> --- /dev/null
> +++ b/Documentation/misc-devices/eeprom_hw_class.txt
> @@ -0,0 +1,81 @@
> +EEPROM Device Hardware Class
> +
> +
> +This feature is enabled by CONFIG_EEPROM_CLASS.
> +
> +The original problem:
> +
> +We work on several different switching platforms, each of which has
> +about 64 EEPROMs, one for each of the 10G SFP+ modules.  In addition
> +the systems typically have a board info EEPROM, SPD and power supply
> +EEPROMs.  It is difficult to map the device tree entries for the
> +EEPROMs to the appropriate sysfs device needed for I/O in a generic
> +way.
> +
> +Also mappings are further complicated by some systems using custom i2c
> +buses implemented in FPGAs.
> +
> +The solution is two fold:
> +
> +1. Create an EEPROM class for all EEPROM devices.  Each EEPROM driver,
> +at24 for example, would register with the class during probe().
> +
> +2. Create a mapping in the .dts file by adding a property called
> +'label' to each EEPROM entry.  The EEPROM class will expose this label
> +property for all EEPROMs.
> +
> +For example, for all the EEPROM devices in the system you would see
> +directories in sysfs like:
> +
> +  /sys/class/eeprom_dev/eeprom0
> +  /sys/class/eeprom_dev/eeprom1
> +  /sys/class/eeprom_dev/eeprom2
> +  ...
> +  /sys/class/eeprom_dev/eeprom
> +
> +Within each eepromN directory you would find:
> +
> +  root@switch:/sys/class/eeprom_dev# ls -l eeprom2/
> +  total 0
> +  lrwxrwxrwx 1 root root0 Sep  3 22:08 device -> ../../../1-0050
> +  -r--r--r-- 1 root root 4096 Sep  3 22:08 label
> +  lrwxrwxrwx 1 root root0 Sep  4 17:18 subsystem ->  
> ../../../../../../../class/eeprom_dev
> +
> +device -- this is a symlink to the physical device.  For example to
> +dump the EEPROM data of eeprom2 you could do:
> +
> +  hexdump -C /sys/class/eeprom_dev/eeprom2/device/eeprom
> +
> +As an example the device tree entry corresponding to eeprom2 could
> +look like:
> +
> + sfp_eeprom@50 {
> + compatible = "at,24c04";
> + reg = <0x50>;
> + label = "port6";
> + };
> +
> +From the original problem, imagine 64 similar entries for all the
> +other ports.  Plus a few more entries for board EEPROM and power
> +supply EEPROMs.
> +
> +From user space if I wanted to know the device corresponding to port6
> +I could do something as simple as:
> +
> +root@switch:~# grep port6 /sys/class/eeprom_dev/eeprom*/label
> +/sys/class/eeprom_dev/eeprom2/label:port6
> +
> +Then I could access the information via
> +/sys/class/eeprom_dev/eeprom2/device/eeprom.
> +
> +It is nice that it keeps the mapping all in one place, in the .dts
> +file.  It is not spread around in the device tree and 

Re: [PATCH 3.12 00/27] 3.12.9-stable review

2014-01-24 Thread Greg Kroah-Hartman
On Fri, Jan 24, 2014 at 08:39:50PM +0100, Radim Krčmář wrote:
> Hello,
> 
> could 3.12.9 and 3.10.28 include
>   0dce7cd kvm: x86: fix apic_base enable check
> ?  It fixes a regression applied to 3.10.26 and 3.12.7.

I'll queue it up for the next round of stable kernels, thanks.

> Stable was mentioned on kvm-list[1], but it might have been forgotten.

Yes, no one marked it for the stable tree :(

thanks,

greg k-h


[PATCH] regmap: cache: Handle stride > 1 in sync_block_raw_flush

2014-01-24 Thread Dylan Reid
regcache_sync_block_raw_flush takes the address of the base register
and the address of one past the last register to write to.  "count" is
the number of registers in the range, not the number of bytes, it
should be (end addr - start addr) / stride. Without accounting for
strides greater than one, registers past the end might be synced or
the writeable_reg callback at the beginning of _regmap_raw_write will
fail and nothing will be written.

Signed-off-by: Dylan Reid 
---
 drivers/base/regmap/regcache.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/regmap/regcache.c b/drivers/base/regmap/regcache.c
index d4dd771..dd56177 100644
--- a/drivers/base/regmap/regcache.c
+++ b/drivers/base/regmap/regcache.c
@@ -636,10 +636,10 @@ static int regcache_sync_block_raw_flush(struct regmap 
*map, const void **data,
if (*data == NULL)
return 0;
 
-   count = cur - base;
+   count = (cur - base) / map->reg_stride;
 
dev_dbg(map->dev, "Writing %zu bytes for %d registers from 0x%x-0x%x\n",
-   count * val_bytes, count, base, cur - 1);
+   count * val_bytes, count, base, cur - map->reg_stride);
 
map->cache_bypass = 1;
 
-- 
1.8.1.3.605.g02339dd
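
The count calculation described in the changelog can be illustrated outside the kernel. This is only a sketch: sketch_reg_count() and sketch_last_reg() are hypothetical helpers, not regmap functions; the patch computes both values inline from map->reg_stride.

```c
#include <stddef.h>

/*
 * Number of registers in the half-open address range [base, cur)
 * when consecutive registers are `stride` addresses apart.
 * Hypothetical helper for illustration only.
 */
static size_t sketch_reg_count(unsigned int base, unsigned int cur,
			       unsigned int stride)
{
	return (cur - base) / stride;
}

/* Address of the last register in the range, as shown in the debug message. */
static unsigned int sketch_last_reg(unsigned int cur, unsigned int stride)
{
	return cur - stride;
}
```

With a stride of 4, the range 0x00-0x10 holds 4 registers and the last one is at 0x0c; treating the address difference as the count would give 16 and run past the end of the block.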



Re: [ANNOUNCE] Git v1.9-rc0

2014-01-24 Thread Jeff King
On Thu, Jan 23, 2014 at 10:15:33AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Junio, since you prepare such tarballs[1] anyway for kernel.org, it
> > might be worth uploading them to the "Releases" page of git/git.  I
> > imagine there is a programmatic way to do so via GitHub's API, but I
> > don't know offhand. I can look into it if you are interested.
> 
> I already have a script that takes the three tarballs and uploads
> them to two places, so adding GitHub as the third destination should
> be a natural and welcome way to automate it.

I came up with the script below, which you can use like:

  ./script v1.8.2.3 git-1.8.2.3.tar.gz

It expects the tag to already be pushed up to GitHub.  I'll leave
sticking it on the "todo" branch and integrating it into RelUpload to
you. This can also be used to backfill the old releases (though I looked
on k.org and it seems to have only partial coverage).

It sets the "prerelease" flag for -rc releases, but I did not otherwise
fill in any fields, including the summary and description. GitHub seems
to display reasonably if they are not set.

-- >8 --
#!/bin/sh
#
# usage: $0  

repo=git/git

# replace this with however you store your oauth token
# if you don't have one, make one here:
# https://github.com/settings/tokens/new
token() {
  pass -n github.web.oauth
}

post() {
  curl -H "Authorization: token $(token)" "$@"
}

# usage: create 
create() {
  case "$1" in
  *-rc*)
prerelease=true
;;
  *)
prerelease=false
;;
  esac

  post -d '
  {
"tag_name": "'"$1"'",
"prerelease": '"$prerelease"'
  }' "https://api.github.com/repos/$repo/releases"
}

# use: upload  
upload() {
  url="https://uploads.github.com/repos/$repo/releases/$1/assets" &&
  url="$url?name=$(basename $2)" &&
  post -H "Content-Type: $(file -b --mime-type "$2")" \
   --data-binary "@$2" \
   "$url"
}

# This is a hack. If you don't mind a dependency on
# perl's JSON (or another parser), we can do a lot better.
extract_id() {
  perl -lne '/"id":\s*(\d+)/ or next; print $1; exit 0'
}

create "$1" >release.json &&
id=$(extract_id <release.json) &&
upload "$id" "$2" >/dev/null &&
rm -f release.json


Re: [patch 0/2] mm: reduce reclaim stalls with heavy anon and dirty cache

2014-01-24 Thread Tejun Heo
On Fri, Jan 24, 2014 at 05:21:44PM -0500, Tejun Heo wrote:
> The trigger conditions seem quite plausible - high anon memory usage
> w/ heavy buffered IO and swap configured - and it's highly likely that
> this is happening in the wild too.  (this can happen with copying
> large files to usb sticks too, right?)

So, just tested with the usb stick and these two patches, while not
perfect, make a world of difference.  The problem is really easy to
reproduce on my machine which has 8gig of memory with the two attached
test programs.

* run "test-membloat 4300" and wait for it to report completion.

* run "test-latency"

Mount a slow USB stick and copy a large (multi-gig) file to it.
test-latency tries to print out a dot every 10ms but will report a
log2 number if the latency becomes more than twice as high - ie. 4 means
it took 2^4 * 10ms to complete a loop which is supposed to take
slightly longer than 10ms (10ms sleep + 4 page faults).  My USB stick
can only do a couple of mbytes/s and without these patches the machine
becomes basically useless.  It's just not usable; it stutters more
than it runs until the whole file finishes copying.
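
The output encoding described above can be sketched as a small standalone helper. This is an illustration only: sketch_latency_char() is a hypothetical name, it uses an integer floor(log2) instead of the libm log2() call, and it assumes a non-zero interval.

```c
#include <stdint.h>

/*
 * Map one loop's elapsed time to the character test-latency would print:
 * '.' while floor(log2(elapsed / interval)) is at most 1, otherwise that
 * log2 value as a digit, capped at 9.
 * Hypothetical helper; interval_ms must be non-zero.
 */
static char sketch_latency_char(uint64_t elapsed_ns, uint64_t interval_ms)
{
	/* elapsed time expressed in whole multiples of the interval */
	uint64_t ratio = elapsed_ns / interval_ms / 1000000ULL;
	int idx = 0;

	/* integer floor(log2(ratio)); ratios of 0 and 1 both give 0 */
	while (ratio > 1) {
		ratio >>= 1;
		idx++;
	}

	if (idx <= 1)
		return '.';
	if (idx > 9)
		idx = 9;
	return (char)('0' + idx);
}
```

For a 10ms interval, a loop that takes 160ms maps to '4', matching the 2^4 * 10ms reading given above.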

Because I've been using tmpfs as build target for a while, I've been
experiencing this occasionally and secretly growing bitter
disappointment towards the linux kernel which developed into
self-loathing to the point where I found booting into win8 consoling
after looking at my machine stuttering for 45mins while it was
repartitioning the hard drive to make room for steamos.  Oh the irony.
I had to stay in fetal position for a while afterwards.  It was a
crisis.

With the patches applied, for both heavy harddrive IO and
copy-large-file-to-slow-USB cases, the behavior is vastly improved.
It does stutter for a while once memory is filled up but stabilizes in
somewhere above ten seconds and then stays responsive.  While it isn't
perfect, it's not completely ridiculous as before.

So, lots of kudos to Johannes for *finally* fixing the issue and I
strongly believe this is something we should consider for -stable even
if that takes considerable amount of effort to verify it's not too
harmful for other workloads.

Thanks a lot.

-- 
tejun
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define NR_ALPHAS   ('z' - 'a' + 1)

int main(int argc, char **argv)
{
	struct timespec intv_ts = { }, ts;
	unsigned long long time0, time1;
	long long msecs = 10;
	const size_t map_size = 4096 * 4;

	if (argc > 1) {
		msecs = atoll(argv[1]);
		if (msecs <= 0) {
			fprintf(stderr, "test-latency [interval-in-msecs]\n");
			return 1;
		}
	}

	/* split the interval into seconds and nanoseconds */
	intv_ts.tv_sec = msecs / 1000;
	intv_ts.tv_nsec = (msecs % 1000) * 1000000;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	time1 = ts.tv_sec * 1000000000LLU + ts.tv_nsec;

	while (1) {
		void *map, *p;
		int idx;
		char c;

		nanosleep(&intv_ts, NULL);
		map = mmap(NULL, map_size, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (map == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		/* touch every page so that each one takes a page fault */
		for (p = map; p < map + map_size; p += 4096)
			*(volatile unsigned long *)p = 0xdeadbeef;

		munmap(map, map_size);

		time0 = time1;
		clock_gettime(CLOCK_MONOTONIC, &ts);
		time1 = ts.tv_sec * 1000000000LLU + ts.tv_nsec;

		/* elapsed time in multiples of the interval, then its log2 */
		idx = (time1 - time0) / msecs / 1000000;
		idx = log2(idx);
		if (idx <= 1) {
			c = '.';
		} else {
			if (idx > 9)
				idx = 9;
			c = '0' + idx;
		}
		write(1, &c, 1);
	}
}
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
struct timespec ts_100s = { .tv_sec = 100 };
long mbytes, cnt;
void *map, *p;
int fd = -1;
int flags;

if (argc < 2 || (mbytes = atol(argv[1])) <= 0) {
fprintf(stderr, "test-membloat SIZE_IN_MBYTES [FILENAME]\n");
return 1;
}

if (argc >= 3) {
fd = open(argv[2], O_CREAT|O_TRUNC|O_RDWR, S_IRWXU);
if (fd < 0) {
perror("open");
return 1;
}

if (ftruncate(fd, mbytes << 20)) {
perror("ftruncate");
return 1;
}

flags = MAP_SHARED;
} else {
flags = MAP_ANONYMOUS | MAP_PRIVATE;
}

map = mmap(NULL, (size_t)mbytes << 20, PROT_READ | PROT_WRITE,
 

Re: [PATCH v2] tty: Allow stealing of controlling ttys within user namespaces

2014-01-24 Thread Eric W. Biederman
Seth Forshee  writes:

> root is allowed to steal ttys from other sessions, but it
> requires system-wide CAP_SYS_ADMIN and therefore is not possible
> for root within a user namespace. This should be allowed so long
> as the process doing the stealing is privileged towards the
> session which currently owns the tty.
>
> Update this code to only require CAP_SYS_ADMIN in the user
> namespaces of the target session's tasks, allowing the tty to be
> stolen from sessions whose tasks are in the same or lesser
> privileged user namespaces.

This code looks essentially correct.  I would like to look at it a bit
more before we merge it, just to ensure something silly hasn't been
missed, but the only thing that concerns me at this point is whether we
are checking the proper per-task bits.

The case I am currently worrying about is a task that does something
privileged, drops perms, sets dumpable and then calls setns() on the
userns.

So I think we may have to solve the dumpable problem at the same time as
we solve this issue.

Now I don't know if it makes sense to take this through the tty tree or
my userns tree.  I am inclined to take it through the userns tree simply
because I am reviewing it and I have seen the several failed attempts at
this but if Greg wants it in the tty tree I won't object.

What I do want to do is be especially careful with a patch like this so
we don't accidentally introduce a DAC policy hole, and cause security
problems for people.  Bugs like that don't do anyone any good.

> Cc: Serge Hallyn 
> Cc: "Eric W. Biederman" 
> Signed-off-by: Seth Forshee 
> ---
>  drivers/tty/tty_io.c | 31 +++
>  1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index c74a00a..558e6dc 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -2410,17 +2410,32 @@ static int tiocsctty(struct tty_struct *tty, int arg)
>* This tty is already the controlling
>* tty for another session group!
>*/
> - if (arg == 1 && capable(CAP_SYS_ADMIN)) {
> - /*
> -  * Steal it away
> -  */
> - read_lock(&tasklist_lock);
> - session_clear_tty(tty->session);
> - read_unlock(&tasklist_lock);
> - } else {
> + struct user_namespace *user_ns;
> + struct task_struct *p;
> +
> + if (arg != 1) {
>   ret = -EPERM;
>   goto unlock;
>   }
> +
> + read_lock(&tasklist_lock);
> + do_each_pid_task(tty->session, PIDTYPE_SID, p) {
> + rcu_read_lock();
> + user_ns = task_cred_xxx(p, user_ns);
> + if (!ns_capable(user_ns, CAP_SYS_ADMIN)) {
> + rcu_read_unlock();
> + read_unlock(&tasklist_lock);
> + ret = -EPERM;
> + goto unlock;
> + }
> + rcu_read_unlock();
> + } while_each_pid_task(tty->session, PIDTYPE_SID, p);
> +
> + /*
> +  * Steal it away
> +  */
> + session_clear_tty(tty->session);
> + read_unlock(&tasklist_lock);
>   }
>   proc_set_tty(current, tty);
>  unlock:


Re: [patch 0/2] mm: reduce reclaim stalls with heavy anon and dirty cache

2014-01-24 Thread Rik van Riel
On 01/24/2014 05:51 PM, Johannes Weiner wrote:
> On Fri, Jan 24, 2014 at 02:30:03PM -0800, Andrew Morton wrote:
>> On Fri, 24 Jan 2014 17:03:02 -0500 Johannes Weiner wrote:
>>
>>> Tejun reported stuttering and latency spikes on a system where random
>>> tasks would enter direct reclaim and get stuck on dirty pages.  Around
>>> 50% of memory was occupied by tmpfs backed by an SSD, and another disk
>>> (rotating) was reading and writing at max speed to shrink a partition.
>>
>> Do you think this is serious enough to squeeze these into 3.14?
> 
> We have been biasing towards cache reclaim at least as far back as the
> LRU split and we always considered anon dirtyable, so it's not really
> a *new* problem.  And there is a chance of regressing write bandwidth
> for certain workloads by effectively shrinking their dirty limit -
> although that is easily fixed by changing dirty_ratio.
> 
> On the other hand, the stuttering is pretty nasty (could reproduce it
> locally too) and the workload is not exactly esoteric.  Plus, I'm not
> sure if waiting will increase the test exposure.
> 
> So 3.14 would work for me, unless Mel and Rik have concerns.

3.14 would be fine, indeed.

On the other hand, if there are enough user reports of the stuttering
problem on older kernels, a -stable backport could be appropriate
too...

-- 
All rights reversed


Re: [patch 2/2] mm: page-writeback: do not count anon pages as dirtyable memory

2014-01-24 Thread Rik van Riel
On 01/24/2014 05:03 PM, Johannes Weiner wrote:
> The VM is currently heavily tuned to avoid swapping.  Whether that is
> good or bad is a separate discussion, but as long as the VM won't swap
> to make room for dirty cache, we can not consider anonymous pages when
> calculating the amount of dirtyable memory, the baseline to which
> dirty_background_ratio and dirty_ratio are applied.
> 
> A simple workload that occupies a significant size (40+%, depending on
> memory layout, storage speeds etc.) of memory with anon/tmpfs pages
> and uses the remainder for a streaming writer demonstrates this
> problem.  In that case, the actual cache pages are a small fraction of
> what is considered dirtyable overall, which results in a relatively
> large portion of the cache pages being dirtied.  As kswapd starts
> rotating these, random tasks enter direct reclaim and stall on IO.
> 
> Only consider free pages and file pages dirtyable.
> 
> Signed-off-by: Johannes Weiner 

Reviewed-by: Rik van Riel 
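
A small worked example of the effect described above.  The numbers are
hypothetical (an 8 GiB-class machine with roughly half its memory in
anon/tmpfs), dirty_ratio is taken as 20, and the reserve and highmem
handling are ignored, so this is only a sketch of the arithmetic.

```python
def dirty_limit(dirtyable_mib, dirty_ratio_pct):
    """Dirty threshold as a percentage of the dirtyable baseline."""
    return dirtyable_mib * dirty_ratio_pct // 100


# Hypothetical memory layout, in MiB.
free_and_file = 3500   # free pages plus file cache
anon = 4000            # anon/tmpfs pages the VM will not swap out
dirty_ratio = 20       # assumed vm.dirty_ratio setting

# Baseline that still counts anon pages as dirtyable.
limit_with_anon = dirty_limit(free_and_file + anon, dirty_ratio)

# Baseline restricted to free and file pages, as the patch proposes.
limit_without_anon = dirty_limit(free_and_file, dirty_ratio)

# Share of the free and file pages that may be dirty, in percent.
share_with_anon = limit_with_anon * 100 // free_and_file
share_without_anon = limit_without_anon * 100 // free_and_file
```

With anon counted, the limit works out to more than twice the configured
ratio when measured against the memory that can actually hold dirty cache.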


-- 
All rights reversed


Re: [PATCH] firmware/google: drop 'select EFI' to avoid recursive dependency

2014-01-24 Thread David Rientjes
On Fri, 24 Jan 2014, Joe Perches wrote:

> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> index 9c3986f..ef05ed6 100755
> --- a/scripts/get_maintainer.pl
> +++ b/scripts/get_maintainer.pl
> @@ -483,6 +483,13 @@ my %deduplicate_address_hash = ();
>  
>  my @maintainers = get_maintainers();
>  
> +if ($email_maintainer && !$interactive && !$email_git_blame &&
> +(!@maintainers || ($email_list && @maintainers == 1))) {
> +warn "$P: No maintainer found, trying harder, addresses may be stale...\n";
> +$email_git_blame = 1;
> +@maintainers = get_maintainers();
> +}
> +
>  if (@maintainers) {
>  @maintainers = merge_email(@maintainers);
>  output(@maintainers);

Works well and has good advice on how emails may be stale, thanks Joe!


Re: [PATCH V4 5/5] Documentation: power: reset: Add documentation for generic SYSCON reboot driver

2014-01-24 Thread Marc C
Hi Mark,

>> diff --git a/Documentation/devicetree/bindings/power/reset/syscon-reboot.txt b/Documentation/devicetree/bindings/power/reset/syscon-reboot.txt
>> new file mode 100644
>> index 000..e9eb1fe
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/power/reset/syscon-reboot.txt
>> @@ -0,0 +1,16 @@
>> +Generic SYSCON mapped register reset driver
>
> Bindings should describe hardware, not drivers

In a perfect world, the hardware designers would place _all_ of the registers
needed to support rebooting in a contiguous section of the memory map. However,
this isn't the case on some platforms, especially on ARM-based SoCs.

While I completely agree with you that the bindings describe hardware, I don't
see how Feng's work is contrary to that. Feng is collecting an otherwise
"random" set of registers into one logical group. In this case, Feng is uniting
a group of registers and calling them the "reboot" register block.

> What's wrong with having a system clock unit binding, that the kernel
> can decompose as appropriate?

From what I understand, the arm-soc maintainers want to reduce (and perhaps
even eliminate) these board-specific constructs, and try to utilize common
driver code that resides in the "drivers" folder. I can vouch for the
syscon/regmap framework as something which would enable the effort.

Thanks,
Marc C

On 01/24/2014 10:23 AM, Mark Rutland wrote:
> On Fri, Jan 24, 2014 at 06:03:10PM +, Feng Kan wrote:
>> On Fri, Jan 24, 2014 at 3:39 AM, Mark Rutland  wrote:
>>> On Thu, Jan 23, 2014 at 07:20:01PM +, Feng Kan wrote:
>>>> Add documentation for generic SYSCON reboot driver.
>>>>
>>>> Signed-off-by: Feng Kan
>>>> ---
>>>>  .../bindings/power/reset/syscon-reboot.txt |   16
>>>>  1 files changed, 16 insertions(+), 0 deletions(-)
>>>>  create mode 100644 Documentation/devicetree/bindings/power/reset/syscon-reboot.txt
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/power/reset/syscon-reboot.txt b/Documentation/devicetree/bindings/power/reset/syscon-reboot.txt
>>>> new file mode 100644
>>>> index 000..e9eb1fe
>>>> --- /dev/null
>>>> +++ b/Documentation/devicetree/bindings/power/reset/syscon-reboot.txt
>>>> @@ -0,0 +1,16 @@
>>>> +Generic SYSCON mapped register reset driver
>>>
>>> Bindings should describe hardware, not drivers.
>>>
>>> What precisely does this binding describe?
>>>
>>>> +
>>>> +Required properties:
>>>> +- compatible: should contain "syscon-reboot"
>>>> +- regmap: this is phandle to the register map node
>>>> +- offset: offset in the register map for the reboot register
>>>> +- mask: the reset value written to the reboot register
>>>> +
>>>> +Examples:
>>>> +
>>>> +reboot {
>>>> +   compatible = "syscon-reboot";
>>>> +   regmap = <>;
>>>> +   offset = <0x0>;
>>>> +   mask = <0x1>;
>>>> +};
>>>
>>> Access size? Endianness?
>> FKAN: are you asking for documentation? I don't see a lot of examples of
>> support for these.
> 
> If I used the snippet in the example, what endianness and access size
> should I expect an OS to perform? That should be documented.
> 
> If this doesn't match the general case, we can add properties later to
> adjust the access size and/or endianness. We just need to document what
> the binding actually describes currently, or it's not possible to
> implement anything based off of the binding documentation.
> 
> I should be able to read a binding document and write a dts. I shouldn't
> have to read the code to figure out what the binding describes.
> 
>>
>>>
>>> Why can we not have a binding for the register bank this exists in, and
>>> have that pass on the appropriate details to a syscon-reboot driver?
>>
>> FKAN: That's a good idea. But the hardware in this case, the SCU (system
>> clock unit), has a bunch of registers used for different functions. If syscon
>> is used a lot in this case and we pile more attributes into it, it would get
>> kind of messy after a while.
> 
> Huh?
> 
> What's wrong with having a system clock unit binding, that the kernel
> can decompose as appropriate?
> 
> I don't get your syscon argument.
> 
> Thanks,
> Mark.
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
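
To make the access-size and endianness question quoted above concrete, here
is a minimal sketch of what a consumer of the example binding might do with
offset = <0x0> and mask = <0x1>.  The register block is modelled as a
bytearray; the function name and the width and little_endian parameters are
assumptions, precisely because the binding text does not specify them.

```python
import struct

# struct format codes for 8-, 16- and 32-bit unsigned accesses.
_FORMATS = {1: "B", 2: "H", 4: "I"}


def write_reboot_register(regs, offset, mask, width=4, little_endian=True):
    """Write `mask` at `offset` in a modelled register block.

    `width` (bytes) and `little_endian` are hypothetical knobs standing in
    for details the quoted binding leaves undocumented.
    """
    fmt = ("<" if little_endian else ">") + _FORMATS[width]
    struct.pack_into(fmt, regs, offset, mask)
    return regs


# Same binding values, two different byte patterns on the bus.
as_le32 = bytes(write_reboot_register(bytearray(4), 0x0, 0x1))
as_be32 = bytes(write_reboot_register(bytearray(4), 0x0, 0x1,
                                      little_endian=False))
```

The two results differ in which byte carries the set bit, which is why the
expected access width and byte order belong in the binding document.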



Re: [patch 1/2] mm: page-writeback: fix dirty_balance_reserve subtraction from dirtyable memory

2014-01-24 Thread Rik van Riel
On 01/24/2014 05:03 PM, Johannes Weiner wrote:
> The dirty_balance_reserve is an approximation of the fraction of free
> pages that the page allocator does not make available for page cache
> allocations.  As a result, it has to be taken into account when
> calculating the amount of "dirtyable memory", the baseline to which
> dirty_background_ratio and dirty_ratio are applied.
> 
> However, currently the reserve is subtracted from the sum of free and
> reclaimable pages, which is non-sensical and leads to erroneous
> results when the system is dominated by unreclaimable pages and the
> dirty_balance_reserve is bigger than free+reclaimable.  In that case,
> at least the already allocated cache should be considered dirtyable.
> 
> Fix the calculation by subtracting the reserve from the amount of free
> pages, then adding the reclaimable pages on top.
> 
> Signed-off-by: Johannes Weiner 

Reviewed-by: Rik van Riel 
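
The difference between the two orderings can be sketched with plain
arithmetic.  The function names and page counts below are hypothetical, and
the clamp at zero is an assumption about how the subtraction is bounded; the
point is only where the reserve is subtracted.

```python
def dirtyable_before(free, reclaimable, reserve):
    """Reserve subtracted from free + reclaimable (clamped at zero)."""
    total = free + reclaimable
    return total - min(total, reserve)


def dirtyable_after(free, reclaimable, reserve):
    """Reserve subtracted from free pages only, reclaimable added on top."""
    return free - min(free, reserve) + reclaimable


# System dominated by unreclaimable pages: the reserve exceeds
# free + reclaimable (all values are page counts).
before = dirtyable_before(free=100, reclaimable=300, reserve=500)
after = dirtyable_after(free=100, reclaimable=300, reserve=500)

# Plenty of free memory: both orderings agree.
before_roomy = dirtyable_before(free=1000, reclaimable=300, reserve=200)
after_roomy = dirtyable_after(free=1000, reclaimable=300, reserve=200)
```

In the first case the old ordering leaves nothing dirtyable, while the new
one still counts the already allocated cache.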

-- 
All rights reversed


RE: [PATCH 0/5] ia64 ski emulator patches

2014-01-24 Thread Luck, Tony
Mikulas:
>> Here I'm sending some ia64 patches to make it work in the ski emulator. 
>> This has been broken for a long time.

Thanks - There are questions from time to time on how to test ia64
for those people who do not have hardware.

Mikael:
> Thanks.  I've recently started running 3.x kernels on ia64 via ski,
> but I'm getting random kernel crashes with 3.13.  I'll give your
> patches a try shortly.

Let me know how that goes - I haven't used ski in a decade and
have quite forgotten how to set it up.

> I've written a few patches to improve other aspects of running the
> kernel on ski:
> - kernel patch to turn PAL_HALT_LIGHT into a new SSC_HALT_LIGHT,
>   and a corresponding ski patch to pause() on SSC_HALT_LIGHT; this
>   together with the fixed-frequency ITC patch allows ski to idle
>   with very low host CPU overhead when the guest kernel idles
> - kernel patch to bump the RAM size from 130MB to 2GB
>
> I'd be happy to share these patches if there's interest in them.

It seems that there are at least two of you out there - so I'm happy
to take kernel patches that make things better.  Not sure where the
ski patches go - is someone maintaining that?

-Tony


[GIT PULL] user namespaces work for 3.14-rc1

2014-01-24 Thread Eric W. Biederman

Linus,

Please pull the for-linus branch from the git tree:

   git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git for-linus

   HEAD: f58437f1f9161847c636e4fed5569ed5b908af36 MIPS: VPE: Remove vpe_getuid and vpe_getgid

The work to convert the kernel to use kuid_t and kgid_t has been
finished since 3.12 so it is time to remove the scaffolding that allowed
the work to progress incrementally.

The first patch on this branch just removes the scaffolding, ensuring we
will always get compile errors if people accidentally mix the userspace
and the kernel uid and gid types.  The second patch removes an overlooked
and unused chunk of mips code that fails to build after the first
patch.

The code hasn't been in linux-next for long (as I was out of it and could
not shepherd the code properly) but the patch has been around for a
long time just waiting for the day when I had finished the uid/gid
conversions.  Putting the code in linux-next did find the compile
failure on mips so I took the time to get that fix reviewed and
included.  Beyond that I am not too worried about errors because all
these two patches do is delete a modest amount of code.

Eric


Eric W. Biederman (2):
  userns:  userns: Remove UIDGID_STRICT_TYPE_CHECKS
  MIPS: VPE: Remove vpe_getuid and vpe_getgid

 arch/mips/include/asm/vpe.h |2 --
 arch/mips/kernel/vpe.c  |   28 
 include/linux/posix_acl.h   |3 ---
 include/linux/projid.h  |   15 ---
 include/linux/uidgid.h  |   22 --
 init/Kconfig|   11 ---
 6 files changed, 0 insertions(+), 81 deletions(-)




Re: [PATCH] firmware/google: drop 'select EFI' to avoid recursive dependency

2014-01-24 Thread Joe Perches
On Fri, 2014-01-24 at 02:27 -0800, David Rientjes wrote:
> On Thu, 23 Jan 2014, Joe Perches wrote:
> 
> > > > get_maintainer's default output should answer the question "who do I
> > > > email about this file", and that ain't working :(
> > 
> > Complaints cheerfully ignored.
> > Suggestions gratefully accepted.
> > 
> > Files that haven't had changes in a long time
> > generally aren't maintained.
> > 
> > Old addresses frequently become stale and bounce.
> > 
> > It'd be better if there was a MAINTAINERS entry
> > for drivers/firmware/google.
> > 
> 
> I think scripts/get_maintainer.pl is only really useful for emailing 
> patches so I think outputting at least somebody to cc on patches would be 
> a good idea.  It doesn't necessarily need to be someone who maintains the 
> code and pushes it to Linus.

Very very few people listed in MAINTAINERS actually push to Linus.

> I'm not sure how much runtime is a factor for users of the script, but
> falling back to git-blame behavior to at least get one or two cc's sounds
> appropriate.  If the email address is outdated, oh well, we live and learn.

Maybe something like this would work.
It uses git-blame whenever no maintainers are found.
---
 scripts/get_maintainer.pl | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 9c3986f..ef05ed6 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -483,6 +483,13 @@ my %deduplicate_address_hash = ();
 
 my @maintainers = get_maintainers();
 
+if ($email_maintainer && !$interactive && !$email_git_blame &&
+(!@maintainers || ($email_list && @maintainers == 1))) {
+warn "$P: No maintainer found, trying harder, addresses may be stale...\n";
+$email_git_blame = 1;
+@maintainers = get_maintainers();
+}
+
 if (@maintainers) {
 @maintainers = merge_email(@maintainers);
 output(@maintainers);
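
The fallback condition in the patch above can be modelled as follows.  This
is a sketch, not a port of the script: find_recipients and lookup are
hypothetical names, the warning is omitted, and the reading that a single
result with list addresses enabled means "only a mailing list was found" is
an assumption.

```python
def find_recipients(lookup, email_maintainer=True, interactive=False,
                    email_list=True, email_git_blame=False):
    """Return recipients, retrying with git-blame when nothing useful is found.

    `lookup(use_git_blame)` stands in for get_maintainers() and returns a
    list of addresses.
    """
    found = lookup(email_git_blame)
    only_one_with_lists = email_list and len(found) == 1
    if (email_maintainer and not interactive and not email_git_blame
            and (not found or only_one_with_lists)):
        # Second pass with git-blame enabled; addresses may be stale.
        found = lookup(True)
    return found


calls = []


def fake_lookup(use_git_blame):
    """Test double: only the list is known unless git-blame is consulted."""
    calls.append(use_git_blame)
    if use_git_blame:
        return ["someone@example.org", "linux-kernel@vger.kernel.org"]
    return ["linux-kernel@vger.kernel.org"]


recipients = find_recipients(fake_lookup)
```

Interactive runs and runs that already requested git-blame are left alone,
so the extra pass only happens when the default output would be unhelpful.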





Re: [patch 0/2] mm: reduce reclaim stalls with heavy anon and dirty cache

2014-01-24 Thread Johannes Weiner
On Fri, Jan 24, 2014 at 02:30:03PM -0800, Andrew Morton wrote:
> On Fri, 24 Jan 2014 17:03:02 -0500 Johannes Weiner  wrote:
> 
> > Tejun reported stuttering and latency spikes on a system where random
> > tasks would enter direct reclaim and get stuck on dirty pages.  Around
> > 50% of memory was occupied by tmpfs backed by an SSD, and another disk
> > (rotating) was reading and writing at max speed to shrink a partition.
> 
> Do you think this is serious enough to squeeze these into 3.14?

We have been biasing towards cache reclaim at least as far back as the
LRU split and we always considered anon dirtyable, so it's not really
a *new* problem.  And there is a chance of regressing write bandwidth
for certain workloads by effectively shrinking their dirty limit -
although that is easily fixed by changing dirty_ratio.

On the other hand, the stuttering is pretty nasty (could reproduce it
locally too) and the workload is not exactly esoteric.  Plus, I'm not
sure if waiting will increase the test exposure.

So 3.14 would work for me, unless Mel and Rik have concerns.


[PATCH 11/11] rel-html: update documentation on contribution process using the DCO

2014-01-24 Thread Luis R. Rodriguez
The DCO conversation is over and we no longer have to rely on some
questionable URL / project / etc. After discussions with folks from
the Linux Foundation we now have a reasonable document and home page
for the DCO as a project in itself; any project can embrace this DCO.

The shiny new DCO project page:

http://developercertificate.org/

Cc: W. Trevor King 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 CONTRIBUTING | 54 --
 1 file changed, 48 insertions(+), 6 deletions(-)

diff --git a/CONTRIBUTING b/CONTRIBUTING
index 194038e..f288959 100644
--- a/CONTRIBUTING
+++ b/CONTRIBUTING
@@ -1,7 +1,49 @@
-This project tracks patch provenance and licensing using the Developer
-Certificate of Origin and Signed-off-by tags initially developed by
-the Linux kernel project.  Because the documentation for this
-procedure is licensed under the GPLv2, we have chosen not to include
-it in our project directly.  Instead, please see:
 
-  http://git.tremily.us/?p=signed-off-by.git;a=blob;f=Documentation/SubmittingPatches;h=34055986ab836553896f091225448c448a4cc62c;hb=refs/heads/signed-off-by
+This project embraces the Developer Certificate of Origin (DCO) for
+contributions. This means you must agree to the following prior to submitting
+patches. If you agree with this developer certificate, you acknowledge this by
+adding a Signed-off-by tag to your patch commit log. Every submitted patch
+must have this.
+
+The source for the DCO:
+
+http://developercertificate.org/
+
+---
+
+Developer Certificate of Origin
+Version 1.1
+
+Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
+660 York Street, Suite 102,
+San Francisco, CA 94110 USA
+
+Everyone is permitted to copy and distribute verbatim copies of this
+license document, but changing it is not allowed.
+
+
+Developer's Certificate of Origin 1.1
+
+By making a contribution to this project, I certify that:
+
+(a) The contribution was created in whole or in part by me and I
+have the right to submit it under the open source license
+indicated in the file; or
+
+(b) The contribution is based upon previous work that, to the best
+of my knowledge, is covered under an appropriate open source
+license and I have the right under that license to submit that
+work with modifications, whether created in whole or in part
+by me, under the same open source license (unless I am
+permitted to submit under a different license), as indicated
+in the file; or
+
+(c) The contribution was provided directly to me by some other
+person who certified (a), (b) or (c) and I have not modified
+it.
+
+(d) I understand and agree that this project and the contribution
+are public and that a record of the contribution (including all
+personal information I submit with it, including my sign-off) is
+maintained indefinitely and may be redistributed consistent with
+this project or the open source license(s) involved.
-- 
1.8.5.2



[PATCH] mmc: sdhci-bcm-kona: Add basic use of clocks

2014-01-24 Thread Tim Kryger
Enable the external clock needed by the host controller during the
probe and disable it during the remove.

Signed-off-by: Tim Kryger 
Reviewed-by: Markus Mayer 
Reviewed-by: Matt Porter 
Reviewed-by: Christian Daudt 
---

This was dropped from "Update Kona drivers to use clocks" series so the
remaining patches could get into v3.14:

https://lkml.org/lkml/2014/1/24/286

Without this change the bcm28155_ap board can not be configured to use
the bcm281xx clock driver since unused clocks will be disabled during
late init.

 drivers/mmc/host/sdhci-bcm-kona.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci-bcm-kona.c b/drivers/mmc/host/sdhci-bcm-kona.c
index 7a190fe..923eefa 100644
--- a/drivers/mmc/host/sdhci-bcm-kona.c
+++ b/drivers/mmc/host/sdhci-bcm-kona.c
@@ -54,6 +54,7 @@
 
 struct sdhci_bcm_kona_dev {
       struct mutex    write_lock; /* protect back to back writes */
+   struct clk  *external_clk;
 };
 
 
@@ -257,6 +258,24 @@ static int sdhci_bcm_kona_probe(struct platform_device *pdev)
goto err_pltfm_free;
}
 
+   /* Get and enable the external clock */
+   kona_dev->external_clk = devm_clk_get(dev, NULL);
+   if (IS_ERR(kona_dev->external_clk)) {
+   dev_err(dev, "Failed to get external clock\n");
+   ret = PTR_ERR(kona_dev->external_clk);
+   goto err_pltfm_free;
+   }
+
+   if (clk_set_rate(kona_dev->external_clk, host->mmc->f_max) != 0) {
+   dev_err(dev, "Failed to set rate external clock\n");
+   goto err_pltfm_free;
+   }
+
+   if (clk_prepare_enable(kona_dev->external_clk) != 0) {
+   dev_err(dev, "Failed to enable external clock\n");
+   goto err_pltfm_free;
+   }
+
dev_dbg(dev, "non-removable=%c\n",
(host->mmc->caps & MMC_CAP_NONREMOVABLE) ? 'Y' : 'N');
dev_dbg(dev, "cd_gpio %c, wp_gpio %c\n",
@@ -271,7 +290,7 @@ static int sdhci_bcm_kona_probe(struct platform_device *pdev)
 
ret = sdhci_bcm_kona_sd_reset(host);
if (ret)
-   goto err_pltfm_free;
+   goto err_clk_disable;
 
sdhci_bcm_kona_sd_init(host);
 
@@ -307,6 +326,9 @@ err_remove_host:
 err_reset:
sdhci_bcm_kona_sd_reset(host);
 
+err_clk_disable:
+   clk_disable_unprepare(kona_dev->external_clk);
+
 err_pltfm_free:
sdhci_pltfm_free(pdev);
 
@@ -316,7 +338,18 @@ err_pltfm_free:
 
 static int __exit sdhci_bcm_kona_remove(struct platform_device *pdev)
 {
-   return sdhci_pltfm_unregister(pdev);
+   struct sdhci_host *host = platform_get_drvdata(pdev);
+   struct sdhci_pltfm_host *pltfm_priv = sdhci_priv(host);
+   struct sdhci_bcm_kona_dev *kona_dev = sdhci_pltfm_priv(pltfm_priv);
+   int dead = (readl(host->ioaddr + SDHCI_INT_STATUS) == 0xffffffff);
+
+   sdhci_remove_host(host, dead);
+
+   clk_disable_unprepare(kona_dev->external_clk);
+
+   sdhci_pltfm_free(pdev);
+
+   return 0;
 }
 
 static struct platform_driver sdhci_bcm_kona_driver = {
-- 
1.8.0.1



Re: [patch 0/2] mm: reduce reclaim stalls with heavy anon and dirty cache

2014-01-24 Thread Andrew Morton
On Fri, 24 Jan 2014 17:03:02 -0500 Johannes Weiner  wrote:

> Tejun reported stuttering and latency spikes on a system where random
> tasks would enter direct reclaim and get stuck on dirty pages.  Around
> 50% of memory was occupied by tmpfs backed by an SSD, and another disk
> (rotating) was reading and writing at max speed to shrink a partition.

Do you think this is serious enough to squeeze these into 3.14?


Re: [patch 0/2] mm: reduce reclaim stalls with heavy anon and dirty cache

2014-01-24 Thread Tejun Heo
Hello,

On Fri, Jan 24, 2014 at 05:03:02PM -0500, Johannes Weiner wrote:
> These two patches fix the dirtyable memory calculation to acknowledge
> the fact that the VM does not really replace anon with dirty cache.
> As such, anon memory can no longer be considered "dirtyable."
> 
> Longer term we probably want to look into reducing some of the bias
> towards cache.  The problematic workload in particular was not even
> using any of the anon pages, one swap burst could have resolved it.

For both patches,

 Tested-by: Tejun Heo 

I don't have much idea what's going on here, but the problem was
pretty ridiculous.  It's an 8gig machine w/ one ssd and 10k rpm
harddrive and I could reliably reproduce constant stuttering every
several seconds for as long as buffered IO was going on on the hard
drive either with tmpfs occupying somewhere above 4gig or a test
program which allocates about the same amount of anon memory.
Although swap usage was zero, turning off swap also made the problem
go away.

The trigger conditions seem quite plausible - high anon memory usage
w/ heavy buffered IO and swap configured - and it's highly likely that
this is happening in the wild too.  (this can happen with copying
large files to usb sticks too, right?)

So, if this is the right fix && can be determined not to cause
noticeable regressions, it probably is worthwhile to cc -stable.

Thanks a lot!

-- 
tejun


[PATCH 2/4] crypto: ccp - Move HMAC calculation down to ccp ops file

2014-01-24 Thread Tom Lendacky
Move the support to perform an HMAC calculation into
the CCP operations file.  This eliminates the need to
perform a synchronous SHA operation used to calculate
the HMAC.
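
For reference, the final HMAC step being relocated is the outer hash over
the opad block followed by the inner digest, which is what the removed
ccp_sha_finish_hmac() in the diff below feeds to the hash.  A minimal
software sketch of that two-pass construction, assuming SHA-256 and the
standard 0x36/0x5c pad bytes (the function name is hypothetical and this
says nothing about how the CCP hardware performs it):

```python
import hashlib
import hmac


def hmac_sha256_two_pass(key, msg):
    """HMAC as two plain hash passes: H(opad || H(ipad || msg))."""
    block_size = hashlib.sha256().block_size  # 64 bytes for SHA-256
    if len(key) > block_size:
        # Keys longer than one block are hashed first.
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")

    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)

    # Inner pass: the running SHA context over ipad and the message.
    inner = hashlib.sha256(ipad + msg).digest()
    # Outer pass: opad block followed by the inner digest.
    return hashlib.sha256(opad + inner).digest()


digest = hmac_sha256_two_pass(b"key", b"message")
reference = hmac.new(b"key", b"message", hashlib.sha256).digest()
```

The outer pass only needs the opad block and the inner digest, so it can be
chained onto the end of the SHA operation instead of being issued as a
separate synchronous hash request.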

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-sha.c |  130 +++
 drivers/crypto/ccp/ccp-crypto.h |8 +-
 drivers/crypto/ccp/ccp-ops.c|  104 
 include/linux/ccp.h |7 ++
 4 files changed, 139 insertions(+), 110 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-sha.c b/drivers/crypto/ccp/ccp-crypto-sha.c
index 3867290..873f234 100644
--- a/drivers/crypto/ccp/ccp-crypto-sha.c
+++ b/drivers/crypto/ccp/ccp-crypto-sha.c
@@ -24,75 +24,10 @@
 #include "ccp-crypto.h"
 
 
-struct ccp_sha_result {
-   struct completion completion;
-   int err;
-};
-
-static void ccp_sync_hash_complete(struct crypto_async_request *req, int err)
-{
-   struct ccp_sha_result *result = req->data;
-
-   if (err == -EINPROGRESS)
-   return;
-
-   result->err = err;
-   complete(&result->completion);
-}
-
-static int ccp_sync_hash(struct crypto_ahash *tfm, u8 *buf,
-struct scatterlist *sg, unsigned int len)
-{
-   struct ccp_sha_result result;
-   struct ahash_request *req;
-   int ret;
-
-   init_completion(&result.completion);
-
-   req = ahash_request_alloc(tfm, GFP_KERNEL);
-   if (!req)
-   return -ENOMEM;
-
-   ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
-  ccp_sync_hash_complete, &result);
-   ahash_request_set_crypt(req, sg, buf, len);
-
-   ret = crypto_ahash_digest(req);
-   if ((ret == -EINPROGRESS) || (ret == -EBUSY)) {
-   ret = wait_for_completion_interruptible(&result.completion);
-   if (!ret)
-   ret = result.err;
-   }
-
-   ahash_request_free(req);
-
-   return ret;
-}
-
-static int ccp_sha_finish_hmac(struct crypto_async_request *async_req)
-{
-   struct ahash_request *req = ahash_request_cast(async_req);
-   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
-   struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
-   struct ccp_sha_req_ctx *rctx = ahash_request_ctx(req);
-   struct scatterlist sg[2];
-   unsigned int block_size =
-   crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
-   unsigned int digest_size = crypto_ahash_digestsize(tfm);
-
-   sg_init_table(sg, ARRAY_SIZE(sg));
-   sg_set_buf(&sg[0], ctx->u.sha.opad, block_size);
-   sg_set_buf(&sg[1], rctx->ctx, digest_size);
-
-   return ccp_sync_hash(ctx->u.sha.hmac_tfm, req->result, sg,
-block_size + digest_size);
-}
-
 static int ccp_sha_complete(struct crypto_async_request *async_req, int ret)
 {
struct ahash_request *req = ahash_request_cast(async_req);
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
-   struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
struct ccp_sha_req_ctx *rctx = ahash_request_ctx(req);
unsigned int digest_size = crypto_ahash_digestsize(tfm);
 
@@ -112,10 +47,6 @@ static int ccp_sha_complete(struct crypto_async_request 
*async_req, int ret)
if (req->result)
memcpy(req->result, rctx->ctx, digest_size);
 
-   /* If we're doing an HMAC, we need to perform that on the final op */
-   if (rctx->final && ctx->u.sha.key_len)
-   ret = ccp_sha_finish_hmac(async_req);
-
 e_free:
sg_free_table(&rctx->data_sg);
 
@@ -126,6 +57,7 @@ static int ccp_do_sha_update(struct ahash_request *req, 
unsigned int nbytes,
 unsigned int final)
 {
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+   struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
struct ccp_sha_req_ctx *rctx = ahash_request_ctx(req);
struct scatterlist *sg;
unsigned int block_size =
@@ -196,6 +128,11 @@ static int ccp_do_sha_update(struct ahash_request *req, 
unsigned int nbytes,
rctx->cmd.u.sha.ctx_len = sizeof(rctx->ctx);
rctx->cmd.u.sha.src = sg;
rctx->cmd.u.sha.src_len = rctx->hash_cnt;
+   rctx->cmd.u.sha.opad = ctx->u.sha.key_len ?
+   &ctx->u.sha.opad_sg : NULL;
+   rctx->cmd.u.sha.opad_len = ctx->u.sha.key_len ?
+   ctx->u.sha.opad_count : 0;
+   rctx->cmd.u.sha.first = rctx->first;
rctx->cmd.u.sha.final = rctx->final;
rctx->cmd.u.sha.msg_bits = rctx->msg_bits;
 
@@ -218,7 +155,6 @@ static int ccp_sha_init(struct ahash_request *req)
 
memset(rctx, 0, sizeof(*rctx));
 
-   memcpy(rctx->ctx, alg->init, sizeof(rctx->ctx));
rctx->type = alg->type;
rctx->first = 1;
 
@@ -261,10 +197,13 @@ static int ccp_sha_setkey(struct crypto_ahash *tfm, const 
u8 *key,
  unsigned int key_len)
 {
struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ahash_tfm(tfm));
-   struct scatterlist sg;
-   unsigned int block_size 

[PATCH 4/4] crypto: ccp - Perform completion callbacks using a tasklet

2014-01-24 Thread Tom Lendacky
Change from scheduling work to scheduling a tasklet to perform
the callback operations.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-dev.c |   21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-dev.c b/drivers/crypto/ccp/ccp-dev.c
index c3bc212..2c78161 100644
--- a/drivers/crypto/ccp/ccp-dev.c
+++ b/drivers/crypto/ccp/ccp-dev.c
@@ -30,6 +30,11 @@ MODULE_LICENSE("GPL");
 MODULE_VERSION("1.0.0");
 MODULE_DESCRIPTION("AMD Cryptographic Coprocessor driver");
 
+struct ccp_tasklet_data {
+   struct completion completion;
+   struct ccp_cmd *cmd;
+};
+
 
 static struct ccp_device *ccp_dev;
 static inline struct ccp_device *ccp_get_device(void)
@@ -192,17 +197,23 @@ static struct ccp_cmd *ccp_dequeue_cmd(struct 
ccp_cmd_queue *cmd_q)
return cmd;
 }
 
-static void ccp_do_cmd_complete(struct work_struct *work)
+static void ccp_do_cmd_complete(unsigned long data)
 {
-   struct ccp_cmd *cmd = container_of(work, struct ccp_cmd, work);
+   struct ccp_tasklet_data *tdata = (struct ccp_tasklet_data *)data;
+   struct ccp_cmd *cmd = tdata->cmd;
 
cmd->callback(cmd->data, cmd->ret);
+   complete(&tdata->completion);
 }
 
 static int ccp_cmd_queue_thread(void *data)
 {
struct ccp_cmd_queue *cmd_q = (struct ccp_cmd_queue *)data;
struct ccp_cmd *cmd;
+   struct ccp_tasklet_data tdata;
+   struct tasklet_struct tasklet;
+
+   tasklet_init(&tasklet, ccp_do_cmd_complete, (unsigned long)&tdata);
 
set_current_state(TASK_INTERRUPTIBLE);
while (!kthread_should_stop()) {
@@ -220,8 +231,10 @@ static int ccp_cmd_queue_thread(void *data)
cmd->ret = ccp_run_cmd(cmd_q, cmd);
 
/* Schedule the completion callback */
-   INIT_WORK(&cmd->work, ccp_do_cmd_complete);
-   schedule_work(&cmd->work);
+   tdata.cmd = cmd;
+   init_completion(&tdata.completion);
+   tasklet_schedule(&tasklet);
+   wait_for_completion(&tdata.completion);
}
 
__set_current_state(TASK_RUNNING);




[PATCH 3/4] crypto: ccp - Use a single queue for proper ordering of tfm requests

2014-01-24 Thread Tom Lendacky
Move to a single queue to serialize requests within a tfm. When
testing using IPSec with a large number of network connections
the per cpu tfm queuing logic was not working properly.
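
The ordering rule this patch enforces can be modelled in user space. The following is a minimal sketch only, not the driver code: struct cmd_queue, enqueue() and complete_cmd() are hypothetical names, the backlog handling and the spinlock are left out, and the "hold a cmd while an earlier cmd for the same tfm is still queued" rule on the enqueue side is an assumption, since the enqueue path is not visible in this message.

```c
#include <stddef.h>

#define MAX_CMDS 8

/* Hypothetical single queue: cmds are kept in arrival order. */
struct cmd_queue {
    int tfm[MAX_CMDS];        /* tfm that owns each queued cmd */
    int submitted[MAX_CMDS];  /* 1 = handed to the device, 0 = held */
    size_t count;
};

/*
 * Queue a cmd for the given tfm.  It is submitted right away only if no
 * earlier cmd for the same tfm is still queued; otherwise it is held.
 * Returns the slot index, or -1 if the queue is full.
 */
static int enqueue(struct cmd_queue *q, int tfm)
{
    size_t i;
    int busy = 0;

    if (q->count == MAX_CMDS)
        return -1;

    for (i = 0; i < q->count; i++)
        if (q->tfm[i] == tfm)
            busy = 1;

    q->tfm[q->count] = tfm;
    q->submitted[q->count] = !busy;
    return (int)q->count++;
}

/*
 * Complete the cmd in slot idx (caller guarantees idx < q->count).
 * The entry is removed and the next held cmd of the same tfm, which
 * always sits after it in the queue, is released for submission.
 * Returns the new slot of the released cmd, or -1 if none was held.
 */
static int complete_cmd(struct cmd_queue *q, size_t idx)
{
    int tfm = q->tfm[idx];
    size_t i;

    /* Remove the entry while keeping arrival order. */
    for (i = idx; i + 1 < q->count; i++) {
        q->tfm[i] = q->tfm[i + 1];
        q->submitted[i] = q->submitted[i + 1];
    }
    q->count--;

    /* Entries that followed the completed cmd now start at idx. */
    for (i = idx; i < q->count; i++) {
        if (q->tfm[i] == tfm) {
            q->submitted[i] = 1;
            return (int)i;
        }
    }
    return -1;
}
```

With one queue, the "is an earlier cmd for this tfm still pending" test sees every request for the tfm regardless of the CPU it was issued on, which the per-cpu lists could not guarantee.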

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-main.c |  164 ++
 1 file changed, 48 insertions(+), 116 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-main.c 
b/drivers/crypto/ccp/ccp-crypto-main.c
index b3f22b0..010fded 100644
--- a/drivers/crypto/ccp/ccp-crypto-main.c
+++ b/drivers/crypto/ccp/ccp-crypto-main.c
@@ -38,23 +38,20 @@ MODULE_PARM_DESC(sha_disable, "Disable use of SHA - any 
non-zero value");
 static LIST_HEAD(hash_algs);
 static LIST_HEAD(cipher_algs);
 
-/* For any tfm, requests for that tfm on the same CPU must be returned
- * in the order received.  With multiple queues available, the CCP can
- * process more than one cmd at a time.  Therefore we must maintain
- * a cmd list to insure the proper ordering of requests on a given tfm/cpu
- * combination.
+/* For any tfm, requests for that tfm must be returned on the order
+ * received.  With multiple queues available, the CCP can process more
+ * than one cmd at a time.  Therefore we must maintain a cmd list to insure
+ * the proper ordering of requests on a given tfm.
  */
-struct ccp_crypto_cpu_queue {
+struct ccp_crypto_queue {
struct list_head cmds;
struct list_head *backlog;
unsigned int cmd_count;
 };
-#define CCP_CRYPTO_MAX_QLEN50
+#define CCP_CRYPTO_MAX_QLEN100
 
-struct ccp_crypto_percpu_queue {
-   struct ccp_crypto_cpu_queue __percpu *cpu_queue;
-};
-static struct ccp_crypto_percpu_queue req_queue;
+static struct ccp_crypto_queue req_queue;
+static spinlock_t req_queue_lock;
 
 struct ccp_crypto_cmd {
struct list_head entry;
@@ -71,8 +68,6 @@ struct ccp_crypto_cmd {
 
/* Used for held command processing to determine state */
int ret;
-
-   int cpu;
 };
 
 struct ccp_crypto_cpu {
@@ -91,25 +86,21 @@ static inline bool ccp_crypto_success(int err)
return true;
 }
 
-/*
- * ccp_crypto_cmd_complete must be called while running on the appropriate
- * cpu and the caller must have done a get_cpu to disable preemption
- */
 static struct ccp_crypto_cmd *ccp_crypto_cmd_complete(
struct ccp_crypto_cmd *crypto_cmd, struct ccp_crypto_cmd **backlog)
 {
-   struct ccp_crypto_cpu_queue *cpu_queue;
struct ccp_crypto_cmd *held = NULL, *tmp;
+   unsigned long flags;
 
*backlog = NULL;
 
-   cpu_queue = this_cpu_ptr(req_queue.cpu_queue);
+   spin_lock_irqsave(&req_queue_lock, flags);
 
/* Held cmds will be after the current cmd in the queue so start
 * searching for a cmd with a matching tfm for submission.
 */
tmp = crypto_cmd;
-   list_for_each_entry_continue(tmp, &cpu_queue->cmds, entry) {
+   list_for_each_entry_continue(tmp, &req_queue.cmds, entry) {
if (crypto_cmd->tfm != tmp->tfm)
continue;
held = tmp;
@@ -120,47 +111,45 @@ static struct ccp_crypto_cmd *ccp_crypto_cmd_complete(
 *   Because cmds can be executed from any point in the cmd list
 *   special precautions have to be taken when handling the backlog.
 */
-   if (cpu_queue->backlog != &cpu_queue->cmds) {
+   if (req_queue.backlog != &req_queue.cmds) {
/* Skip over this cmd if it is the next backlog cmd */
-   if (cpu_queue->backlog == &crypto_cmd->entry)
-   cpu_queue->backlog = crypto_cmd->entry.next;
+   if (req_queue.backlog == &crypto_cmd->entry)
+   req_queue.backlog = crypto_cmd->entry.next;
 
-   *backlog = container_of(cpu_queue->backlog,
+   *backlog = container_of(req_queue.backlog,
struct ccp_crypto_cmd, entry);
-   cpu_queue->backlog = cpu_queue->backlog->next;
+   req_queue.backlog = req_queue.backlog->next;
 
/* Skip over this cmd if it is now the next backlog cmd */
-   if (cpu_queue->backlog == &crypto_cmd->entry)
-   cpu_queue->backlog = crypto_cmd->entry.next;
+   if (req_queue.backlog == &crypto_cmd->entry)
+   req_queue.backlog = crypto_cmd->entry.next;
}
 
/* Remove the cmd entry from the list of cmds */
-   cpu_queue->cmd_count--;
+   req_queue.cmd_count--;
list_del(&crypto_cmd->entry);
 
+   spin_unlock_irqrestore(&req_queue_lock, flags);
+
return held;
 }
 
-static void ccp_crypto_complete_on_cpu(struct work_struct *work)
+static void ccp_crypto_complete(void *data, int err)
 {
-   struct ccp_crypto_cpu *cpu_work =
-   container_of(work, struct ccp_crypto_cpu, work);
-   struct ccp_crypto_cmd *crypto_cmd = cpu_work->crypto_cmd;
+   struct ccp_crypto_cmd *crypto_cmd = data;
struct ccp_crypto_cmd *held, *next, *backlog;

[PATCH 1/4] crypto: ccp - Allow for selective disablement of crypto API algorithms

2014-01-24 Thread Tom Lendacky
Introduce module parameters that allow for disabling of a
crypto algorithm by not registering the algorithm with the
crypto API.
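
As a rough illustration of the gating this patch adds, the sketch below models it in plain user-space C. It is an assumption-laden stand-in, not the module code: struct alg_params, the FAMILY_* flags and register_families() are hypothetical names, and the real code registers algorithms with the crypto API instead of setting bits.

```c
/* Hypothetical stand-ins for the aes_disable/sha_disable module parameters. */
struct alg_params {
    unsigned int aes_disable;   /* any non-zero value disables AES */
    unsigned int sha_disable;   /* any non-zero value disables SHA */
};

/* Hypothetical flags recording which algorithm families were registered. */
enum {
    FAMILY_AES = 1 << 0,
    FAMILY_SHA = 1 << 1
};

/* Register only the families that have not been disabled. */
static unsigned int register_families(const struct alg_params *p)
{
    unsigned int registered = 0;

    if (!p->aes_disable)
        registered |= FAMILY_AES;
    if (!p->sha_disable)
        registered |= FAMILY_SHA;

    return registered;
}
```

Since the parameters are declared with mode 0444 they are read-only in sysfs, so the choice is made once at module load time.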

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-main.c |   37 +++---
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-main.c 
b/drivers/crypto/ccp/ccp-crypto-main.c
index 2636f04..b3f22b0 100644
--- a/drivers/crypto/ccp/ccp-crypto-main.c
+++ b/drivers/crypto/ccp/ccp-crypto-main.c
@@ -11,6 +11,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -24,6 +25,14 @@ MODULE_LICENSE("GPL");
 MODULE_VERSION("1.0.0");
 MODULE_DESCRIPTION("AMD Cryptographic Coprocessor crypto API support");
 
+static unsigned int aes_disable;
+module_param(aes_disable, uint, 0444);
+MODULE_PARM_DESC(aes_disable, "Disable use of AES - any non-zero value");
+
+static unsigned int sha_disable;
+module_param(sha_disable, uint, 0444);
+MODULE_PARM_DESC(sha_disable, "Disable use of SHA - any non-zero value");
+
 
 /* List heads for the supported algorithms */
 static LIST_HEAD(hash_algs);
@@ -337,21 +346,25 @@ static int ccp_register_algs(void)
 {
int ret;
 
-   ret = ccp_register_aes_algs(&cipher_algs);
-   if (ret)
-   return ret;
+   if (!aes_disable) {
+   ret = ccp_register_aes_algs(&cipher_algs);
+   if (ret)
+   return ret;
 
-   ret = ccp_register_aes_cmac_algs(&hash_algs);
-   if (ret)
-   return ret;
+   ret = ccp_register_aes_cmac_algs(&hash_algs);
+   if (ret)
+   return ret;
 
-   ret = ccp_register_aes_xts_algs(&cipher_algs);
-   if (ret)
-   return ret;
+   ret = ccp_register_aes_xts_algs(&cipher_algs);
+   if (ret)
+   return ret;
+   }
 
-   ret = ccp_register_sha_algs(&hash_algs);
-   if (ret)
-   return ret;
+   if (!sha_disable) {
+   ret = ccp_register_sha_algs(&hash_algs);
+   if (ret)
+   return ret;
+   }
 
return 0;
 }




[PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes

2014-01-24 Thread Tom Lendacky
Patch 1: Allow for selectively disabling the registration of an algorithm
family (sha or aes algorithms) via module parameters.

Patch 2-4: Fix errors/issues that were found during IPSec testing. In
order to prevent deadlocks with the networking code, the crypto callback
was changed to run as a tasklet.  In order for the callback to be run as
a tasklet, the HMAC calculation needed to be moved out of the callback
path (since it sleeps) and into the CCP sha operation logic.  Additionally,
trying to allow concurrency within the tfm while maintaining serialization
of the tfm per CPU was not working properly so a single queue is used now.

This patch series is based on the cryptodev-2.6 kernel tree.

---

Tom Lendacky (4):
  crypto: ccp - Allow for selective disablement of crypto API algorithms
  crypto: ccp - Move HMAC calculation down to ccp ops file
  crypto: ccp - Use a single queue for proper ordering of tfm requests
  crypto: ccp - Perform completion callbacks using a tasklet


 drivers/crypto/ccp/ccp-crypto-main.c |  201 --
 drivers/crypto/ccp/ccp-crypto-sha.c  |  130 --
 drivers/crypto/ccp/ccp-crypto.h  |8 +
 drivers/crypto/ccp/ccp-dev.c |   21 +++-
 drivers/crypto/ccp/ccp-ops.c |  104 +-
 include/linux/ccp.h  |7 +
 6 files changed, 229 insertions(+), 242 deletions(-)

-- 
Tom Lendacky



[Update][RFC/RFT][PATCH v2 5/6] ACPI / LPSS: Support for device latency tolerance PM QoS

2014-01-24 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 
Subject: ACPI / LPSS: Support for device latency tolerance PM QoS

Add a new routine, acpi_lpss_set_ltr(), for setting latency tolerance
values for LPSS devices having LTR (Latency Tolerance Reporting)
registers.  Add .bind()/.unbind() callbacks to lpss_handler to set
the LPSS devices' power.set_latency_tolerance callback pointers to
acpi_lpss_set_ltr() during device addition and to clear them on
device removal, respectively.

That will cause the device latency tolerance PM QoS to work for
the devices in question as documented.
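
To make the register encoding easier to check, here is a small user-space sketch of the value computation that acpi_lpss_set_ltr() performs in the diff of this message for a non-negative tolerance. The constants are copied from the patch; the helper name ltr_snoop_bits() and the LATENCY_ANY macro are hypothetical, and the bits preserved from the current register contents as well as the negative ("switch to auto mode") case are deliberately left out.

```c
#include <stdint.h>

/* Constants copied from the diff in this message. */
#define LTR_SNOOP_REQ        (1u << 15)
#define LTR_SNOOP_LAT_1US    0x800u
#define LTR_SNOOP_LAT_32US   0xC00u
#define LTR_SNOOP_LAT_SHIFT  5
#define LTR_MAX_VAL          0x3FF

/* Stands in for PM_QOS_LATENCY_ANY (maximum s32 value). */
#define LATENCY_ANY          INT32_MAX

/*
 * Hypothetical helper: returns only the bits that are ORed into the
 * software LTR register for a tolerance of val microseconds (val >= 0).
 */
static uint32_t ltr_snoop_bits(int32_t val)
{
    uint32_t bits;

    if (val == LATENCY_ANY) {
        /* "No requirement": largest value at the 32 us scale. */
        bits = LTR_SNOOP_LAT_32US;
        val = LTR_MAX_VAL;
    } else if (val > LTR_MAX_VAL) {
        /* Too large for the 1 us scale: express it in 32 us units. */
        bits = LTR_SNOOP_LAT_32US | LTR_SNOOP_REQ;
        val >>= LTR_SNOOP_LAT_SHIFT;
        if (val > LTR_MAX_VAL)
            val = LTR_MAX_VAL;
    } else {
        /* Fits the 10-bit value field at the 1 us scale. */
        bits = LTR_SNOOP_LAT_1US;
    }

    return bits | (uint32_t)val;
}
```

For example, a tolerance of 2048 us does not fit the 10-bit field at the 1 us scale, so it is stored as 64 units of 32 us.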

This changeset includes a fix from Mika Westerberg.

Signed-off-by: Rafael J. Wysocki 
---

This one needs to be updated to handle PM_QOS_LATENCY_ANY to follow the [3/6]
update I've just sent.

Thanks,
Rafael

---
 drivers/acpi/acpi_lpss.c |   73 ++-
 1 file changed, 72 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/acpi/acpi_lpss.c
===
--- linux-pm.orig/drivers/acpi/acpi_lpss.c
+++ linux-pm/drivers/acpi/acpi_lpss.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "internal.h"
@@ -33,6 +34,12 @@ ACPI_MODULE_NAME("acpi_lpss");
 #define LPSS_GENERAL_UART_RTS_OVRD BIT(3)
 #define LPSS_SW_LTR0x10
 #define LPSS_AUTO_LTR  0x14
+#define LPSS_LTR_SNOOP_REQ BIT(15)
+#define LPSS_LTR_SNOOP_MASK0x
+#define LPSS_LTR_SNOOP_LAT_1US 0x800
+#define LPSS_LTR_SNOOP_LAT_32US0xC00
+#define LPSS_LTR_SNOOP_LAT_SHIFT   5
+#define LPSS_LTR_MAX_VAL   0x3FF
 #define LPSS_TX_INT0x20
 #define LPSS_TX_INT_MASK   BIT(1)
 
@@ -316,6 +323,17 @@ static int acpi_lpss_create_device(struc
return ret;
 }
 
+static u32 __lpss_reg_read(struct lpss_private_data *pdata, unsigned int reg)
+{
+   return readl(pdata->mmio_base + pdata->dev_desc->prv_offset + reg);
+}
+
+static void __lpss_reg_write(u32 val, struct lpss_private_data *pdata,
+unsigned int reg)
+{
+   writel(val, pdata->mmio_base + pdata->dev_desc->prv_offset + reg);
+}
+
 static int lpss_reg_read(struct device *dev, unsigned int reg, u32 *val)
 {
struct acpi_device *adev;
@@ -337,7 +355,7 @@ static int lpss_reg_read(struct device *
ret = -ENODEV;
goto out;
}
-   *val = readl(pdata->mmio_base + pdata->dev_desc->prv_offset + reg);
+   *val = __lpss_reg_read(pdata, reg);
 
  out:
spin_unlock_irqrestore(&dev->power.lock, flags);
@@ -390,6 +408,39 @@ static struct attribute_group lpss_attr_
.name = "lpss_ltr",
 };
 
+static void acpi_lpss_set_ltr(struct device *dev, s32 val)
+{
+   struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
+   u32 ltr_mode, ltr_val;
+
+   ltr_mode = __lpss_reg_read(pdata, LPSS_GENERAL);
+   if (val < 0) {
+   if (ltr_mode & LPSS_GENERAL_LTR_MODE_SW) {
+   ltr_mode &= ~LPSS_GENERAL_LTR_MODE_SW;
+   __lpss_reg_write(ltr_mode, pdata, LPSS_GENERAL);
+   }
+   return;
+   }
+   ltr_val = __lpss_reg_read(pdata, LPSS_SW_LTR) & ~LPSS_LTR_SNOOP_MASK;
+   if (val == PM_QOS_LATENCY_ANY) {
+   ltr_val |= LPSS_LTR_SNOOP_LAT_32US;
+   val = LPSS_LTR_MAX_VAL;
+   } else if (val > LPSS_LTR_MAX_VAL) {
+   ltr_val |= LPSS_LTR_SNOOP_LAT_32US | LPSS_LTR_SNOOP_REQ;
+   val >>= LPSS_LTR_SNOOP_LAT_SHIFT;
+   if (val > LPSS_LTR_MAX_VAL)
+   val = LPSS_LTR_MAX_VAL;
+   } else {
+   ltr_val |= LPSS_LTR_SNOOP_LAT_1US;
+   }
+   ltr_val |= val;
+   __lpss_reg_write(ltr_val, pdata, LPSS_SW_LTR);
+   if (!(ltr_mode & LPSS_GENERAL_LTR_MODE_SW)) {
+   ltr_mode |= LPSS_GENERAL_LTR_MODE_SW;
+   __lpss_reg_write(ltr_mode, pdata, LPSS_GENERAL);
+   }
+}
+
 static int acpi_lpss_platform_notify(struct notifier_block *nb,
 unsigned long action, void *data)
 {
@@ -427,9 +478,29 @@ static struct notifier_block acpi_lpss_n
.notifier_call = acpi_lpss_platform_notify,
 };
 
+static void acpi_lpss_bind(struct device *dev)
+{
+   struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
+
+   if (!pdata || !pdata->mmio_base)
+   return;
+
+   if (pdata->mmio_size >= pdata->dev_desc->prv_offset + LPSS_LTR_SIZE)
+   dev->power.set_latency_tolerance = acpi_lpss_set_ltr;
+   else
+   dev_err(dev, "MMIO size insufficient to access LTR\n");
+}
+
+static void acpi_lpss_unbind(struct device *dev)
+{
+   dev->power.set_latency_tolerance = NULL;
+}
+
 static struct acpi_scan_handler lpss_handler = {
.ids = acpi_lpss_device_ids,
.attach = acpi_lpss_create_device,
+   .bind = 

[Update][RFC/RFT][PATCH v2 3/6] PM / QoS: Introduce latency tolerance device PM QoS type

2014-01-24 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Add a new latency tolerance device PM QoS type to be used for
specifying active state (RPM_ACTIVE) memory access (DMA) latency
tolerance requirements for devices.  It may be used to prevent
hardware from choosing overly aggressive energy-saving operation
modes (causing too much latency to appear) for the whole platform.

This feature requires hardware support, so it will only be
available for devices having a new .set_latency_tolerance()
callback in struct dev_pm_info populated, in which case the
routine pointed to by it should implement whatever is necessary
to transfer the effective requirement value to the hardware.

Whenever the effective latency tolerance changes for the device,
its .set_latency_tolerance() callback will be executed and the
effective value will be passed to it.  If that value is negative,
which means that the list of latency tolerance requirements for
the device is empty, the callback is expected to switch the
underlying hardware latency tolerance control mechanism to an
autonomous mode if available.  If that value is PM_QOS_LATENCY_ANY,
in turn, and the hardware supports a special "no requirement"
setting, the callback is expected to use it.  That allows software
to prevent the hardware from automatically updating the device's
latency tolerance in response to its power state changes (e.g. during
transitions from D3cold to D0), which generally may be done in the
autonomous latency tolerance control mode.

If .set_latency_tolerance() is present for the device, a new
pm_qos_latency_tolerance_us attribute will be present in the
device's power directory in sysfs.  Then, user space can use
that attribute to specify its latency tolerance requirement for
the device, if any.  Writing "any" to it means "no requirement, but
do not let the hardware control latency tolerance" and writing
"none" to it allows the hardware to be switched to the autonomous
mode if there are no other requirements from the kernel side in the
device's list.
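
The two special strings can be illustrated with a user-space parser. This is only a sketch of the mapping described above, not the sysfs store routine from the patch: parse_latency_tolerance() is a hypothetical name, the macros mirror the values added to pm_qos.h, and trailing newlines (which a real sysfs write usually carries) are not handled.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirror PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT and PM_QOS_LATENCY_ANY. */
#define LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
#define LATENCY_ANY                     INT32_MAX

/*
 * Hypothetical parser for the attribute's input:
 *   "any"  -> no requirement, but keep the hardware out of autonomous mode
 *   "none" -> drop the user space request (autonomous mode allowed)
 *   number -> latency tolerance in microseconds, must be non-negative
 * Returns 0 on success and -EINVAL on malformed input.
 */
static int parse_latency_tolerance(const char *buf, int32_t *out)
{
    char *end;
    long v;

    if (strcmp(buf, "any") == 0) {
        *out = LATENCY_ANY;
        return 0;
    }
    if (strcmp(buf, "none") == 0) {
        *out = LATENCY_TOLERANCE_NO_CONSTRAINT;
        return 0;
    }

    errno = 0;
    v = strtol(buf, &end, 10);
    if (errno || end == buf || *end != '\0' || v < 0 || v > INT32_MAX)
        return -EINVAL;

    *out = (int32_t)v;
    return 0;
}
```

Representing both special cases as strings keeps the numeric range free for real requirements, so neither -1 nor the maximum s32 value has to be typed by the user.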

This changeset includes a fix from Mika Westerberg.

Signed-off-by: Rafael J. Wysocki 
---

In the previous version 0 was a special "no requirement" value, but of course
the plist code doesn't treat 0 in any special way, so if one of the requirements
in the list was set to 0, it would end up as the effective value and would
enforce the "no requirement" setting on everybody (even if someone had a
specific requirement with nonzero value).

For this reason, the update introduces a special value PM_QOS_LATENCY_ANY to
be used for that, which is just the maximum value of s32, so it should always
be at the end of the plist.  Still, the sysfs attribute now has two special
cases to handle, so I think it's better to represent them both as strings.
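
The problem with 0 and the reason the maximum s32 value works can be seen in a toy model of the aggregation. This sketch assumes the effective value is simply the minimum of all requests, which is what the description above implies; effective_tolerance() and the macros are hypothetical names, and the real code keeps the requests in a plist instead of an array.

```c
#include <stddef.h>
#include <stdint.h>

#define LATENCY_ANY     INT32_MAX  /* stands in for PM_QOS_LATENCY_ANY */
#define NO_CONSTRAINT   (-1)       /* reported when the list is empty */

/* Effective tolerance: the smallest request, or NO_CONSTRAINT if none. */
static int32_t effective_tolerance(const int32_t *reqs, size_t n)
{
    int32_t min;
    size_t i;

    if (n == 0)
        return NO_CONSTRAINT;

    min = reqs[0];
    for (i = 1; i < n; i++)
        if (reqs[i] < min)
            min = reqs[i];

    return min;
}
```

With requests {0, 50} the result is 0, so a 0 meaning "no requirement" would override the real 50 us request. With {LATENCY_ANY, 50} the result is 50, and LATENCY_ANY only becomes effective when nobody asks for anything stricter.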

The documentation has been updated to follow the code changes.

Thanks,
Rafael

---
 Documentation/ABI/testing/sysfs-devices-power |   27 
 Documentation/power/pm_qos_interface.txt  |   61 +--
 drivers/base/power/qos.c  |  144 ++
 drivers/base/power/sysfs.c|   65 ++-
 include/linux/pm.h|1 
 include/linux/pm_qos.h|   12 ++
 kernel/power/qos.c|   13 +-
 7 files changed, 278 insertions(+), 45 deletions(-)

Index: linux-pm/include/linux/pm_qos.h
===
--- linux-pm.orig/include/linux/pm_qos.h
+++ linux-pm/include/linux/pm_qos.h
@@ -33,6 +33,9 @@ enum pm_qos_flags_status {
 #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE   (2000 * USEC_PER_SEC)
 #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE0
 #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE0
+#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0
+#define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
+#define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
 
 #define PM_QOS_FLAG_NO_POWER_OFF   (1 << 0)
 #define PM_QOS_FLAG_REMOTE_WAKEUP  (1 << 1)
@@ -50,6 +53,7 @@ struct pm_qos_flags_request {
 
 enum dev_pm_qos_req_type {
DEV_PM_QOS_RESUME_LATENCY = 1,
+   DEV_PM_QOS_LATENCY_TOLERANCE,
DEV_PM_QOS_FLAGS,
 };
 
@@ -89,8 +93,10 @@ struct pm_qos_flags {
 
 struct dev_pm_qos {
struct pm_qos_constraints resume_latency;
+   struct pm_qos_constraints latency_tolerance;
struct pm_qos_flags flags;
struct dev_pm_qos_request *resume_latency_req;
+   struct dev_pm_qos_request *latency_tolerance_req;
struct dev_pm_qos_request *flags_req;
 };
 
@@ -196,6 +202,8 @@ void dev_pm_qos_hide_latency_limit(struc
 int dev_pm_qos_expose_flags(struct device *dev, s32 value);
 void dev_pm_qos_hide_flags(struct device *dev);
 int dev_pm_qos_update_flags(struct device *dev, s32 mask, bool set);
+s32 dev_pm_qos_get_user_latency_tolerance(struct device *dev);
+int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val);
 
 static inline s32 

Re: [PATCH] arm64: fix build error if DMA_CMA is enabled

2014-01-24 Thread Laura Abbott

On 1/24/2014 7:37 AM, Catalin Marinas wrote:

On Fri, Jan 24, 2014 at 08:23:08AM +, Pankaj Dubey wrote:

arm64/include/asm/dma-contiguous.h is trying to include
<asm-generic/dma-contiguous.h> which does not exist, and thus failing
build for arm64 if we enable CONFIG_DMA_CMA. This patch fixes build
error by removing unwanted header inclusion from arm64's dma-contiguous.h.

Signed-off-by: Pankaj Dubey 
Signed-off-by: Somraj Mani 
---
  arch/arm64/include/asm/dma-contiguous.h |1 -
  1 file changed, 1 deletion(-)

diff --git a/arch/arm64/include/asm/dma-contiguous.h 
b/arch/arm64/include/asm/dma-contiguous.h
index d6aacb6..14c4c0c 100644
--- a/arch/arm64/include/asm/dma-contiguous.h
+++ b/arch/arm64/include/asm/dma-contiguous.h
@@ -18,7 +18,6 @@
  #ifdef CONFIG_DMA_CMA

  #include 
-#include <asm-generic/dma-contiguous.h>


Thanks for this.

Laura, did you have additional patches adding
asm-generic/dma-contiguous.h?



no, asm-generic/dma-contiguous.h was an old file which was later 
removed. I missed this when rebasing from my older branch to mainline.

You can have

Acked-by: Laura Abbott 

Thanks,
Laura

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH 1/4] mtd: block2mtd: char mtd major check

2014-01-24 Thread Ezequiel Garcia
On Thu, Jan 23, 2014 at 08:49:47PM +0100, Fabian Frederick wrote:
> Deny use of a char mtd device to map as a block device.
> 
> Signed-off-by: Fabian Frederick 
> ---
>  drivers/mtd/devices/block2mtd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
> index d9fd87a..0efee5b 100644
> --- a/drivers/mtd/devices/block2mtd.c
> +++ b/drivers/mtd/devices/block2mtd.c
> @@ -244,7 +244,8 @@ static struct block2mtd_dev *add_device(char *devname, 
> int erase_size)
>   }
>   dev->blkdev = bdev;
>  
> - if (MAJOR(bdev->bd_dev) == MTD_BLOCK_MAJOR) {
> + if ((MAJOR(bdev->bd_dev) == MTD_BLOCK_MAJOR) ||
> + (MAJOR(bdev->bd_dev) == MTD_CHAR_MAJOR)) {
>   pr_err("attempting to use an MTD device as a block device\n");
>   goto devinit_err;
>   }
> -- 
> 1.8.1.4

Now that the changes are separated on a per-patch basis they're much
much easier to review. Thanks!

Regarding this change, it seems to me it's not needed. Are you sure
you can attach "block2mtd" to a char MTD device?

That would be odd, given this function calls blkdev_get_by_{path/dev};
which will check the device is of block type in the first place.
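
For comparison, the same kind of type check can be done from user space before a path is ever handed to the driver. The snippet is a sketch of that idea only and does not show what blkdev_get_by_path() does internally; is_block_device() is a hypothetical helper built on stat(2).

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if path names a block device node,
 * 0 if it names anything else (char device, directory, regular file),
 * and -1 if the path cannot be examined.
 */
static int is_block_device(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    return S_ISBLK(st.st_mode) ? 1 : 0;
}
```

A char MTD node would yield 0 here, which is the user-space analogue of the point made above: a char device should already be rejected on type before any major-number comparison is reached.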

What's your motivation for this change?
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com


[patch 2/2] mm: page-writeback: do not count anon pages as dirtyable memory

2014-01-24 Thread Johannes Weiner
The VM is currently heavily tuned to avoid swapping.  Whether that is
good or bad is a separate discussion, but as long as the VM won't swap
to make room for dirty cache, we can not consider anonymous pages when
calculating the amount of dirtyable memory, the baseline to which
dirty_background_ratio and dirty_ratio are applied.

A simple workload that occupies a significant size (40+%, depending on
memory layout, storage speeds etc.) of memory with anon/tmpfs pages
and uses the remainder for a streaming writer demonstrates this
problem.  In that case, the actual cache pages are a small fraction of
what is considered dirtyable overall, which results in a relatively
large portion of the cache pages to be dirtied.  As kswapd starts
rotating these, random tasks enter direct reclaim and stall on IO.

Only consider free pages and file pages dirtyable.
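
The effect can be put into numbers with a small model. This is a sketch under simplifying assumptions, not the kernel code: struct page_counts and the three helpers are hypothetical, the counts are plain arguments instead of vmstat counters, and highmem handling is ignored.

```c
/* Hypothetical snapshot of the page counters involved. */
struct page_counts {
    unsigned long free;
    unsigned long active_file, inactive_file;
    unsigned long active_anon, inactive_anon;
};

/* Free pages minus the reserve, without underflow. */
static unsigned long sub_reserve(unsigned long free, unsigned long reserve)
{
    return free - (free < reserve ? free : reserve);
}

/* Before this patch: anon pages count as dirtyable when swap exists. */
static unsigned long dirtyable_before(const struct page_counts *c,
                                      unsigned long reserve, int have_swap)
{
    unsigned long x = sub_reserve(c->free, reserve);

    x += c->active_file + c->inactive_file;
    if (have_swap)
        x += c->active_anon + c->inactive_anon;
    return x;
}

/* After this patch: only free pages and file pages are dirtyable. */
static unsigned long dirtyable_after(const struct page_counts *c,
                                     unsigned long reserve)
{
    return sub_reserve(c->free, reserve) +
           c->active_file + c->inactive_file;
}

/* Dirty limit in pages for a ratio given in percent. */
static unsigned long dirty_limit(unsigned long dirtyable,
                                 unsigned int ratio_percent)
{
    return dirtyable * ratio_percent / 100;
}
```

Take 1000 free pages, 1000 file pages and 8000 anon pages with a ratio of 20%. The old calculation allows 2000 dirty pages, which is more than the entire file cache; the new one allows 400.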

Signed-off-by: Johannes Weiner 
---
 include/linux/vmstat.h |  2 --
 mm/internal.h  |  1 -
 mm/page-writeback.c|  6 --
 mm/vmscan.c| 23 +--
 4 files changed, 5 insertions(+), 27 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index e4b948080d20..a67b38415768 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -142,8 +142,6 @@ static inline unsigned long zone_page_state_snapshot(struct 
zone *zone,
return x;
 }
 
-extern unsigned long global_reclaimable_pages(void);
-
 #ifdef CONFIG_NUMA
 /*
  * Determine the per node value of a stat item. This function
diff --git a/mm/internal.h b/mm/internal.h
index 684f7aa9692a..8b6cfd63b5a5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -85,7 +85,6 @@ extern unsigned long highest_memmap_pfn;
  */
 extern int isolate_lru_page(struct page *page);
 extern void putback_lru_page(struct page *page);
-extern unsigned long zone_reclaimable_pages(struct zone *zone);
 extern bool zone_reclaimable(struct zone *zone);
 
 /*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 79cf52b058a7..29e129478644 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -205,7 +205,8 @@ static unsigned long zone_dirtyable_memory(struct zone 
*zone)
nr_pages = zone_page_state(zone, NR_FREE_PAGES);
nr_pages -= min(nr_pages, zone->dirty_balance_reserve);
 
-   nr_pages += zone_reclaimable_pages(zone);
+   nr_pages += zone_page_state(zone, NR_INACTIVE_FILE);
+   nr_pages += zone_page_state(zone, NR_ACTIVE_FILE);
 
return nr_pages;
 }
@@ -259,7 +260,8 @@ static unsigned long global_dirtyable_memory(void)
x = global_page_state(NR_FREE_PAGES);
x -= min(x, dirty_balance_reserve);
 
-   x += global_reclaimable_pages();
+   x += global_page_state(NR_INACTIVE_FILE);
+   x += global_page_state(NR_ACTIVE_FILE);
 
if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eea668d9cff6..05e6095159dc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -147,7 +147,7 @@ static bool global_reclaim(struct scan_control *sc)
 }
 #endif
 
-unsigned long zone_reclaimable_pages(struct zone *zone)
+static unsigned long zone_reclaimable_pages(struct zone *zone)
 {
int nr;
 
@@ -3297,27 +3297,6 @@ void wakeup_kswapd(struct zone *zone, int order, enum 
zone_type classzone_idx)
wake_up_interruptible(&pgdat->kswapd_wait);
 }
 
-/*
- * The reclaimable count would be mostly accurate.
- * The less reclaimable pages may be
- * - mlocked pages, which will be moved to unevictable list when encountered
- * - mapped pages, which may require several travels to be reclaimed
- * - dirty pages, which is not "instantly" reclaimable
- */
-unsigned long global_reclaimable_pages(void)
-{
-   int nr;
-
-   nr = global_page_state(NR_ACTIVE_FILE) +
-global_page_state(NR_INACTIVE_FILE);
-
-   if (get_nr_swap_pages() > 0)
-   nr += global_page_state(NR_ACTIVE_ANON) +
- global_page_state(NR_INACTIVE_ANON);
-
-   return nr;
-}
-
 #ifdef CONFIG_HIBERNATION
 /*
  * Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
-- 
1.8.4.2



[patch 1/2] mm: page-writeback: fix dirty_balance_reserve subtraction from dirtyable memory

2014-01-24 Thread Johannes Weiner
The dirty_balance_reserve is an approximation of the fraction of free
pages that the page allocator does not make available for page cache
allocations.  As a result, it has to be taken into account when
calculating the amount of "dirtyable memory", the baseline to which
dirty_background_ratio and dirty_ratio are applied.

However, currently the reserve is subtracted from the sum of free and
reclaimable pages, which is non-sensical and leads to erroneous
results when the system is dominated by unreclaimable pages and the
dirty_balance_reserve is bigger than free+reclaimable.  In that case,
at least the already allocated cache should be considered dirtyable.

Fix the calculation by subtracting the reserve from the amount of free
pages, then adding the reclaimable pages on top.
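
A worked example makes the difference visible. The two helpers below are a user-space sketch of the global calculation before and after this patch, with the page counts passed in as plain numbers; dirtyable_old() and dirtyable_new() are hypothetical names.

```c
/* Old order: subtract the reserve from free + reclaimable. */
static unsigned long dirtyable_old(unsigned long free_pages,
                                   unsigned long reclaimable,
                                   unsigned long reserve)
{
    unsigned long x = free_pages + reclaimable;

    x -= (x < reserve) ? x : reserve;   /* min(x, reserve) */
    return x;
}

/* New order: subtract the reserve from free pages only, then add cache. */
static unsigned long dirtyable_new(unsigned long free_pages,
                                   unsigned long reclaimable,
                                   unsigned long reserve)
{
    unsigned long x = free_pages;

    x -= (x < reserve) ? x : reserve;   /* min(x, reserve) */
    x += reclaimable;
    return x;
}
```

With 100 free pages, 50 reclaimable pages and a reserve of 200, the old order yields 0 dirtyable pages while the new order yields 50, so the already allocated cache remains dirtyable. When the reserve is smaller than the number of free pages, both orders give the same result.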

Signed-off-by: Johannes Weiner 
---
 mm/page-writeback.c | 52 +++-
 1 file changed, 23 insertions(+), 29 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 63807583d8e8..79cf52b058a7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -191,6 +191,25 @@ static unsigned long writeout_period_time = 0;
  * global dirtyable memory first.
  */
 
+/**
+ * zone_dirtyable_memory - number of dirtyable pages in a zone
+ * @zone: the zone
+ *
+ * Returns the zone's number of pages potentially available for dirty
+ * page cache.  This is the base value for the per-zone dirty limits.
+ */
+static unsigned long zone_dirtyable_memory(struct zone *zone)
+{
+   unsigned long nr_pages;
+
+   nr_pages = zone_page_state(zone, NR_FREE_PAGES);
+   nr_pages -= min(nr_pages, zone->dirty_balance_reserve);
+
+   nr_pages += zone_reclaimable_pages(zone);
+
+   return nr_pages;
+}
+
 static unsigned long highmem_dirtyable_memory(unsigned long total)
 {
 #ifdef CONFIG_HIGHMEM
@@ -201,8 +220,7 @@ static unsigned long highmem_dirtyable_memory(unsigned long 
total)
struct zone *z =
&NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
 
-   x += zone_page_state(z, NR_FREE_PAGES) +
-zone_reclaimable_pages(z) - z->dirty_balance_reserve;
+   x += zone_dirtyable_memory(z);
}
/*
 * Unreclaimable memory (kernel memory or anonymous memory
@@ -238,9 +256,11 @@ static unsigned long global_dirtyable_memory(void)
 {
unsigned long x;
 
-   x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+   x = global_page_state(NR_FREE_PAGES);
x -= min(x, dirty_balance_reserve);
 
+   x += global_reclaimable_pages();
+
if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
 
@@ -289,32 +309,6 @@ void global_dirty_limits(unsigned long *pbackground, 
unsigned long *pdirty)
 }
 
 /**
- * zone_dirtyable_memory - number of dirtyable pages in a zone
- * @zone: the zone
- *
- * Returns the zone's number of pages potentially available for dirty
- * page cache.  This is the base value for the per-zone dirty limits.
- */
-static unsigned long zone_dirtyable_memory(struct zone *zone)
-{
-   /*
-* The effective global number of dirtyable pages may exclude
-* highmem as a big-picture measure to keep the ratio between
-* dirty memory and lowmem reasonable.
-*
-* But this function is purely about the individual zone and a
-* highmem zone can hold its share of dirty pages, so we don't
-* care about vm_highmem_is_dirtyable here.
-*/
-   unsigned long nr_pages = zone_page_state(zone, NR_FREE_PAGES) +
-   zone_reclaimable_pages(zone);
-
-   /* don't allow this to underflow */
-   nr_pages -= min(nr_pages, zone->dirty_balance_reserve);
-   return nr_pages;
-}
-
-/**
  * zone_dirty_limit - maximum number of dirty pages allowed in a zone
  * @zone: the zone
  *
-- 
1.8.4.2



[patch 0/2] mm: reduce reclaim stalls with heavy anon and dirty cache

2014-01-24 Thread Johannes Weiner
Tejun reported stuttering and latency spikes on a system where random
tasks would enter direct reclaim and get stuck on dirty pages.  Around
50% of memory was occupied by tmpfs backed by an SSD, and another disk
(rotating) was reading and writing at max speed to shrink a partition.

Analysis:

When calculating the amount of dirtyable memory, the VM considers all
free memory and all file and anon pages as baseline to which to apply
dirty limits.  This implies that, given memory pressure from dirtied
cache, the VM would actually start swapping to make room.  But alas,
this is not really the case and page reclaim tries very hard not to
swap as long as there is any used-once cache available.  The dirty
limit may have been 10-15% of main memory, but page cache was less
than 50% of that, which means that a third of the pages that the
reclaimers actually looked at were dirty.  Kswapd stopped making
progress, and in turn allocators were forced into direct reclaim only
to get stuck on dirty/writeback congestion.

These two patches fix the dirtyable memory calculation to acknowledge
the fact that the VM does not really replace anon with dirty cache.
As such, anon memory can no longer be considered "dirtyable."

Longer term we probably want to look into reducing some of the bias
towards cache.  The problematic workload in particular was not even
using any of the anon pages, one swap burst could have resolved it.



[PATCH v2] Fix overflow when HZ is smaller than 60

2014-01-24 Thread Mikulas Patocka
The patch that I sent before had wrong numbers in it; it could result in a
"Clocksource jiffies might overflow on 11% adjustment" message. This is
the patch with correct numbers.

> +#if HZ < 30
> +#define JIFFIES_SHIFT 6
> +#elif HZ < 60
> +#define JIFFIES_SHIFT 7
> +#else
>  #define JIFFIES_SHIFT 8
> +#endif

From: Mikulas Patocka 

Fix overflow when HZ is 32

When compiling for the IA-64 ski emulator, HZ is set to 32 because the
emulation is slow and we don't want to waste too many cycles processing
timers. Alpha also has an option to set HZ to 32.

This causes integer overflow in kernel/time/jiffies.c:
kernel/time/jiffies.c:66:2: warning: large integer implicitly truncated to unsigned type [-Woverflow]
  .mult  = NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
  ^

This patch reduces the JIFFIES_SHIFT value to avoid the overflow.

Signed-off-by: Mikulas Patocka 
Cc: sta...@vger.kernel.org

---
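
For illustration only (not part of the patch): a small userspace sketch
for checking the thresholds. It assumes NSEC_PER_JIFFY is
(NSEC_PER_SEC + HZ/2) / HZ, that .mult is a 32-bit field, and that the
"11% adjustment" check amounts to mult + mult*11/100 having to fit in
32 bits. The function names are hypothetical.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Shift selection as in the patch below. */
static unsigned int jiffies_shift(unsigned int hz)
{
	if (hz < 34)
		return 6;
	if (hz < 67)
		return 7;
	return 8;
}

/*
 * Returns 1 if (NSEC_PER_JIFFY << shift) plus an 11% margin still fits
 * in a 32-bit unsigned value, 0 otherwise. The margin model is an
 * assumption made for this sketch.
 */
static int mult_fits(unsigned int hz, unsigned int shift)
{
	uint64_t nsec_per_jiffy = (NSEC_PER_SEC + hz / 2) / hz;
	uint64_t mult = nsec_per_jiffy << shift;
	uint64_t maxadj = mult * 11 / 100;

	return mult + maxadj <= UINT32_MAX;
}
```

Under these assumptions, HZ=32 does not fit with a shift of 8 but fits
with the shift of 6 chosen by the patch, and the boundaries at 34 and
67 are the smallest HZ values that fit with shifts of 7 and 8.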
 kernel/time/jiffies.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-2.6-ia64/kernel/time/jiffies.c
===
--- linux-2.6-ia64.orig/kernel/time/jiffies.c   2014-01-24 22:34:17.0 
+0100
+++ linux-2.6-ia64/kernel/time/jiffies.c2014-01-24 22:36:56.0 
+0100
@@ -51,7 +51,13 @@
  * HZ shrinks, so values greater than 8 overflow 32bits when
  * HZ=100.
  */
+#if HZ < 34
+#define JIFFIES_SHIFT  6
+#elif HZ < 67
+#define JIFFIES_SHIFT  7
+#else
 #define JIFFIES_SHIFT  8
+#endif
 
 static cycle_t jiffies_read(struct clocksource *cs)
 {


Re: [PATCH 1/3] irqchip: orion: clear bridge cause register on init

2014-01-24 Thread Ezequiel Garcia
On Thu, Jan 23, 2014 at 11:38:04PM +0100, Sebastian Hesselbarth wrote:
> It is good practice to mask and clear pending irqs on init. We already
> mask all irqs, so also clear the bridge irq cause register.
> 
> Signed-off-by: Sebastian Hesselbarth 
> ---
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Andrew Lunn 
> Cc: Gregory Clement 
> Cc: Jason Gunthorpe 
> Cc: Ezequiel Garcia 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/irqchip/irq-orion.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-orion.c b/drivers/irqchip/irq-orion.c
> index e51d40031884..4137c3d15284 100644
> --- a/drivers/irqchip/irq-orion.c
> +++ b/drivers/irqchip/irq-orion.c
> @@ -180,8 +180,9 @@ static int __init orion_bridge_irq_init(struct 
> device_node *np,
>   gc->chip_types[0].chip.irq_mask = irq_gc_mask_clr_bit;
>   gc->chip_types[0].chip.irq_unmask = irq_gc_mask_set_bit;
>  
> - /* mask all interrupts */
> + /* mask and clear all interrupts */
>   writel(0, gc->reg_base + ORION_BRIDGE_IRQ_MASK);
> + writel(0, gc->reg_base + ORION_BRIDGE_IRQ_CAUSE);
>  

This looks a bit bogus to me, now that we are clearing the cause upon
irq_startup(). Don't have a strong opinion, it's just that I fail to see
why we'd want or need this change...
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com


Re: [PATCH v5 8/8] ARM: brcmstb: dts: add a reference DTS for Broadcom 7445

2014-01-24 Thread Marc C
Hi Mark,

> As I commented on v3 [1], these are contiguous and can be described with
> a single entry:
>
> memory {
>   device_type = "memory";
>   reg = <0x0 0x00000000 0x0 0xc0000000>;
> };
>
> Is there any reason to have three entries?

Oopsies, sorry for missing that.

On BCM7445 and derivatives, there are 3 memory controllers. For each memory
controller, the first 1GB of physical DRAM is mapped to:

* 0x00_0000_0000
* 0x00_4000_0000
* 0x00_8000_0000

The memory controllers aren't interleaved. So, it's possible for the SoC to
have a discontiguous memory map, where a designer chooses not to populate
physical DRAM in the middle.

The 'reg' property was broken up so that each chunk of memory is given a
dedicated memblock.

All that said, if you like, I can rework the patch as you've suggested.

Thanks,
Marc C


On 01/24/2014 03:09 AM, Mark Rutland wrote:
> On Wed, Jan 22, 2014 at 03:30:52AM +, Marc Carino wrote:
>> Add a sample DTS which will allow bootup of a board populated
>> with the BCM7445 chip.
>>
>> Signed-off-by: Marc Carino 
>> Acked-by: Florian Fainelli 
>> ---
>>  arch/arm/boot/dts/bcm7445.dts |  111 
>> +
>>  1 files changed, 111 insertions(+), 0 deletions(-)
>>  create mode 100644 arch/arm/boot/dts/bcm7445.dts
>>
>> diff --git a/arch/arm/boot/dts/bcm7445.dts b/arch/arm/boot/dts/bcm7445.dts
>> new file mode 100644
>> index 000..ffa3305
>> --- /dev/null
>> +++ b/arch/arm/boot/dts/bcm7445.dts
>> @@ -0,0 +1,111 @@
>> +/dts-v1/;
>> +/include/ "skeleton.dtsi"
>> +
>> +/ {
>> +#address-cells = <2>;
>> +#size-cells = <2>;
>> +model = "Broadcom STB (bcm7445)";
>> +compatible = "brcm,bcm7445", "brcm,brcmstb";
>> +interrupt-parent = <>;
>> +
>> +chosen {};
>> +
>> +memory {
>> +device_type = "memory";
>> +reg = <0x00 0x00000000 0x00 0x40000000>,
>> +  <0x00 0x40000000 0x00 0x40000000>,
>> +  <0x00 0x80000000 0x00 0x40000000>;
>> +};
> 
> As I commented on v3 [1], these are contiguous and can be described with
> a single entry:
> 
> memory {
>   device_type = "memory";
>   reg = <0x0 0x00000000 0x0 0xc0000000>;
> };
> 
> Is there any reason to have three entries?
> 
> Thanks,
> Mark.
> 
> [1] 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-January/225899.html
> 



Re: [PATCH v4 0/3] ext4: increase mbcache scalability

2014-01-24 Thread Andi Kleen
T Makphaibulchoke  writes:

> The patch consists of three parts.
>
> The first part changes the implementation of both the block and hash chains of
> an mb_cache from list_head to hlist_bl_head and also introduces new members,
> including a spinlock to mb_cache_entry, as required by the second part.

spinlock per entry is usually overkill for larger hash tables.

Can you use a second, smaller lock table that just has locks and is
indexed by a subset of the hash key? Most likely a very small
table is good enough.
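
A rough userspace sketch of the lock-table idea, for illustration only.
It uses C11 atomic_flag spinlocks rather than kernel primitives, and
every name in it (LOCK_TABLE_BITS, lock_index(), bucket_lock() and so
on) is hypothetical; the table size of 16 is an arbitrary assumption.

```c
#include <stdatomic.h>
#include <stdint.h>

#define LOCK_TABLE_BITS 4
#define LOCK_TABLE_SIZE (1u << LOCK_TABLE_BITS)

/* One small array of locks shared by all entries of the hash table. */
static atomic_flag lock_table[LOCK_TABLE_SIZE];

static void lock_table_init(void)
{
	for (unsigned int i = 0; i < LOCK_TABLE_SIZE; i++)
		atomic_flag_clear(&lock_table[i]);
}

/* Use the low bits of the hash key to pick a lock. */
static unsigned int lock_index(uint32_t hash)
{
	return hash & (LOCK_TABLE_SIZE - 1);
}

static atomic_flag *bucket_lock(uint32_t hash)
{
	return &lock_table[lock_index(hash)];
}

static void bucket_spin_lock(uint32_t hash)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(bucket_lock(hash),
						 memory_order_acquire))
		;
}

static void bucket_spin_unlock(uint32_t hash)
{
	atomic_flag_clear_explicit(bucket_lock(hash), memory_order_release);
}
```

Entries whose hashes share the low bits share a lock, so the memory
cost is one lock per table slot instead of one per entry.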

Also, it would be good to have some data on the additional memory consumption.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC PATCH V3 4/4] MAINTAINERS: entry for APM X-Gene PCIe host driver

2014-01-24 Thread Tanmay Inamdar
Add entry for AppliedMicro X-Gene PCIe host driver.

Signed-off-by: Tanmay Inamdar 
---
 MAINTAINERS |7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5c21402..721fec7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6539,6 +6539,13 @@ L:   linux-...@vger.kernel.org
 S: Maintained
 F: drivers/pci/host/*designware*
 
+PCI DRIVER FOR APPLIEDMICRO XGENE
+M: Tanmay Inamdar 
+L: linux-...@vger.kernel.org
+L: linux-arm-ker...@lists.infradead.org
+S: Maintained
+F: drivers/pci/host/pci-xgene.c
+
 PCMCIA SUBSYSTEM
 P: Linux PCMCIA Team
 L: linux-pcm...@lists.infradead.org
-- 
1.7.9.5



[RFC PATCH V3 3/4] dt-bindings: pci: xgene pcie device tree bindings

2014-01-24 Thread Tanmay Inamdar
This patch adds the bindings for X-Gene PCIe driver. The driver resides
under 'drivers/pci/host/pci-xgene.c' file.

Signed-off-by: Tanmay Inamdar 
---
 .../devicetree/bindings/pci/xgene-pci.txt  |   52 
 1 file changed, 52 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/xgene-pci.txt

diff --git a/Documentation/devicetree/bindings/pci/xgene-pci.txt 
b/Documentation/devicetree/bindings/pci/xgene-pci.txt
new file mode 100644
index 000..60e4a54
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/xgene-pci.txt
@@ -0,0 +1,52 @@
+* AppliedMicro X-Gene PCIe interface
+
+Required properties:
+- device_type: set to "pci"
+- compatible: should contain "apm,xgene-pcie" to identify the core.
+- reg: A list of physical base address and length for each set of controller
+   registers. Must contain an entry for each entry in the reg-names
+   property.
+- reg-names: Must include the following entries:
+  "csr": controller configuration registers.
+  "cfg": pcie configuration space registers.
+- #address-cells: set to <3>
+- #size-cells: set to <2>
+- ranges: ranges for the outbound memory, I/O regions.
+- dma-ranges: ranges for the inbound memory regions.
+- #interrupt-cells: set to <1>
+- interrupt-map-mask and interrupt-map: standard PCI properties
+   to define the mapping of the PCIe interface to interrupt
+   numbers.
+- clocks: from common clock binding: handle to pci clock.
+
+Optional properties:
+- status: Either "ok" or "disabled".
+
+Example:
+
+SoC specific DT Entry:
+   pcie0: pcie@1f2b {
+   status = "disabled";
+   device_type = "pci";
+   compatible = "apm,xgene-pcie";
+   #interrupt-cells = <1>;
+   #size-cells = <2>;
+   #address-cells = <3>;
+   reg = < 0x00 0x1f2b 0x0 0x0001   /* Controller 
registers */
+   0xe0 0xd000 0x0 0x0020>; /* PCI config space */
+   reg-names = "csr", "cfg";
+   ranges = <0x0100 0x00 0x 0xe0 0x 0x00 
0x0001   /* io */
+ 0x0200 0x00 0x1000 0xe0 0x1000 0x00 
0x8000>; /* mem */
+   dma-ranges = <0x4200 0x40 0x 0x40 0x 0x40 
0x>;
+   interrupt-map-mask = <0x0 0x0 0x0 0x7>;
+   interrupt-map = <0x0 0x0 0x0 0x1  0x0 0xc2 0x1
+0x0 0x0 0x0 0x2  0x0 0xc3 0x1
+0x0 0x0 0x0 0x3  0x0 0xc4 0x1
+0x0 0x0 0x0 0x4  0x0 0xc5 0x1>;
+   clocks = < 0>;
+   };
+
+Board specific DT Entry:
+{
+   status = "ok";
+   };
-- 
1.7.9.5



[RFC PATCH V3 1/4] pci: APM X-Gene PCIe controller driver

2014-01-24 Thread Tanmay Inamdar
This patch adds the AppliedMicro X-Gene SOC PCIe controller driver.
The X-Gene PCIe controller supports up to 8 lanes and GEN3 speed.
X-Gene supports a maximum of 5 PCIe ports.

Signed-off-by: Tanmay Inamdar 
---
 drivers/pci/host/Kconfig |   10 +
 drivers/pci/host/Makefile|1 +
 drivers/pci/host/pci-xgene.c |  784 ++
 3 files changed, 795 insertions(+)
 create mode 100644 drivers/pci/host/pci-xgene.c

diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 47d46c6..19ce97d 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -33,4 +33,14 @@ config PCI_RCAR_GEN2
  There are 3 internal PCI controllers available with a single
  built-in EHCI/OHCI host controller present on each one.
 
+config PCI_XGENE
+   bool "X-Gene PCIe controller"
+   depends on ARCH_XGENE
+   depends on OF
+   select PCIEPORTBUS
+   help
+ Say Y here if you want internal PCI support on APM X-Gene SoC.
+ There are 5 internal PCIe ports available. Each port is GEN3 capable
+ and has lane widths varying from x1 to x8.
+
 endmenu
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index 13fb333..34c7c36 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_PCI_IMX6) += pci-imx6.o
 obj-$(CONFIG_PCI_MVEBU) += pci-mvebu.o
 obj-$(CONFIG_PCI_TEGRA) += pci-tegra.o
 obj-$(CONFIG_PCI_RCAR_GEN2) += pci-rcar-gen2.o
+obj-$(CONFIG_PCI_XGENE) += pci-xgene.o
diff --git a/drivers/pci/host/pci-xgene.c b/drivers/pci/host/pci-xgene.c
new file mode 100644
index 000..650a860
--- /dev/null
+++ b/drivers/pci/host/pci-xgene.c
@@ -0,0 +1,784 @@
+/**
+ * APM X-Gene PCIe Driver
+ *
+ * Copyright (c) 2013 Applied Micro Circuits Corporation.
+ *
+ * Author: Tanmay Inamdar .
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PCIECORE_LTSSM          0x4c
+#define PCIECORE_CTLANDSTATUS   0x50
+#define PIPE_PHY_RATE_RD(src)   ((0xc000 & (u32)(src)) >> 0xe)
+#define INTXSTATUSMASK          0x6c
+#define PIM1_1L                 0x80
+#define IBAR2                   0x98
+#define IR2MSK                  0x9c
+#define PIM2_1L                 0xa0
+#define IBAR3L                  0xb4
+#define IR3MSKL                 0xbc
+#define PIM3_1L                 0xc4
+#define OMR1BARL                0x100
+#define OMR2BARL                0x118
+#define CFGBARL                 0x154
+#define CFGBARH                 0x158
+#define CFGCTL                  0x15c
+#define RTDID                   0x160
+#define BRIDGE_CFG_0            0x2000
+#define BRIDGE_CFG_1            0x2004
+#define BRIDGE_CFG_4            0x2010
+#define BRIDGE_CFG_32           0x2030
+#define BRIDGE_CFG_14           0x2038
+#define BRIDGE_CTRL_1           0x2204
+#define BRIDGE_CTRL_2           0x2208
+#define BRIDGE_CTRL_5           0x2214
+#define BRIDGE_STATUS_0         0x2600
+#define MEM_RAM_SHUTDOWN        0xd070
+#define BLOCK_MEM_RDY           0xd074
+
+#define DEVICE_PORT_TYPE_MASK   0x03c0
+#define PM_FORCE_RP_MODE_MASK   0x0400
+#define SWITCH_PORT_MODE_MASK   0x0800
+#define CLASS_CODE_MASK         0xff00
+#define LINK_UP_MASK            0x0100
+#define AER_OPTIONAL_ERROR_EN   0xffc0
+#define XGENE_PCIE_DEV_CTRL     0x2f0f
+#define AXI_EP_CFG_ACCESS       0x1
+#define ENABLE_ASPM             0x0800
+#define XGENE_PORT_TYPE_RC      0x0500
+#define BLOCK_MEM_RDY_VAL       0x
+#define EN_COHERENCY            0xF000
+#define EN_REG                  0x0001
+#define OB_LO_IO                0x0002
+#define XGENE_PCIE_VENDORID     0xE008
+#define XGENE_PCIE_DEVICEID     0xE004
+#define XGENE_PCIE_TIMEOUT      (500*1000) /* us */
+#define XGENE_LTSSM_DETECT_WAIT 20
+#define XGENE_LTSSM_L0_WAIT     4
+#define SZ_1T                   (SZ_1G*1024ULL)
+
+struct xgene_res_cfg {
+   struct resource res;
+   

Re: [PATCH 3/3] dynamic_debug: replace obsolete simple_strtoul() with kstrtouint()

2014-01-24 Thread Greg KH
On Fri, Jan 24, 2014 at 03:03:38PM -0500, Jason Baron wrote:
> Hi,
> 
> I think we want some sort of commit message for this patch. But they
> all look good to me and they tested fine.
> 
> Acked-by: Jason Baron 
> 
> Greg, Can you pick up this series?

Will do, after 3.14-rc1 is out, thanks.

greg k-h


[RFC PATCH V3 2/4] arm64: dts: APM X-Gene PCIe device tree nodes

2014-01-24 Thread Tanmay Inamdar
This patch adds the device tree nodes for the APM X-Gene PCIe controller and
PCIe clock interface. Since the X-Gene SOC supports a maximum of 5 ports, 5 dts
nodes are added.

Signed-off-by: Tanmay Inamdar 
---
 arch/arm64/boot/dts/apm-mustang.dts |8 ++
 arch/arm64/boot/dts/apm-storm.dtsi  |  155 +++
 2 files changed, 163 insertions(+)

diff --git a/arch/arm64/boot/dts/apm-mustang.dts 
b/arch/arm64/boot/dts/apm-mustang.dts
index 1247ca1..507b6c9 100644
--- a/arch/arm64/boot/dts/apm-mustang.dts
+++ b/arch/arm64/boot/dts/apm-mustang.dts
@@ -24,3 +24,11 @@
reg = < 0x1 0x 0x0 0x8000 >; /* Updated by 
bootloader */
};
 };
+
+ {
+   status = "ok";
+};
+
+ {
+   status = "ok";
+};
diff --git a/arch/arm64/boot/dts/apm-storm.dtsi 
b/arch/arm64/boot/dts/apm-storm.dtsi
index d37d736..e579a6f 100644
--- a/arch/arm64/boot/dts/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm-storm.dtsi
@@ -176,6 +176,161 @@
reg-names = "csr-reg";
clock-output-names = "eth8clk";
};
+
+   pcie0clk: pcie0clk@1f2bc000 {
+   status = "disabled";
+   compatible = "apm,xgene-device-clock";
+   #clock-cells = <1>;
+   clocks = < 0>;
+   reg = <0x0 0x1f2bc000 0x0 0x1000>;
+   reg-names = "csr-reg";
+   clock-output-names = "pcie0clk";
+   };
+
+   pcie1clk: pcie1clk@1f2cc000 {
+   status = "disabled";
+   compatible = "apm,xgene-device-clock";
+   #clock-cells = <1>;
+   clocks = < 0>;
+   reg = <0x0 0x1f2cc000 0x0 0x1000>;
+   reg-names = "csr-reg";
+   clock-output-names = "pcie1clk";
+   };
+
+   pcie2clk: pcie2clk@1f2dc000 {
+   status = "disabled";
+   compatible = "apm,xgene-device-clock";
+   #clock-cells = <1>;
+   clocks = < 0>;
+   reg = <0x0 0x1f2dc000 0x0 0x1000>;
+   reg-names = "csr-reg";
+   clock-output-names = "pcie2clk";
+   };
+
+   pcie3clk: pcie3clk@1f50c000 {
+   status = "disabled";
+   compatible = "apm,xgene-device-clock";
+   #clock-cells = <1>;
+   clocks = < 0>;
+   reg = <0x0 0x1f50c000 0x0 0x1000>;
+   reg-names = "csr-reg";
+   clock-output-names = "pcie3clk";
+   };
+
+   pcie4clk: pcie4clk@1f51c000 {
+   status = "disabled";
+   compatible = "apm,xgene-device-clock";
+   #clock-cells = <1>;
+   clocks = < 0>;
+   reg = <0x0 0x1f51c000 0x0 0x1000>;
+   reg-names = "csr-reg";
+   clock-output-names = "pcie4clk";
+   };
+   };
+
+   pcie0: pcie@1f2b {
+   status = "disabled";
+   device_type = "pci";
+   compatible = "apm,xgene-pcie";
+   #interrupt-cells = <1>;
+   #size-cells = <2>;
+   #address-cells = <3>;
+   reg = < 0x00 0x1f2b 0x0 0x0001   /* Controller 
registers */
+   0xe0 0xd000 0x0 0x0020>; /* PCI config 
space */
+   reg-names = "csr", "cfg";
+   ranges = <0x0100 0x00 0x 0xe0 0x 
0x00 0x0001   /* io */
+ 0x0200 0x00 0x1000 0xe0 0x1000 
0x00 0x8000>; /* mem */
+   dma-ranges = <0x4200 0x40 0x 0x40 
0x 0x40 0x>;
+   interrupt-map-mask = <0x0 0x0 0x0 0x7>;
+   interrupt-map = <0x0 0x0 0x0 0x1  0x0 0xc2 0x1
+0x0 0x0 0x0 0x2  0x0 0xc3 0x1
+0x0 0x0 0x0 0x3  0x0 0xc4 0x1
+0x0 0x0 0x0 0x4  0x0 0xc5 0x1>;
+   clocks = < 0>;
+   };
+
+   pcie1: pcie@1f2c {
+   status = "disabled";
+   device_type = 

Re: [PATCH v5 6/8] ARM: brcmstb: add misc. DT bindings for brcmstb

2014-01-24 Thread Marc C
Hi Mark,

>> +reboot
>> +---
>> +Required properties
>> +
>> +- compatible
>> +The string property "brcm,brcmstb-reboot".
>> +
>> +- syscon
>> +A phandle / integer array that points to the syscon node which 
>> describes
>> +the general system reset registers.
>> +o a phandle to "sun_top_ctrl"
>> +o offset to the "reset source enable" register
>> +o offset to the "software master reset" register
>
> How variable are these values?

Very much so. Future chips will have different register maps. Because of this,
Arnd suggested that we use 'syscon' and 'regmap' to alleviate this maintenance
burden.

>> +example:
>> +smpboot {
>> +compatible = "brcm,brcmstb-smpboot";
>> +syscon-cpu = <_cpubiuctrl 0x88 0x178>;
>> +syscon-cont = <_continuation>;
>> +};
>
> This looks odd. This doesn't seem like a device, but rather a grouping
> of disparate devices used for a particular software purpose.
>
>> +
>> +example:
>> +reboot {
>> +compatible = "brcm,brcmstb-reboot";
>> +syscon = <_top_ctrl 0x304 0x308>;
>> +};
>
> As with smpboot, this seems odd.

Sure. Our H/W designers unfortunately didn't put the boot and restart registers
into a logical grouping, or standard register interface. Instead, they're all
over the place. How do you suggest naming the nodes to indicate this?

Thanks,
Marc C

On 01/24/2014 03:03 AM, Mark Rutland wrote:
> On Wed, Jan 22, 2014 at 03:30:50AM +, Marc Carino wrote:
>> Document the bindings that the Broadcom STB platform needs
>> for proper bootup.
>>
>> Signed-off-by: Marc Carino 
>> Acked-by: Florian Fainelli 
>> ---
>>  .../devicetree/bindings/arm/brcm-brcmstb.txt   |   95 
>> 
>>  1 files changed, 95 insertions(+), 0 deletions(-)
>>  create mode 100644 Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
>>
>> diff --git a/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt 
>> b/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
>> new file mode 100644
>> index 000..3c436cc
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
>> @@ -0,0 +1,95 @@
>> +ARM Broadcom STB platforms Device Tree Bindings
>> +---
>> +Boards with Broadcom Brahma15 ARM-based BCM (generally BCM7xxx variants)
>> +SoC shall have the following DT organization:
>> +
>> +Required root node properties:
>> +- compatible: "brcm,bcm", "brcm,brcmstb"
>> +
>> +example:
>> +/ {
>> +#address-cells = <2>;
>> +#size-cells = <2>;
>> +model = "Broadcom STB (bcm7445)";
>> +compatible = "brcm,bcm7445", "brcm,brcmstb";
>> +
>> +Further, syscon nodes that map platform-specific registers used for general
>> +system control is required:
>> +
>> +- compatible: "brcm,bcm-sun-top-ctrl", "syscon"
>> +- compatible: "brcm,bcm-hif-cpubiuctrl", "syscon"
>> +- compatible: "brcm,bcm-hif-continuation", "syscon"
>> +
>> +example:
>> +rdb {
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +compatible = "simple-bus";
>> +ranges = <0 0x00 0xf000 0x100>;
>> +
>> +sun_top_ctrl: syscon@404000 {
>> +compatible = "brcm,bcm7445-sun-top-ctrl", "syscon";
>> +reg = <0x404000 0x51c>;
>> +};
>> +
>> +hif_cpubiuctrl: syscon@3e2400 {
>> +compatible = "brcm,bcm7445-hif-cpubiuctrl", "syscon";
>> +reg = <0x3e2400 0x5b4>;
>> +};
>> +
>> +hif_continuation: syscon@452000 {
>> +compatible = "brcm,bcm7445-hif-continuation", "syscon";
>> +reg = <0x452000 0x100>;
>> +};
>> +};
>> +
>> +Lastly, nodes that allow for support of SMP initialization and reboot are
>> +required:
>> +
>> +smpboot
>> +---
>> +Required properties:
>> +
>> +- compatible
>> +The string "brcm,brcmstb-smpboot".
>> +
>> +- syscon-cpu
>> +A phandle / integer array property which lets the BSP know the 
>> location
>> +of certain CPU power-on registers.
>> +
>> +The layout of the property is as follows:
>> +o a phandle to the "hif_cpubiuctrl" syscon node
>> +o offset to the base CPU power zone register
>> +o offset to the base CPU reset register
> 
> How variable are these values?
> 
>> +
>> +- syscon-cont
>> +A phandle pointing to the syscon node which describes the CPU boot
>> +continuation registers.
>> +o a phandle to the "hif_continuation" syscon node
>> +
>> +example:
>> +smpboot {
>> +compatible = "brcm,brcmstb-smpboot";
>> +syscon-cpu = <_cpubiuctrl 0x88 0x178>;
>> +syscon-cont = <_continuation>;
>> +};
> 
> This looks odd. This doesn't seem like a device, but rather a grouping
> of disparate devices used for a particular software purpose.
> 
>> +
>> +reboot
>> +---
>> +Required properties
>> +
>> +- compatible
>> 

[RFC PATCH V3 0/4] APM X-Gene PCIe controller

2014-01-24 Thread Tanmay Inamdar
This patch adds support for AppliedMicro X-Gene PCIe host controller. The
driver is tested on X-Gene platform with different gen1/2/3 PCIe endpoint
cards.

The X-Gene PCIe controller driver has a dependency on the PCIe arch support for
arm64. The arm64 PCIe arch support is not yet part of the mainline Linux kernel,
and the approach for arch support is under discussion with the arm64 maintainers.
The reference patch can be found here --> https://lkml.org/lkml/2013/10/23/244

If someone wishes to test PCIe on X-Gene, arch support patch must be applied
before the patches in this patch set.

changes since V2:
1. redefined each PCI port in different PCI domain correctly.
2. removed setup_lane and setup_link functions from driver.
3. removed scan_bus wrapper and set_primary_bus hack.
4. added pci_ioremap_io for io resources.

changes since V1:
1. added PCI domain support
2. reading cpu and pci addresses from device tree to configure regions.
3. got rid of unnecessary wrappers for readl and writel.
4. got rid of endpoint configuration code.
5. added 'dma-ranges' property support to read inbound region configuration.
6. renamed host driver file to 'pci-xgene.c' from 'pcie-xgene.c'
7. dropped 'clock-names' property from bindings
8. added comments wherever requested.

Tanmay Inamdar (4):
  pci: APM X-Gene PCIe controller driver
  arm64: dts: APM X-Gene PCIe device tree nodes
  dt-bindings: pci: xgene pcie device tree bindings
  MAINTAINERS: entry for APM X-Gene PCIe host driver

 .../devicetree/bindings/pci/xgene-pci.txt  |   52 ++
 MAINTAINERS|7 +
 arch/arm64/boot/dts/apm-mustang.dts|8 +
 arch/arm64/boot/dts/apm-storm.dtsi |  155 
 drivers/pci/host/Kconfig   |   10 +
 drivers/pci/host/Makefile  |1 +
 drivers/pci/host/pci-xgene.c   |  784 
 7 files changed, 1017 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/xgene-pci.txt
 create mode 100644 drivers/pci/host/pci-xgene.c

-- 
1.7.9.5



Re: [RFC PATCH V2 1/4] pci: APM X-Gene PCIe controller driver

2014-01-24 Thread Tanmay Inamdar
On Thu, Jan 16, 2014 at 5:10 PM, Tanmay Inamdar  wrote:
> On Wed, Jan 15, 2014 at 4:39 AM, Arnd Bergmann  wrote:
>> On Wednesday 15 January 2014, Tanmay Inamdar wrote:
>>> This patch adds the AppliedMicro X-Gene SOC PCIe controller driver.
>>> X-Gene PCIe controller supports maximum upto 8 lanes and GEN3 speed.
>>> X-Gene has maximum 5 PCIe ports supported.
>>>
>>> Signed-off-by: Tanmay Inamdar 
>>
>> This already looks much better than the first version, but I have a few more
>> comments. Most importantly, it would help to know how the root ports
>> are structured. Is this a standard root complex and multiple ports,
>> multiple root complexes with one port each, or a nonstandard organization
>> that is a mix of those two models?
>
> This is multiple root complexes with one port each.
>
>>
>>> +
>>> +/* When the address bit [17:16] is 2'b01, the Configuration access will be
>>> + * treated as Type 1 and it will be forwarded to external PCIe device.
>>> + */
>>> +static void __iomem *xgene_pcie_get_cfg_base(struct pci_bus *bus)
>>> +{
>>> + struct xgene_pcie_port *port = xgene_pcie_bus_to_port(bus);
>>> + u64 addr = (u64)port->cfg_base;
>>> +
>>> + if (bus->number >= (port->first_busno + 1))
>>> + addr |= AXI_EP_CFG_ACCESS;
>>> +
>>> + return (void *)addr;
>>> +}
>>
>> Wrong type, it should be 'void __iomem *'. Also you can't assume that
>> bit operations work on virtual __iomem addresses, so it should be better
>> to just add a constant integer to the pointer, which is a valid
>> operation.
>
> ok.
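
For illustration only, a userspace sketch of the form suggested above:
return 'void __iomem *' and add a constant offset instead of OR-ing
bits into a casted pointer. __iomem is defined away so that this
compiles outside the kernel. The name EP_CFG_OFFSET, its value 0x10000
(bit 16, taken from the comment about address bits [17:16]) and the
function signature are assumptions, not the driver's code.

```c
#include <stddef.h>

/* The sparse annotation has no meaning in userspace; define it away. */
#define __iomem

/* Assumed offset that selects Type 1 (forwarded) config accesses. */
#define EP_CFG_OFFSET 0x10000

/*
 * Pick the config window for a bus: the root bus uses the base window,
 * any bus behind it uses the base plus a constant offset.
 */
static void __iomem *cfg_base_for_bus(void __iomem *cfg_base,
				      unsigned int bus_number,
				      unsigned int first_busno)
{
	if (bus_number >= first_busno + 1)
		return (char __iomem *)cfg_base + EP_CFG_OFFSET;

	return cfg_base;
}
```

Adding the offset gives the same address as setting the bit only as
long as the base has that bit clear, which is assumed here.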
>
>>
>> I also wonder why you need to do this at all. If there isn't a global
>> config space for all ports, but rather a separate type0/type1 config
>> cycle based on the bus number, I see that as an indication that the
>> ports are in fact separate domains and should each start with bus 0.
>
> It is not a standard ECAM layout. We also have a separate RTDID
> register to program bus, device, function. While accessing EP
> config space, we have to set bits 17:16 to 2'b01. The same config
> space address is utilized for enabling a customized nonstandard PCIe
> DMA feature. The bits are defined to differentiate the access purpose.
> The feature is not supported in this driver yet.
>
> Secondly I don't think it will matter if each port starts with bus 0.
> As long as we set the correct BDF in RTDID and set correct bits in
> config address, the config reads and writes would work. Right?
>
>>
>>> +static void xgene_pcie_setup_lanes(struct xgene_pcie_port *port)
>>> +{
>>> + void *csr_base = port->csr_base;
>>> + u32 val;
>>> +
>>> + val = readl(csr_base + BRIDGE_8G_CFG_8);
>>> + val = eq_pre_cursor_lane0_set(val, 0x7);
>>> + val = eq_pre_cursor_lane1_set(val, 0x7);
>>> + writel(val, csr_base + BRIDGE_8G_CFG_8);
>>> +
>>> + val = readl(csr_base + BRIDGE_8G_CFG_9);
>>> + val = eq_pre_cursor_lane0_set(val, 0x7);
>>> + val = eq_pre_cursor_lane1_set(val, 0x7);
>>> + writel(val, csr_base + BRIDGE_8G_CFG_9);
>>> +
>>> + val = readl(csr_base + BRIDGE_8G_CFG_10);
>>> + val = eq_pre_cursor_lane0_set(val, 0x7);
>>> + val = eq_pre_cursor_lane1_set(val, 0x7);
>>> + writel(val, csr_base + BRIDGE_8G_CFG_10);
>>> +
>>> + val = readl(csr_base + BRIDGE_8G_CFG_11);
>>> + val = eq_pre_cursor_lane0_set(val, 0x7);
>>> + val = eq_pre_cursor_lane1_set(val, 0x7);
>>> + writel(val, csr_base + BRIDGE_8G_CFG_11);
>>> +
>>> + val = readl(csr_base + BRIDGE_8G_CFG_4);
>>> + val = (val & ~0x30) | (1 << 4);
>>> + writel(val, csr_base + BRIDGE_8G_CFG_4);
>>> +}
>>
>> Please document what you are actually setting here. If the configuration
>> of the lanes is always the same, why do you have to set it here. If not,
>> why do you set constant values?
>
> Good point. Let me check if these values should be constant or tune-able.
>
>>
>>> +static void xgene_pcie_setup_link(struct xgene_pcie_port *port)
>>> +{
>>> + void *csr_base = port->csr_base;
>>> + u32 val;
>>> +
>>> + val = readl(csr_base + BRIDGE_CFG_14);
>>> + val |= DIRECT_TO_8GTS_MASK;
>>> + val |= SUPPORT_5GTS_MASK;
>>> + val |= SUPPORT_8GTS_MASK;
>>> + val |= DIRECT_TO_5GTS_MASK;
>>> + writel(val, csr_base + BRIDGE_CFG_14);
>>> +
>>> + val = readl(csr_base + BRIDGE_CFG_14);
>>> + val &= ~ADVT_INFINITE_CREDITS;
>>> + writel(val, csr_base + BRIDGE_CFG_14);
>>> +
>>> + val = readl(csr_base + BRIDGE_8G_CFG_0);
>>> + val |= (val & ~0xf) | 7;
>>> + val |= (val & ~0xf00) | ((7 << 8) & 0xf00);
>>> + writel(val, csr_base + BRIDGE_8G_CFG_0);
>>> +
>>> + val = readl(csr_base + BRIDGE_8G_CFG_0);
>>> + val |= DWNSTRM_EQ_SKP_PHS_2_3;
>>> + writel(val, csr_base + BRIDGE_8G_CFG_0);
>>> +}
>>
>> Same here.
>>
>>> +static void xgene_pcie_program_core(void *csr_base)
>>> +{
>>> + u32 val;
>>> +
>>> + val = readl(csr_base + BRIDGE_CFG_0);
>>> + val |= AER_OPTIONAL_ERROR_EN;
>>> + writel(val, csr_base + 

Re: [PATCH v5 1/8] ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs

2014-01-24 Thread Marc C
Hi Mark,

>> +static void __init brcmstb_init_early(void)
>> +{
>> +   add_preferred_console("ttyS", 0, "115200");
>> +}
>
> Is this really required?

I think I can drop this. It was a holdover from our older kernels.

>> +   /*
>> +   * set the reset vector to point to the secondary_startup
>> +   * routine
>> +   */
>> +   cpu_set_boot_addr(cpu, virt_to_phys(brcmstb_secondary_startup));
>> +
>> +   flush_cache_all();
>
> Why? What does the new CPU need before its caches are coherent and up?

Absolutely nothing! I should be able to drop this as well.

Regarding the CPU power-down sequence, I'll review it and make sure it follows
the "Processor power domain" sequence in the A15 TRM. For any deviations, I'll
double-check with our H/W designers to ensure there aren't any magic
requirements unaccounted for.

Thank you for taking a deep-dive into the code! I'll make the appropriate
modifications per your suggestions.

Regards,
Marc C


On 01/24/2014 02:14 AM, Mark Rutland wrote:
> On Wed, Jan 22, 2014 at 03:30:45AM +, Marc Carino wrote:
>> The BCM7xxx series of Broadcom SoCs are used primarily in set-top boxes.
>>
>> This patch adds machine support for the ARM-based Broadcom SoCs.
>>
>> Signed-off-by: Marc Carino 
>> Acked-by: Florian Fainelli 
>> ---
>>  arch/arm/configs/multi_v7_defconfig |1 +
>>  arch/arm/mach-bcm/Kconfig   |   14 ++
>>  arch/arm/mach-bcm/Makefile  |4 +
>>  arch/arm/mach-bcm/brcmstb.c |  110 
>>  arch/arm/mach-bcm/brcmstb.h |   38 
>>  arch/arm/mach-bcm/headsmp-brcmstb.S |   34 
>>  arch/arm/mach-bcm/hotplug-brcmstb.c |  334 
>> +++
>>  7 files changed, 535 insertions(+), 0 deletions(-)
>>  create mode 100644 arch/arm/mach-bcm/brcmstb.c
>>  create mode 100644 arch/arm/mach-bcm/brcmstb.h
>>  create mode 100644 arch/arm/mach-bcm/headsmp-brcmstb.S
>>  create mode 100644 arch/arm/mach-bcm/hotplug-brcmstb.c
>>
>> diff --git a/arch/arm/configs/multi_v7_defconfig 
>> b/arch/arm/configs/multi_v7_defconfig
>> index c1df4e9..7028d11 100644
>> --- a/arch/arm/configs/multi_v7_defconfig
>> +++ b/arch/arm/configs/multi_v7_defconfig
>> @@ -7,6 +7,7 @@ CONFIG_MACH_ARMADA_370=y
>>  CONFIG_MACH_ARMADA_XP=y
>>  CONFIG_ARCH_BCM=y
>>  CONFIG_ARCH_BCM_MOBILE=y
>> +CONFIG_ARCH_BRCMSTB=y
>>  CONFIG_GPIO_PCA953X=y
>>  CONFIG_ARCH_HIGHBANK=y
>>  CONFIG_ARCH_KEYSTONE=y
>> diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
>> index 9fe6d88..2c1ae83 100644
>> --- a/arch/arm/mach-bcm/Kconfig
>> +++ b/arch/arm/mach-bcm/Kconfig
>> @@ -31,6 +31,20 @@ config ARCH_BCM_MOBILE
>>   BCM11130, BCM11140, BCM11351, BCM28145 and
>>   BCM28155 variants.
>>
>> +config ARCH_BRCMSTB
>> +   bool "Broadcom BCM7XXX based boards" if ARCH_MULTI_V7
>> +   depends on MMU
>> +   select ARM_GIC
>> +   select MIGHT_HAVE_PCI
>> +   select HAVE_SMP
>> +   select HAVE_ARM_ARCH_TIMER
>> +   help
>> + Say Y if you intend to run the kernel on a Broadcom ARM-based STB
>> + chipset.
>> +
>> + This enables support for Broadcom ARM-based set-top box chipsets,
>> + including the 7445 family of chips.
>> +
>>  endmenu
>>
>>  endif
>> diff --git a/arch/arm/mach-bcm/Makefile b/arch/arm/mach-bcm/Makefile
>> index c2ccd5a..b744a12 100644
>> --- a/arch/arm/mach-bcm/Makefile
>> +++ b/arch/arm/mach-bcm/Makefile
>> @@ -13,3 +13,7 @@
>>  obj-$(CONFIG_ARCH_BCM_MOBILE)  := board_bcm281xx.o bcm_kona_smc.o 
>> bcm_kona_smc_asm.o kona.o
>>  plus_sec := $(call as-instr,.arch_extension sec,+sec)
>>  AFLAGS_bcm_kona_smc_asm.o  :=-Wa,-march=armv7-a$(plus_sec)
>> +
>> +obj-$(CONFIG_ARCH_BRCMSTB) := brcmstb.o
>> +obj-$(CONFIG_SMP)  += headsmp-brcmstb.o
>> +obj-$(CONFIG_HOTPLUG_CPU)  += hotplug-brcmstb.o
>> diff --git a/arch/arm/mach-bcm/brcmstb.c b/arch/arm/mach-bcm/brcmstb.c
>> new file mode 100644
>> index 000..7a6093d
>> --- /dev/null
>> +++ b/arch/arm/mach-bcm/brcmstb.c
>> @@ -0,0 +1,110 @@
>> +/*
>> + * Copyright (C) 2013 Broadcom Corporation
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License as
>> + * published by the Free Software Foundation version 2.
>> + *
>> + * This program is distributed "as is" WITHOUT ANY WARRANTY of any
>> + * kind, whether express or implied; without even the implied warranty
>> + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "brcmstb.h"
>> +
>> +/***
>> + * 

Re: [PATCH] x86, hyperv: bypass the timer_irq_works() check

2014-01-24 Thread H. Peter Anvin
On 01/23/2014 10:02 PM, Jason Wang wrote:
> This patch bypasses the timer_irq_works() check for Hyper-V guests since:
> 
> - The timer interrupt is guaranteed to work.
> - timer_irq_works() may sometimes fail because the lpj calibration is
>   inaccurate in a Hyper-V guest or on a buggy host.
> 
> In the future, we should get the TSC frequency from the hypervisor and use
> a preset lpj instead.
> 
> Cc: K. Y. Srinivasan 
> Cc: Haiyang Zhang 
> Signed-off-by: Jason Wang 

This should be in -stable, right?

-hpa




[PATCH 2/2] x86, microcode: Add option to allow downgrading of microcode

2014-01-24 Thread Andi Kleen
From: Andi Kleen 

For testing purposes it can be useful to downgrade microcode.
Normally the driver only allows upgrading.

Add a module_param (default off) that allows downgrading.

Note the module_param can currently not be set for early
ucode update, only for late.

Signed-off-by: Andi Kleen 
---
 arch/x86/kernel/microcode_intel_lib.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/microcode_intel_lib.c 
b/arch/x86/kernel/microcode_intel_lib.c
index ce69320..18d5325 100644
--- a/arch/x86/kernel/microcode_intel_lib.c
+++ b/arch/x86/kernel/microcode_intel_lib.c
@@ -26,11 +26,16 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 
+static bool allow_downgrade;
+module_param(allow_downgrade, bool, 0644);
+MODULE_PARM_DESC(allow_downgrade, "Allow downgrading microcode");
+
 static inline int
 update_match_cpu(unsigned int csig, unsigned int cpf,
 unsigned int sig, unsigned int pf)
@@ -41,6 +46,8 @@ update_match_cpu(unsigned int csig, unsigned int cpf,
 int
 update_match_revision(struct microcode_header_intel *mc_header, int rev)
 {
+       if (allow_downgrade)
+               return 1;
        return (mc_header->rev <= rev) ? 0 : 1;
 }
 
-- 
1.8.3.1



[PATCH 1/2] x86, microcode: Do Intel microcode revision check signed v2

2014-01-24 Thread Andi Kleen
From: Andi Kleen 

The Intel SDM, Vol. 3, section 9.11.1 (Microcode update) states that
the update revision field is signed. However, we do the comparison
unsigned, because the signed operand gets converted to unsigned. Change
the field to be signed, so that the comparison really is signed.

v2: Change field.
Signed-off-by: Andi Kleen 
---
 arch/x86/include/asm/microcode_intel.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/microcode_intel.h 
b/arch/x86/include/asm/microcode_intel.h
index 9067166..ed1884b 100644
--- a/arch/x86/include/asm/microcode_intel.h
+++ b/arch/x86/include/asm/microcode_intel.h
@@ -5,7 +5,7 @@
 
 struct microcode_header_intel {
        unsigned int            hdrver;
-       unsigned int            rev;
+       int                     rev;
        unsigned int            date;
        unsigned int            sig;
        unsigned int            cksum;
-- 
1.8.3.1
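
The effect of the field type on the revision comparison can be shown with a
small user-space sketch. The struct and function names below are hypothetical,
cut-down stand-ins and not the kernel's definitions; the sketch also assumes a
32-bit int.

```c
#include <stdbool.h>

/* Hypothetical, cut-down stand-ins for struct microcode_header_intel. */
struct hdr_before { unsigned int rev; };   /* field type before the patch */
struct hdr_after  { int rev; };            /* field type after the patch */

/*
 * Before: comparing unsigned int with int converts the int operand to
 * unsigned int. The cast only spells out what the compiler does anyway.
 */
bool newer_before(const struct hdr_before *h, int cur_rev)
{
	return h->rev > (unsigned int)cur_rev;
}

/* After: both operands are int, so the comparison is signed. */
bool newer_after(const struct hdr_after *h, int cur_rev)
{
	return h->rev > cur_rev;
}
```

With a header revision of 0xffffffff (-1 when read as signed) and a current
revision of 1, newer_before() treats the update as newer while newer_after()
does not. For small positive revisions the two agree.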



RE: [PATCH] x86, hyperv: bypass the timer_irq_works() check

2014-01-24 Thread KY Srinivasan


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Thursday, January 23, 2014 10:03 PM
> To: KY Srinivasan; Haiyang Zhang; t...@linutronix.de; mi...@redhat.com;
> h...@zytor.com; x...@kernel.org; de...@linuxdriverproject.org; linux-
> ker...@vger.kernel.org
> Cc: Jason Wang
> Subject: [PATCH] x86, hyperv: bypass the timer_irq_works() check
> 
> This patch bypasses the timer_irq_works() check for Hyper-V guests since:
> 
> - The timer interrupt is guaranteed to work.
> - timer_irq_works() may sometimes fail because the lpj calibration is
>   inaccurate in a Hyper-V guest or on a buggy host.
> 
> In the future, we should get the TSC frequency from the hypervisor and use
> a preset lpj instead.
> 
> Cc: K. Y. Srinivasan 
> Cc: Haiyang Zhang 
> Signed-off-by: Jason Wang 

Thanks Jason.
Acked-by: K. Y. Srinivasan 
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 9f7ca26..832d05a 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  struct ms_hyperv_info ms_hyperv;
>  EXPORT_SYMBOL_GPL(ms_hyperv);
> @@ -105,6 +106,11 @@ static void __init ms_hyperv_init_platform(void)
> 
>   if (ms_hyperv.features & HV_X64_MSR_TIME_REF_COUNT_AVAILABLE)
>   clocksource_register_hz(_cs, NSEC_PER_SEC/100);
> +
> +#ifdef CONFIG_X86_IO_APIC
> + no_timer_check = 1;
> +#endif
> +
>  }
> 
>  const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
> --
> 1.8.3.2



[PATCH] slab: fix wrong retval on kmem_cache_create_memcg error path

2014-01-24 Thread Vladimir Davydov
From: Dave Jones 

On kmem_cache_create_memcg() error path we set 'err', but leave 's' (the
new cache ptr) undefined. The latter can be NULL if we could not
allocate the cache, or pointing to a freed area if we failed somewhere
later while trying to initialize it. Initially we checked 'err'
immediately before exiting the function and returned NULL if it was set
ignoring the value of 's':

out_unlock:
        ...
        if (err) {
                ...
                return NULL;
        }
        return s;

Recently this check was, in fact, broken by commit f717eb3abb5e ("slab:
do not panic if we fail to create memcg cache"), which turned it to:

out_unlock:
        ...
        if (err && !memcg) {
                ...
                return NULL;
        }
        return s;

As a result, if we fail to create a cache for a memcg, we skip the check
and return 's', which may hold garbage. Fix it by ensuring that two
conditions always hold together on the error path, err != 0 and
s == NULL, by explicitly zeroing 's' after freeing it there.

Signed-off-by: Dave Jones 
Signed-off-by: Vladimir Davydov 
Cc: Pekka Enberg 
Cc: Christoph Lameter 
---
 mm/slab_common.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8e40321..499b53c 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -249,7 +249,6 @@ out_unlock:
                                name, err);
                        dump_stack();
                }
-               return NULL;
        }
        return s;
 
@@ -257,6 +256,7 @@ out_free_cache:
        memcg_free_cache_params(s);
        kfree(s->name);
        kmem_cache_free(kmem_cache, s);
+       s = NULL;
        goto out_unlock;
 }
 
-- 
1.7.10.4
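
The invariant described in the commit message (err != 0 always goes together
with s == NULL) can be sketched in user space as follows. This is only an
analogue under stated assumptions: cache_create(), its fail_init and last_err
parameters, and struct cache are hypothetical and stand in for the kernel
function and its allocators.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct cache { char *name; };

/*
 * Hypothetical user-space analogue of the error path. 'fail_init'
 * simulates a failure after the cache has been allocated; 'last_err'
 * receives the error code.
 */
struct cache *cache_create(const char *name, int fail_init, int *last_err)
{
	struct cache *s;
	int err = 0;

	s = calloc(1, sizeof(*s));
	if (!s) {
		err = -ENOMEM;
		goto out;
	}

	s->name = malloc(strlen(name) + 1);
	if (!s->name) {
		err = -ENOMEM;
		goto out_free_cache;
	}
	strcpy(s->name, name);

	if (fail_init) {
		err = -EINVAL;
		goto out_free_cache;
	}

out:
	*last_err = err;
	return s;	/* err != 0 implies s == NULL */

out_free_cache:
	free(s->name);
	free(s);
	s = NULL;	/* without this, 'out' would return a dangling pointer */
	goto out;
}
```

Because the pointer is cleared where it is freed, the common exit no longer
needs a separate "return NULL" branch, and a caller that only looks at the
return value cannot receive freed memory.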



Re: slab: clean up kmem_cache_create_memcg() error handling

2014-01-24 Thread Vladimir Davydov
On 01/24/2014 10:20 PM, Dave Jones wrote:
> On Fri, Jan 24, 2014 at 03:33:41AM +, Linux Kernel wrote:
>  > Gitweb: 
> http://git.kernel.org/linus/;a=commit;h=3965fc3652244651006ebb31c8c45318ce84818f
>  > Commit: 3965fc3652244651006ebb31c8c45318ce84818f
>  > Parent: 309381feaee564281c3d9e90fbca8963bb7428ad
>  > Author: Vladimir Davydov 
>  > AuthorDate: Thu Jan 23 15:52:55 2014 -0800
>  > Committer:  Linus Torvalds 
>  > CommitDate: Thu Jan 23 16:36:50 2014 -0800
>  > 
>  > slab: clean up kmem_cache_create_memcg() error handling
>  > 
>  > Currently kmem_cache_create_memcg() backoffs on failure inside
>  > conditionals, without using gotos.  This results in the rollback code
>  > duplication, which makes the function look cumbersome even though on
>  > error we should only free the allocated cache.  Since in the next patch
>  > I am going to add yet another rollback function call on error path
>  > there, let's employ labels instead of conditionals for undoing any
>  > changes on failure to keep things clean.
>
> ...
>
>  > +out_unlock:
>  >mutex_unlock(_mutex);
>  >put_online_cpus();
>  >  
>  >if (err) {
>  > -
>  >if (flags & SLAB_PANIC)
>  >panic("kmem_cache_create: Failed to create slab '%s'. 
> Error %d\n",
>  >name, err);
>  > @@ -236,11 +230,14 @@ out_locked:
>  >name, err);
>  >dump_stack();
>  >}
>  > -
>  >return NULL;
>  >}
>  > -
>  >return s;
>  > +
>  > +out_free_cache:
>  > +  kfree(s->name);
>  > +  kmem_cache_free(kmem_cache, s);
>  > +  goto out_unlock;
>  >  }
>
> This is now returning a freed pointer as 's' if an error occurs.

If we go to out_free_cache, we set err, and since under out_unlock we have:

> if (err) {
...
> return NULL
>}

we will return NULL, which is right.

However, this behavior was broken by another 'fix' of mine :-(

commit f717eb3abb5ea38f60e671dbfdbf512c2c93d22e
slab: do not panic if we fail to create memcg cache
> -if (err) {
> +/*
> + * There is no point in flooding logs with warnings or especially
> + * crashing the system if we fail to create a cache for a memcg. In
> + * this case we will be accounting the memcg allocation to the root
> + * cgroup until we succeed to create its own cache, but it isn't that
> + * critical.
> + */
> +if (err && !memcg) {
>  if (flags & SLAB_PANIC)
>  panic("kmem_cache_create: Failed to create slab '%s'.
> Error %d\n",
>  name, err);

In case memcg != NULL we can return crap on error. So you are right in
the end. Thank you for catching this!

> Perhaps the patch below ?
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 8e40321da091..2c62294cee23 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -257,6 +257,7 @@ out_free_cache:
>   memcg_free_cache_params(s);
>   kfree(s->name);
>   kmem_cache_free(kmem_cache, s);
> + s = NULL;
>   goto out_unlock;
>  }

This one looks correct to me. I'll send it on your behalf.

Thanks!


Re: [PATCH] ALSA: hda - Fix silent output on MacBook Air 1,1

2014-01-24 Thread Paul Bolle
On Fri, 2014-01-24 at 15:46 -0500, Adrien Vergé wrote:
> I think very few people use Linux on their MBA 1,1. Moreover, some of
> them have stayed on v2.6.

I see. Well, if your analysis is correct I think you're supposed to add
 Fixes: 1a97b7f22774 ("ALSA: hda/realtek - Remove the last static quirks for ALC882")

to your commit explanation. Not sure, since this tag is not documented,
as far as I can see.


Paul Bolle



Re: pull request: wireless-next 2014-01-24

2014-01-24 Thread David Miller
From: "John W. Linville" 
Date: Fri, 24 Jan 2014 14:39:33 -0500

> Please pull these fixes for the 3.14 stream!

Pulled, thanks John.


Re: sched_rr_get_interval NULL pointer OOPS

2014-01-24 Thread Dave Jones
On Fri, Jan 24, 2014 at 10:55:56PM +0200, Tommi Rantala wrote:
 > Hello,
 > 
 > Trinity triggered the following bug in two separate qemu virtual
 > machines after fuzzing v3.13-3995-g0dc3fd0 for a day or two. I have
 > not been running Trinity in a while, so no idea if this is a
 > regression or not.
 
Probably been there a while. I noticed on Tuesday that I hadn't annotated
the 'policy' argument to sched_setscheduler().  Now that it's passing sensible
arguments, I'm not surprised there's some fallout if subsequent syscalls
call misc scheduler functions.

Dave



Re: [PATCH 09/20] ARM64 / ACPI: Implement core functions for parsing MADT table

2014-01-24 Thread Arnd Bergmann
On Friday 24 January 2014, Hanjun Guo wrote:
> >> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> >> index e108d9c..c335c6d 100644
> >> --- a/arch/arm64/include/asm/acpi.h
> >> +++ b/arch/arm64/include/asm/acpi.h
> >> @@ -83,6 +83,9 @@ void arch_fix_phys_package_id(int num, u32 slot);
> >>   extern int (*acpi_suspend_lowlevel)(void);
> >>   #define acpi_wakeup_address (0)
> >>   
> >> +#define MAX_GIC_CPU_INTERFACE 256
> > I'll bite. Where on Earth is this value coming from?
> 
> I just thought 256 is big enough for now :(
> Yes, should be a larger number for GICv3.

Could this just be set to NR_CPUS? That way it will be large enough for
any system you can actually run on.

Arnd


sched_rr_get_interval NULL pointer OOPS

2014-01-24 Thread Tommi Rantala
Hello,

Trinity triggered the following bug in two separate qemu virtual
machines after fuzzing v3.13-3995-g0dc3fd0 for a day or two. I have
not been running Trinity in a while, so no idea if this is a
regression or not.

If I'm reading this right, it's oopsing in kernel/sched/core.c:

SYSCALL_DEFINE2(sched_rr_get_interval, pid_t, pid,
struct timespec __user *, interval)
{
...
rq = task_rq_lock(p, );
time_slice = p->sched_class->get_rr_interval(rq, p);   <==
task_rq_unlock(rq, p, );
...
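
An instruction pointer of (null), as in the traces, is what a call through a
NULL function pointer would produce, so one possibility is that some scheduling
class leaves get_rr_interval unset. That is an inference, not something
confirmed here. A minimal user-space sketch of guarding such an optional
callback, with hypothetical simplified types (no runqueue, no locking):

```c
#include <stddef.h>

struct task;

/* Hypothetical, simplified stand-in for the kernel's struct sched_class. */
struct sched_class {
	unsigned int (*get_rr_interval)(struct task *p);
};

struct task {
	const struct sched_class *sched_class;
};

static unsigned int demo_rr_interval(struct task *p)
{
	(void)p;
	return 100;	/* arbitrary demo value */
}

/* One class implements the callback, the other leaves it NULL. */
const struct sched_class demo_rr_class = {
	.get_rr_interval = demo_rr_interval,
};
const struct sched_class demo_other_class = {
	.get_rr_interval = NULL,
};

/* Call the callback only if the class provides one; otherwise report 0. */
unsigned int task_rr_interval(struct task *p)
{
	unsigned int time_slice = 0;

	if (p->sched_class->get_rr_interval)
		time_slice = p->sched_class->get_rr_interval(p);
	return time_slice;
}
```

Whether a guard at the call site or a default callback in every class is the
better fix is a design choice for the scheduler maintainers.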

The first trace:

[21451.975552] trinity-c9: vm86 mode not supported on 64 bit kernel
[21452.242792] trinity-c23: vm86 mode not supported on 64 bit kernel
[21452.309518] trinity-c30: vm86 mode not supported on 64 bit kernel
[21456.862415] type=1401 audit(1390484421.888:396): SELinux:
unrecognized netlink message type=0 for sclass=34
[21456.862415]
[21472.032599] BUG: unable to handle kernel NULL pointer dereference
at   (null)
[21472.034764] IP: [<  (null)>]   (null)
[21472.036117] PGD a6243067 PUD a712a067 PMD 0
[21472.037345] Oops: 0010 [#1] SMP DEBUG_PAGEALLOC
[21472.038616] CPU: 0 PID: 15522 Comm: trinity-c8 Not tainted 3.13.0+ #1
[21472.040309] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[21472.041823] task: 88006f8f ti: 8800a101e000 task.ti:
8800a101e000
[21472.043814] RIP: 0010:[<>]  [<  (null)>]
   (null)
[21472.045823] RSP: 0018:8800a101ff30  EFLAGS: 00010046
[21472.047225] RAX: 82434ae0 RBX: 8800b926ca40 RCX: 02c0
[21472.049143] RDX: 8800bf60e460 RSI: 8800b926ca40 RDI: 8800bf7d4fc0
[21472.050900] RBP: 8800a101ff78 R08: fffe8fd25bb38016 R09: 0001
[21472.052621] R10: 88006f8f R11:  R12: 0004
[21472.054469] R13: 8800bf7d4fc0 R14: 0094 R15: 20008465485f
[21472.056303] FS:  7f904f260700() GS:8800bf60()
knlGS:
[21472.058211] CS:  0010 DS:  ES:  CR0: 8005003b
[21472.059516] CR2:  CR3: 44ec3000 CR4: 06f0
[21472.061143] DR0: 0276a000 DR1: 0276aff8 DR2: 
[21472.062762] DR3:  DR6: 0ff0 DR7: 0600
[21472.064445] Stack:
[21472.064975]  81160cdf 81160c23 0282
0001
[21472.067017]  04ae 0008 0008
7f904f233de0
[21472.069053]  0094 0094 8235ba79
0246
[21472.071089] Call Trace:
[21472.071761]  [] ? SyS_sched_rr_get_interval+0xdf/0x230
[21472.073570]  [] ? SyS_sched_rr_get_interval+0x23/0x230
[21472.075401]  [] system_call_fastpath+0x16/0x1b
[21472.076987] Code:  Bad RIP value.
[21472.077929] RIP  [<  (null)>]   (null)
[21472.079302]  RSP 
[21472.080247] CR2: 
[21472.117066] ---[ end trace cc44b07941fc4905 ]---

The second trace looks more or less identical:

[106143.588795] RDS: rds_bind() could not find a transport, load
rds_tcp or rds_rdma?
[106146.597725] trinity-c1: vm86 mode not supported on 64 bit kernel
[106146.865957] trinity-c36: vm86 mode not supported on 64 bit kernel
[106156.562726] BUG: unable to handle kernel NULL pointer dereference
at   (null)
[106156.565411] IP: [<  (null)>]   (null)
[106156.567021] PGD a61e6067 PUD a03a4067 PMD 0
[106156.568451] Oops: 0010 [#1] SMP DEBUG_PAGEALLOC
[106156.569929] CPU: 0 PID: 19875 Comm: trinity-c23 Not tainted 3.13.0+ #1
[106156.571987] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[106156.573758] task: 8800b65d8000 ti: 880009ac8000 task.ti:
880009ac8000
[106156.576051] RIP: 0010:[<>]  [<  (null)>]
(null)
[106156.578322] RSP: 0018:880009ac9f30  EFLAGS: 00010046
[106156.579920] RAX: 82434ae0 RBX: 8800b4cb2520 RCX:
02c0
[106156.582122] RDX: 8800bf60e460 RSI: 8800b4cb2520 RDI:
8800bf7d4fc0
[106156.584225] RBP: 880009ac9f78 R08: fffe8fd25bb38016 R09:
0001
[106156.586340] R10: 8800b65d8000 R11:  R12:
008c8000
[106156.588513] R13: 8800bf7d4fc0 R14: 0094 R15:
40004a1b
[106156.590684] FS:  7f75c3e23700() GS:8800bf60()
knlGS:
[106156.593171] CS:  0010 DS:  ES:  CR0: 8005003b
[106156.594922] CR2:  CR3: a69c1000 CR4:
06f0
[106156.597114] DR0: 008c8000 DR1: 00ca5000 DR2:
024dc000
[106156.599295] DR3: 026df000 DR6: 0ff0 DR7:
00030602
[106156.601449] Stack:
[106156.602085]  81160cdf 81160c23 0282
0001
[106156.604423]  0003d7dc 0017 0017
7f75c3df6de0
[106156.606758]  0094 0094 8235ba79
0246
[106156.609117] Call Trace:
[106156.609913]  [] ? 

Re: 3.13: BUG: unable to handle kernel paging request at 00000000b4343e88

2014-01-24 Thread Meelis Roos
> >> It looks like gcov exploded when running a module's constructors or
> >> init function, but I'm unable to work out which module it was :(
> > [...]
> > 
> >> Maybe it's tg3.
> >>
> >> Could you add `ignore_loglevel' to the kernel boot parameters?  That
> >> should make all pr_debug()s come out and they include the module's
> >> name.
> 
> I'm not sure if this is related, but all 3 kernel logs consistently contain
> this error message:
> 
> > [0.617401] gcov: could not create file
> 
> which should only be shown in case of severe out-of-memory situations or
> duplicate object file names.
> 
> Could you retry with the following patch applied (2 times if possible)
> and send dmesg output?

This seems to be relevant: there is now a reproducible crash during the
printk. I captured the end of the backtrace from HP iLO as an image,
attached.

-- 
Meelis Roos (mr...@linux.ee)

Re: [PATCH] ALSA: hda - Fix silent output on MacBook Air 1,1

2014-01-24 Thread Adrien Vergé
I think very few people use Linux on their MBA 1,1. Moreover, some of
them have stayed on v2.6.

2014/1/24 Paul Bolle :
> On Fri, 2014-01-24 at 14:56 -0500, Adrien Vergé wrote:
>> Similarly to other Apple products, MBA 1,1 needs a specific quirk.
>> Pin 0x18 must be set to VREF_50 to have sound output.  This was no
>> longer done since commit 1a97b7f, resulting in a mute built-in speaker.
>
> Commit 1a97b7f ("ALSA: hda/realtek - Remove the last static quirks for
> ALC882") was included in v3.4. Did no-one notice this for 9 releases?
>
>> This patch corrects the regression by creating a fixup for the MBA 1,1.
>>
>> Tested-by: Adrien Vergé 
>> Signed-off-by: Adrien Vergé 
>
>
> Paul Bolle
>


Re: [PATCH RT v2] timer: Raise softirq if there's irq_work

2014-01-24 Thread Sebastian Andrzej Siewior
On 01/24/2014 09:35 PM, Steven Rostedt wrote:
> 
> I know we discussed this on IRC, but I wanted to publicly state that
> the missing irq work callback was the RCU's rsp_wakeup() function.

Let me add that part to that commit message since I can't find it.

> 
> -- Steve

Sebastian



Re: [PATCH RT v2] timer: Raise softirq if there's irq_work

2014-01-24 Thread Steven Rostedt
On Fri, 24 Jan 2014 21:20:39 +0100
Sebastian Andrzej Siewior  wrote:

> * Steven Rostedt | 2014-01-24 15:09:33 [-0500]:
> 
> >[ Talking with Sebastian on IRC, it seems that doing the irq_work_run()
> >  from the interrupt in -rt is a bad thing. Here we simply raise the
> >  softirq if there's irq work to do. This too boots on my i7 ]
> 
> It is okay in general because most of the users should not run in bare
> interrupt context. The only exception here is the nohz_full_kick_work
> thing.
> 

I know we discussed this on IRC, but I wanted to publicly state that
the missing irq work callback was the RCU's rsp_wakeup() function.

-- Steve


Re: Odd 'unable to find transceiver' messages from USB with v3.13-3260-g03d11a0 and later

2014-01-24 Thread Josh Boyer
On Fri, Jan 24, 2014 at 2:08 PM, Felipe Balbi  wrote:
> Hi,
>
> On Fri, Jan 24, 2014 at 08:47:07AM -0500, Josh Boyer wrote:
>> We've had a report [1] of the USB layer throwing out 'unable to find
>> transceiver' messages during boot with the 3.14 merge window kernels.
>> I've seen this on my personal machine as well and included the dmesg
>> section below.  This does not happen with the 3.13 kernel.
>>
>> There are only a handful of files in git that have that error, but I
>> haven't seen anything that immediately strikes me as causing this.
>> From the dmesg output it looks like it is spit out right before a host
>> controller is registered?  USB seems to be still working OK in my
>> minimal testing, so the error message is confusing.
>>
>> Thoughts?
>
> looks like it was caused because of this commit:
>
> commit 1ae5799ef63176cc75ec10e545cb65f620a82747
> Author: Valentine Barshak 
> Date:   Wed Dec 4 01:42:22 2013 +0400
>
> usb: hcd: Initialize USB phy if needed

Yeah, I'm pretty sure my bisect agrees with you.

> usb_get_phy_device() will pr_err() when a PHY isn't found. Looks like
> that should be pr_debug() since everything still works even without a
> PHY.

Seems correct.  Who should write up the patch?

josh


Re: [PATCH RT v2] timer: Raise softirq if there's irq_work

2014-01-24 Thread Sebastian Andrzej Siewior
* Steven Rostedt | 2014-01-24 15:09:33 [-0500]:

>[ Talking with Sebastian on IRC, it seems that doing the irq_work_run()
>  from the interrupt in -rt is a bad thing. Here we simply raise the
>  softirq if there's irq work to do. This too boots on my i7 ]

It is okay in general because most of the users should not run in bare
interrupt context. The only exception here is the nohz_full_kick_work
thing.

>After trying hard to figure out why my i7 box was locking up with the
>new active_timers code, that does not run the timer softirq if there
>are no active timers, I took an extra look at the softirq handler and
>noticed that it doesn't just run timer softirqs, it also runs irq work.
>
>This was the bug that was locking up the system. It wasn't missing a
>timer, it was missing irq work. By always doing the irq work callbacks,
>the system boots fine.
>
>No need to check for defined(CONFIG_IRQ_WORK). When that's not set the
>"irq_work_needs_cpu()" is a static inline that returns false.
>
>Signed-off-by: Steven Rostedt 

Thank you Steven, this makes sense.

Sebastian
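
The remark in the quoted description about not needing a
defined(CONFIG_IRQ_WORK) check relies on the usual stub pattern: when a
feature is compiled out, a static inline returning a constant takes its place.
A user-space sketch of that pattern follows; every demo_ name and the
DEMO_IRQ_WORK macro are hypothetical stand-ins, not kernel symbols.

```c
#include <stdbool.h>

/* Define DEMO_IRQ_WORK to model a build with the feature enabled. */
#ifdef DEMO_IRQ_WORK
static bool demo_pending;
static inline bool demo_irq_work_needs_cpu(void) { return demo_pending; }
#else
/* Stub: constant false, so callers need no #ifdef of their own. */
static inline bool demo_irq_work_needs_cpu(void) { return false; }
#endif

/* Decide whether the timer softirq has to be raised on this tick. */
bool demo_should_raise_softirq(bool have_active_timers)
{
	return have_active_timers || demo_irq_work_needs_cpu();
}
```

With the feature compiled out, the second operand folds to false and the
decision depends on the timers alone, which is why the caller can stay free of
preprocessor conditionals.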


Re: [PATCH v2 1/2] Create eeprom_dev hardware class for EEPROM devices

2014-01-24 Thread Laszlo Papp
On Fri, Jan 24, 2014 at 7:27 PM, Curt Brune  wrote:
> On Fri Jan 24 18:42, Laszlo Papp wrote:
>> > Note: The class cannot be called 'eeprom' as that is the name of the
>> > I/O file created by the driver.  The class name appears as a
>> > sub-directory within the main device directory.  Hence the class name
>> > 'eeprom_dev'.
>>
>> I am not sure I follow the reasoning here, but it is possibly because
>> I lack some knowledge. Could you please describe what bad thing would
>> happen if "/sys/class/eeprom/eeprom0/label" were used as opposed
>> to "/sys/class/eeprom_dev/eeprom0/label"?
>
> By way of example -- let's say I have an at24 device on i2c bus 2,
> with address 0x54.  In sysfs the device can be found by its bus
> address as:
>
>   $ cd /sys/bus/i2c/devices/2-0054
>   $ ls -l
>
>   total 0
>   lrwxrwxrwx 1 root root    0 Jan 24 19:11 driver -> ../../../../../../bus/i2c/drivers/at24
>   -rw------- 1 root root  256 Jan 23 23:33 eeprom
>   drwxr-xr-x 3 root root    0 Jan 23 23:33 eeprom_dev
>   -r--r--r-- 1 root root 4096 Jan 24 19:11 modalias
>   -r--r--r-- 1 root root 4096 Jan 24 19:11 name
>   lrwxrwxrwx 1 root root    0 Jan 24 19:11 subsystem -> ../../../../../../bus/i2c
>   -rw-r--r-- 1 root root 4096 Jan 24 19:11 uevent
>
> The file "/sys/bus/i2c/devices/2-0054/eeprom" comes from the at24
> driver.  That is the file name the EEPROM driver exports for I/O to
> the device.  User space applications read/write this file to
> read/write the physical EEPROM via the at24 driver.
>
> The directory "/sys/bus/i2c/devices/2-0054/eeprom_dev" comes from the
> sysfs class name "eeprom_dev".  All sysfs class names appear as
> directories within the corresponding device directory.
>
> See the conflict?  If the class was also called "eeprom" it would
> clash with the existing "eeprom" file.  There cannot be two things
> named /sys/bus/i2c/devices/2-0054/eeprom.

Yes, this is more comprehensible for a newcomer, thanks.

> The files under /sys/class/eeprom_dev are symlinks to the "eeprom_dev"
> directories of the physical devices.  For this example:
>
>   $ cd /sys/class/eeprom_dev
>   $ ls -l eeprom0
>   lrwxrwxrwx 1 root root 0 Jan 23 23:33 eeprom0 -> ../../devices/soc.0/ffe03000.i2c/i2c-0/i2c-2/2-0054/eeprom_dev/eeprom0
>
> Believe me I wanted to use "eeprom" as the class name originally, as
> it makes a lot of sense.  But the sysfs file creation failed due to
> the duplicate name.
>
> I was not about to change the at24 driver as user space expects the
> "eeprom" name.
>
> Hence the class name is eeprom_dev.
>
> Hope that helps.

Yes, it does and I most certainly believe you.

I am not the maintainer of this code, nor do I have any knowledge
about the API promise in the kernel, but this case seems to be a major
upgrade to the Linux eeprom stack, and hence I would not personally
worry about compatibility.

If the API is kept, the Linux kernel will have an IMHO broken stack
for many upcoming years. IMO, the benefits of the different name do
not outweigh the disadvantages, but I will leave it with the
corresponding maintainer...


Re: [PATCH 0/3] memblock, x86: fix big numa system booting

2014-01-24 Thread Andrew Morton
On Fri, 24 Jan 2014 14:33:19 -0500 Santosh Shilimkar wrote:

> Yinghai,
> 
> On Friday 24 January 2014 02:11 PM, Yinghai Lu wrote:
> > Booting big NUMA systems broke when the API was switched from bootmem to
> > memblock_virt.
> > 
> > Revert the offending patch, and also address swiotlb regression.
> > 
> Thanks a lot for fixes and help to narrow down and fix these
> regressions. For all the patches in the series,
> 
> Acked-by: Santosh Shilimkar 

That patchset continues to boot happily on my swiotlb test box.


Re: Fix ccp_run_passthru_cmd dma variable assignments

2014-01-24 Thread Tom Lendacky
On 01/24/2014 12:39 PM, Dave Jones wrote:
> There are some suspicious looking lines of code in the new ccp driver,
> including one that assigns a variable to itself, and another that overwrites
> a previous assignment.
>
> This may have been a cut-and-paste error where 'src' was forgotten to be
> changed to 'dst'.
> I have no hardware to test this, so this is untested.

Yes, this was a cut-and-paste error that was not discovered with my tests. I've
updated my testcases and tested/verified this fix.

Herbert, this should probably go through the cryptodev-2.6 tree right?

Acked-by: Tom Lendacky 

Thanks,
Tom

> 
> Signed-off-by: Dave Jones 
> 
> diff --git a/drivers/crypto/ccp/ccp-ops.c b/drivers/crypto/ccp/ccp-ops.c
> index 71ed3ade7e12..c266a7b154bb 100644
> --- a/drivers/crypto/ccp/ccp-ops.c
> +++ b/drivers/crypto/ccp/ccp-ops.c
> @@ -1666,8 +1666,8 @@ static int ccp_run_passthru_cmd(struct ccp_cmd_queue *cmd_q,
>   
>   op.dst.type = CCP_MEMTYPE_SYSTEM;
>   op.dst.u.dma.address = sg_dma_address(dst.sg_wa.sg);
> - op.src.u.dma.offset = dst.sg_wa.sg_used;
> - op.src.u.dma.length = op.src.u.dma.length;
> + op.dst.u.dma.offset = dst.sg_wa.sg_used;
> + op.dst.u.dma.length = op.src.u.dma.length;
>   
>   ret = ccp_perform_passthru(&op);
>   if (ret) {
> 



Re: [PATCH] ALSA: hda - Fix silent output on MacBook Air 1,1

2014-01-24 Thread Paul Bolle
On Fri, 2014-01-24 at 14:56 -0500, Adrien Vergé wrote:
> Similarly to other Apple products, MBA 1,1 needs a specific quirk.
> Pin 0x18 must be set to VREF_50 to have sound output.  This was no
> longer done since commit 1a97b7f, resulting in a mute built-in speaker.

Commit 1a97b7f ("ALSA: hda/realtek - Remove the last static quirks for
ALC882") was included in v3.4. Did no-one notice this for 9 releases?

> This patch corrects the regression by creating a fixup for the MBA 1,1.
> 
> Tested-by: Adrien Vergé 
> Signed-off-by: Adrien Vergé 


Paul Bolle



[PATCH] x86: allocate cpumask during check irq vectors

2014-01-24 Thread Yinghai Lu
Fix warning:
arch/x86/kernel/irq.c: In function check_irq_vectors_for_cpu_disable:
arch/x86/kernel/irq.c:337:1: warning: the frame size of 2052 bytes is larger than 2048 bytes

when NR_CPUS=8192

We should use zalloc_cpumask_var() instead.

Signed-off-by: Yinghai Lu 
Cc: Prarit Bhargava 

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index dbb6087..b114ee4 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -277,11 +277,14 @@ int check_irq_vectors_for_cpu_disable(void)
unsigned int this_cpu, vector, this_count, count;
struct irq_desc *desc;
struct irq_data *data;
-   struct cpumask affinity_new, online_new;
+   cpumask_var_t affinity_new, online_new;
+
+   zalloc_cpumask_var(&affinity_new, GFP_KERNEL);
+   zalloc_cpumask_var(&online_new, GFP_KERNEL);
 
this_cpu = smp_processor_id();
-   cpumask_copy(&online_new, cpu_online_mask);
-   cpu_clear(this_cpu, online_new);
+   cpumask_copy(online_new, cpu_online_mask);
+   cpumask_clear_cpu(this_cpu, online_new);
 
this_count = 0;
for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
@@ -289,8 +292,8 @@ int check_irq_vectors_for_cpu_disable(void)
if (irq >= 0) {
desc = irq_to_desc(irq);
data = irq_desc_get_irq_data(desc);
-   cpumask_copy(&affinity_new, data->affinity);
-   cpu_clear(this_cpu, affinity_new);
+   cpumask_copy(affinity_new, data->affinity);
+   cpumask_clear_cpu(this_cpu, affinity_new);
 
/* Do not count inactive or per-cpu irqs. */
if (!irq_has_action(irq) || irqd_is_per_cpu(data))
@@ -311,8 +314,8 @@ int check_irq_vectors_for_cpu_disable(void)
 * mask is not zero; that is the down'd cpu is the
 * last online cpu in a user set affinity mask.
 */
-   if (cpumask_empty(&affinity_new) ||
-   !cpumask_subset(&affinity_new, &online_new))
+   if (cpumask_empty(affinity_new) ||
+   !cpumask_subset(affinity_new, online_new))
this_count++;
}
}


[PATCH RT v2] timer: Raise softirq if there's irq_work

2014-01-24 Thread Steven Rostedt
[ Talking with Sebastian on IRC, it seems that doing the irq_work_run()
  from the interrupt in -rt is a bad thing. Here we simply raise the
  softirq if there's irq work to do. This too boots on my i7 ]

After trying hard to figure out why my i7 box was locking up with the
new active_timers code, that does not run the timer softirq if there
are no active timers, I took an extra look at the softirq handler and
noticed that it doesn't just run timer softirqs, it also runs irq work.

This was the bug that was locking up the system. It wasn't missing a
timer, it was missing irq work. By always doing the irq work callbacks,
the system boots fine.

No need to check for defined(CONFIG_IRQ_WORK). When that's not set the
"irq_work_needs_cpu()" is a static inline that returns false.

Signed-off-by: Steven Rostedt 

diff --git a/kernel/timer.c b/kernel/timer.c
index 46467be..c01a0d2 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1464,8 +1464,13 @@ void run_local_timers(void)
raise_softirq(TIMER_SOFTIRQ);
return;
}
-   if (!base->active_timers)
-   goto out;
+   if (!base->active_timers) {
+#ifdef CONFIG_PREEMPT_RT_FULL
+   /* On RT, irq work runs from softirq */
+   if (!irq_work_needs_cpu())
+#endif
+   goto out;
+   }
 
/* Check whether the next pending timer has expired */
if (time_before_eq(base->next_timer, jiffies))


[PATCH 2/4] ipmi: use USEC_PER_SEC instead of 1000000 for more meaningful

2014-01-24 Thread minyard
From: Xie XiuQi 

Use USEC_PER_SEC instead of 1000000, which makes the later bugfix
clearer.

Signed-off-by: Xie XiuQi 
Signed-off-by: Corey Minyard 
---
 drivers/char/ipmi/ipmi_bt_sm.c   | 8 ++++----
 drivers/char/ipmi/ipmi_kcs_sm.c  | 4 ++--
 drivers/char/ipmi/ipmi_smic_sm.c | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_bt_sm.c b/drivers/char/ipmi/ipmi_bt_sm.c
index a22a7a5..f5e4cd7 100644
--- a/drivers/char/ipmi/ipmi_bt_sm.c
+++ b/drivers/char/ipmi/ipmi_bt_sm.c
@@ -201,7 +201,7 @@ static unsigned int bt_init_data(struct si_sm_data *bt, struct si_sm_io *io)
}
bt->state = BT_STATE_IDLE;  /* start here */
bt->complete = BT_STATE_IDLE;   /* end here */
-   bt->BT_CAP_req2rsp = BT_NORMAL_TIMEOUT * 1000000;
+   bt->BT_CAP_req2rsp = BT_NORMAL_TIMEOUT * USEC_PER_SEC;
bt->BT_CAP_retries = BT_NORMAL_RETRY_LIMIT;
/* BT_CAP_outreqs == zero is a flag to read BT Capabilities */
return 3; /* We claim 3 bytes of space; ought to check SPMI table */
@@ -613,7 +613,7 @@ static enum si_sm_result bt_event(struct si_sm_data *bt, long time)
HOST2BMC(42);   /* Sequence number */
HOST2BMC(3);/* Cmd == Soft reset */
BT_CONTROL(BT_H2B_ATN);
-   bt->timeout = BT_RESET_DELAY * 1000000;
+   bt->timeout = BT_RESET_DELAY * USEC_PER_SEC;
BT_STATE_CHANGE(BT_STATE_RESET3,
SI_SM_CALL_WITH_DELAY);
 
@@ -651,14 +651,14 @@ static enum si_sm_result bt_event(struct si_sm_data *bt, long time)
bt_init_data(bt, bt->io);
if ((i == 8) && !BT_CAP[2]) {
bt->BT_CAP_outreqs = BT_CAP[3];
-   bt->BT_CAP_req2rsp = BT_CAP[6] * 1000000;
+   bt->BT_CAP_req2rsp = BT_CAP[6] * USEC_PER_SEC;
bt->BT_CAP_retries = BT_CAP[7];
} else
printk(KERN_WARNING "IPMI BT: using default values\n");
if (!bt->BT_CAP_outreqs)
bt->BT_CAP_outreqs = 1;
printk(KERN_WARNING "IPMI BT: req2rsp=%ld secs retries=%d\n",
-   bt->BT_CAP_req2rsp / 1000000L, bt->BT_CAP_retries);
+   bt->BT_CAP_req2rsp / USEC_PER_SEC, bt->BT_CAP_retries);
bt->timeout = bt->BT_CAP_req2rsp;
return SI_SM_CALL_WITHOUT_DELAY;
 
diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
index e53fc24..6a4bdc1 100644
--- a/drivers/char/ipmi/ipmi_kcs_sm.c
+++ b/drivers/char/ipmi/ipmi_kcs_sm.c
@@ -118,8 +118,8 @@ enum kcs_states {
 #define MAX_KCS_WRITE_SIZE IPMI_MAX_MSG_LENGTH
 
 /* Timeouts in microseconds. */
-#define IBF_RETRY_TIMEOUT 5000000
-#define OBF_RETRY_TIMEOUT 5000000
+#define IBF_RETRY_TIMEOUT (5*USEC_PER_SEC)
+#define OBF_RETRY_TIMEOUT (5*USEC_PER_SEC)
 #define MAX_ERROR_RETRIES 10
 #define ERROR0_OBF_WAIT_JIFFIES (2*HZ)
 
diff --git a/drivers/char/ipmi/ipmi_smic_sm.c b/drivers/char/ipmi/ipmi_smic_sm.c
index faed929..c8e77af 100644
--- a/drivers/char/ipmi/ipmi_smic_sm.c
+++ b/drivers/char/ipmi/ipmi_smic_sm.c
@@ -80,7 +80,7 @@ enum smic_states {
 #define SMIC_MAX_ERROR_RETRIES 3
 
 /* Timeouts in microseconds. */
-#define SMIC_RETRY_TIMEOUT 2000000
+#define SMIC_RETRY_TIMEOUT (2*USEC_PER_SEC)
 
 /* SMIC Flags Register Bits */
 #define SMIC_RX_DATA_READY 0x80
-- 
1.8.3.1



[PATCH 3/4] ipmi: fix timeout calculation when bmc is disconnected

2014-01-24 Thread minyard
From: Xie XiuQi 

When loading the ipmi_si module while the BMC is disconnected, we found that
the timeout is much longer than 5 secs. It actually takes about 3 mins and
20 secs (HZ=250).

The error messages are as follows:
Dec 12 19:08:59 linux kernel: IPMI BT: timeout in RD_WAIT [ ] 1 retries left
Dec 12 19:08:59 linux kernel: BT: write 4 bytes seq=0x01 03 18 00 01
[...]
Dec 12 19:12:19 linux kernel: IPMI BT: timeout in RD_WAIT [ ]
Dec 12 19:12:19 linux kernel: failed 2 retries, sending error response
Dec 12 19:12:19 linux kernel: IPMI: BT reset (takes 5 secs)
Dec 12 19:12:19 linux kernel: IPMI BT: flag reset [ ]

Function wait_for_msg_done() uses schedule_timeout_uninterruptible(1) to
sleep 1 tick, so we should subtract jiffies_to_usecs(1) instead of 100
usecs from the timeout.

Reported-by: Hu Shiyuan 
Signed-off-by: Xie XiuQi 
Signed-off-by: Corey Minyard 
---
 drivers/char/ipmi/ipmi_si_intf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 68c5ef5..a5e048f 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2773,7 +2773,7 @@ static int wait_for_msg_done(struct smi_info *smi_info)
smi_result == SI_SM_CALL_WITH_TICK_DELAY) {
schedule_timeout_uninterruptible(1);
smi_result = smi_info->handlers->event(
-   smi_info->si_sm, 100);
+   smi_info->si_sm, jiffies_to_usecs(1));
} else if (smi_result == SI_SM_CALL_WITHOUT_DELAY) {
smi_result = smi_info->handlers->event(
smi_info->si_sm, 0);
-- 
1.8.3.1



[PATCH 4/4] ipmi: Cleanup error return

2014-01-24 Thread minyard
From: Corey Minyard 

Return proper errors for a lot of IPMI failure cases.  Also call
pci_disable_device when IPMI PCI devices are removed.

Signed-off-by: Corey Minyard 
---
 drivers/char/ipmi/ipmi_si_intf.c | 44 +++++++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index a5e048f..671c385 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -1849,11 +1849,15 @@ static int hotmod_handler(const char *val, struct kernel_param *kp)
info->irq_setup = std_irq_setup;
info->slave_addr = ipmb;
 
-   if (!add_smi(info)) {
-   if (try_smi_init(info))
-   cleanup_one_si(info);
-   } else {
+   rv = add_smi(info);
+   if (rv) {
kfree(info);
+   goto out;
+   }
+   rv = try_smi_init(info);
+   if (rv) {
+   cleanup_one_si(info);
+   goto out;
}
} else {
/* remove */
@@ -2067,6 +2071,7 @@ struct SPMITable {
 static int try_init_spmi(struct SPMITable *spmi)
 {
struct smi_info  *info;
+   int rv;
 
if (spmi->IPMIlegacy != 1) {
printk(KERN_INFO PFX "Bad SPMI legacy %d\n", spmi->IPMIlegacy);
@@ -2141,10 +2146,11 @@ static int try_init_spmi(struct SPMITable *spmi)
 info->io.addr_data, info->io.regsize, info->io.regspacing,
 info->irq);
 
-   if (add_smi(info))
+   rv = add_smi(info);
+   if (rv)
kfree(info);
 
-   return 0;
+   return rv;
 }
 
 static void spmi_find_bmc(void)
@@ -2178,6 +2184,7 @@ static int ipmi_pnp_probe(struct pnp_dev *dev,
acpi_handle handle;
acpi_status status;
unsigned long long tmp;
+   int rv;
 
acpi_dev = pnp_acpi_device(dev);
if (!acpi_dev)
@@ -2259,10 +2266,11 @@ static int ipmi_pnp_probe(struct pnp_dev *dev,
 res, info->io.regsize, info->io.regspacing,
 info->irq);
 
-   if (add_smi(info))
-   goto err_free;
+   rv = add_smi(info);
+   if (rv)
+   kfree(info);
 
-   return 0;
+   return rv;
 
 err_free:
kfree(info);
@@ -2566,16 +2574,20 @@ static int ipmi_pci_probe(struct pci_dev *pdev,
&pdev->resource[0], info->io.regsize, info->io.regspacing,
info->irq);
 
-   if (add_smi(info))
+   rv = add_smi(info);
+   if (rv) {
kfree(info);
+   pci_disable_device(pdev);
+   }
 
-   return 0;
+   return rv;
 }
 
 static void ipmi_pci_remove(struct pci_dev *pdev)
 {
struct smi_info *info = pci_get_drvdata(pdev);
cleanup_one_si(info);
+   pci_disable_device(pdev);
 }
 
 static struct pci_device_id ipmi_pci_devices[] = {
@@ -2670,9 +2682,10 @@ static int ipmi_probe(struct platform_device *dev)
 
dev_set_drvdata(&dev->dev, info);
 
-   if (add_smi(info)) {
+   ret = add_smi(info);
+   if (ret) {
kfree(info);
-   return -EBUSY;
+   return ret;
}
 #endif
return 0;
@@ -2736,9 +2749,10 @@ static int ipmi_parisc_probe(struct parisc_device *dev)
 
dev_set_drvdata(&dev->dev, info);
 
-   if (add_smi(info)) {
+   rv = add_smi(info);
+   if (rv) {
kfree(info);
-   return -EBUSY;
+   return rv;
}
 
return 0;
-- 
1.8.3.1



Re: [PATCH 3/3] dynamic_debug: replace obsolete simple_strtoul() with kstrtouint()

2014-01-24 Thread Jason Baron
Hi,

I think we want some sort of commit message for this patch. But they
all look good to me and they tested fine.

Acked-by: Jason Baron 

Greg, Can you pick up this series?

Thanks,

-Jason

On 01/23/2014 08:20 AM, Andrey Ryabinin wrote:
> Signed-off-by: Andrey Ryabinin 
> ---
>  lib/dynamic_debug.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
> index e488d9a..7288e38 100644
> --- a/lib/dynamic_debug.c
> +++ b/lib/dynamic_debug.c
> @@ -268,14 +268,12 @@ static int ddebug_tokenize(char *buf, char *words[], int maxwords)
>   */
>  static inline int parse_lineno(const char *str, unsigned int *val)
>  {
> - char *end = NULL;
>   BUG_ON(str == NULL);
>   if (*str == '\0') {
>   *val = 0;
>   return 0;
>   }
> - *val = simple_strtoul(str, &end, 10);
> - if (end == NULL || end == str || *end != '\0') {
> + if (kstrtouint(str, 10, val) < 0) {
>   pr_err("bad line-number: %s\n", str);
>   return -EINVAL;
>   }



ipmi: Some minor fixes

2014-01-24 Thread minyard
Just some collected fixes for 3.14.  Nothing huge.



[PATCH 1/4] ipmi: remove deprecated IRQF_DISABLED

2014-01-24 Thread minyard
From: Michael Opdenacker 

This patch proposes to remove the use of the IRQF_DISABLED flag.

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker 
Signed-off-by: Corey Minyard 
---
 drivers/char/ipmi/ipmi_si_intf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 15e4a60..68c5ef5 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -1358,7 +1358,7 @@ static int std_irq_setup(struct smi_info *info)
if (info->si_type == SI_BT) {
rv = request_irq(info->irq,
 si_bt_irq_handler,
-IRQF_SHARED | IRQF_DISABLED,
+IRQF_SHARED,
 DEVICE_NAME,
 info);
if (!rv)
@@ -1368,7 +1368,7 @@ static int std_irq_setup(struct smi_info *info)
} else
rv = request_irq(info->irq,
 si_irq_handler,
-IRQF_SHARED | IRQF_DISABLED,
+IRQF_SHARED,
 DEVICE_NAME,
 info);
if (rv) {
-- 
1.8.3.1



[PATCH] ALSA: hda - Fix silent output on MacBook Air 1,1

2014-01-24 Thread Adrien Vergé
Similarly to other Apple products, MBA 1,1 needs a specific quirk.
Pin 0x18 must be set to VREF_50 to have sound output.  This was no
longer done since commit 1a97b7f, resulting in a mute built-in speaker.

This patch corrects the regression by creating a fixup for the MBA 1,1.

Tested-by: Adrien Vergé 
Signed-off-by: Adrien Vergé 
---
 sound/pci/hda/patch_realtek.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index c564694..723de28 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -1780,6 +1780,7 @@ enum {
ALC889_FIXUP_DAC_ROUTE,
ALC889_FIXUP_MBP_VREF,
ALC889_FIXUP_IMAC91_VREF,
+   ALC889_FIXUP_MBA11_VREF,
ALC889_FIXUP_MBA21_VREF,
ALC882_FIXUP_INV_DMIC,
ALC882_FIXUP_NO_PRIMARY_HP,
@@ -1910,6 +1911,16 @@ static void alc889_fixup_imac91_vref(struct hda_codec *codec,
alc889_fixup_mac_pins(codec, nids, ARRAY_SIZE(nids));
 }
 
+/* Set VREF on speaker pins on mba11 */
+static void alc889_fixup_mba11_vref(struct hda_codec *codec,
+   const struct hda_fixup *fix, int action)
+{
+   static hda_nid_t nids[1] = { 0x18 };
+
+   if (action == HDA_FIXUP_ACT_INIT)
+   alc889_fixup_mac_pins(codec, nids, ARRAY_SIZE(nids));
+}
+
 /* Set VREF on speaker pins on mba21 */
 static void alc889_fixup_mba21_vref(struct hda_codec *codec,
const struct hda_fixup *fix, int action)
@@ -2119,6 +2130,12 @@ static const struct hda_fixup alc882_fixups[] = {
.chained = true,
.chain_id = ALC882_FIXUP_GPIO1,
},
+   [ALC889_FIXUP_MBA11_VREF] = {
+   .type = HDA_FIXUP_FUNC,
+   .v.func = alc889_fixup_mba11_vref,
+   .chained = true,
+   .chain_id = ALC889_FIXUP_MBP_VREF,
+   },
[ALC889_FIXUP_MBA21_VREF] = {
.type = HDA_FIXUP_FUNC,
.v.func = alc889_fixup_mba21_vref,
@@ -2194,7 +2211,7 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = {
SND_PCI_QUIRK(0x106b, 0x2c00, "MacbookPro rev3", ALC889_FIXUP_MBP_VREF),
SND_PCI_QUIRK(0x106b, 0x3000, "iMac", ALC889_FIXUP_MBP_VREF),
SND_PCI_QUIRK(0x106b, 0x3200, "iMac 7,1 Aluminum", ALC882_FIXUP_EAPD),
-   SND_PCI_QUIRK(0x106b, 0x3400, "MacBookAir 1,1", ALC889_FIXUP_MBP_VREF),
+   SND_PCI_QUIRK(0x106b, 0x3400, "MacBookAir 1,1", ALC889_FIXUP_MBA11_VREF),
SND_PCI_QUIRK(0x106b, 0x3500, "MacBookAir 2,1", ALC889_FIXUP_MBA21_VREF),
SND_PCI_QUIRK(0x106b, 0x3600, "Macbook 3,1", ALC889_FIXUP_MBP_VREF),
SND_PCI_QUIRK(0x106b, 0x3800, "MacbookPro 4,1", ALC889_FIXUP_MBP_VREF),
-- 
1.8.5.2


