date:20131211

[PATCHv7 01/12] of: introduce of_property_for_each_phandle_with_args()

2013-12-11 Thread Hiroshi Doyu

Iterating over a property containing a list of phandles with arguments
is a common operation for device drivers. This patch adds a new
of_property_for_each_phandle_with_args() macro to make the iteration
simpler.
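
The iteration described above can be sketched in plain user-space C. This is
only an illustration under stated assumptions, not the kernel code: the names
phandle_args, cell_to_cpu(), phandle_iter_next() and count_entries() are
hypothetical, the property is modelled as a byte array of big-endian 32-bit
cells, and the argument count is fixed rather than read from a "#...-cells"
property. Each call consumes one entry and returns the position of the next
one, so a full walk stays O(n).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ARGS 4

/* Hypothetical stand-in for struct of_phandle_args. */
struct phandle_args {
	uint32_t phandle;
	int args_count;
	uint32_t args[MAX_ARGS];
};

/* Decode one big-endian 32-bit cell, similar in spirit to be32_to_cpup(). */
static uint32_t cell_to_cpu(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Read one <phandle arg...> entry starting at cur and return the position
 * of the next entry, or NULL when the list is exhausted or malformed.
 */
static const uint8_t *phandle_iter_next(int cell_count, const uint8_t *cur,
					const uint8_t *end,
					struct phandle_args *out)
{
	int i;

	assert(out != NULL);
	if (!cur || cell_count < 0 || cell_count > MAX_ARGS)
		return NULL;
	/* Need one phandle cell plus cell_count argument cells. */
	if ((size_t)(end - cur) < (size_t)(cell_count + 1) * 4)
		return NULL;

	out->phandle = cell_to_cpu(cur);
	cur += 4;
	out->args_count = cell_count;
	for (i = 0; i < cell_count; i++, cur += 4)
		out->args[i] = cell_to_cpu(cur);
	return cur;
}

/* Walk the whole list once and count the complete entries. */
static int count_entries(const uint8_t *prop, size_t len, int cell_count)
{
	struct phandle_args args;
	const uint8_t *cur = prop, *end = prop + len;
	int n = 0;

	while ((cur = phandle_iter_next(cell_count, cur, end, &args)) != NULL)
		n++;
	return n;
}
```

A for-each macro could wrap the same pair of calls so that a driver only
writes the loop body; the sketch keeps them as functions to stay readable.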

Signed-off-by: Hiroshi Doyu 
Cc: Rob Herring 
Cc: Grant Likely 
---
v7:
Fixed some minor issues pointed out by Rob and Stephen.

v6:
Iterate without introducing a new struct.

v6+++:
Introduced a new struct "of_phandle_iter" to keep the state when
iterating over the list.

v6++:
Optimized to avoid O(n^2), suggested by Stephen Warren.
http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007066.html

I didn't introduce any struct to hold params and state here.

v6+:
Use the description, which Grant Likely proposed, to be full enough
that a future reader can figure out why a patch was written.

v5:
New patch for v5.

Signed-off-by: Hiroshi Doyu 
---
 drivers/of/base.c  | 46 ++
 include/linux/of.h | 32 
 2 files changed, 78 insertions(+)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index f807d0e..cd4ab05 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1201,6 +1201,52 @@ void of_print_phandle_args(const char *msg, const struct 
of_phandle_args *args)
printk("\n");
 }
 
+const __be32 *of_phandle_iter_next(const char *cells_name, int cell_count,
+  const __be32 *cur, const __be32 *end,
+  struct of_phandle_args *out_args)
+{
+   struct device_node *dn;
+   int i;
+
+   if (!cells_name && !cell_count)
+   return NULL;
+
+   if (!cur || (cur >= end))
+   return NULL;
+
+   dn = of_find_node_by_phandle(be32_to_cpup(cur++));
+   if (!dn)
+   return NULL;
+
+   if (cells_name)
+   if (of_property_read_u32(dn, cells_name, &cell_count))
+   return NULL;
+
+   out_args->np = dn;
+   out_args->args_count = cell_count;
+   for (i = 0; i < cell_count; i++)
+   out_args->args[i] = be32_to_cpup(cur++);
+
+   return cur;
+}
+EXPORT_SYMBOL_GPL(of_phandle_iter_next);
+
+const __be32 *of_phandle_iter_init(const struct device_node *np,
+  const char *list_name,
+  const __be32 **end)
+{
+   size_t bytes;
+   const __be32 *cur;
+
+   cur = of_get_property(np, list_name, &bytes);
+   *end = cur;
+   if (bytes)
+   *end += bytes / sizeof(*cur);
+
+   return cur;
+}
+EXPORT_SYMBOL_GPL(of_phandle_iter_init);
+
 static int __of_parse_phandle_with_args(const struct device_node *np,
const char *list_name,
const char *cells_name,
diff --git a/include/linux/of.h b/include/linux/of.h
index 276c546..4345582 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -303,6 +303,14 @@ extern int of_parse_phandle_with_fixed_args(const struct 
device_node *np,
 extern int of_count_phandle_with_args(const struct device_node *np,
const char *list_name, const char *cells_name);
 
+extern const __be32 *of_phandle_iter_init(const struct device_node *np,
+ const char *list_name,
+ const __be32 **end);
+extern const __be32 *of_phandle_iter_next(const char *cells_name,
+ int cell_count,
+ const __be32 *cur, const __be32 *end,
+ struct of_phandle_args *out_args);
+
 extern void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align));
 extern int of_alias_get_id(struct device_node *np, const char *stem);
 
@@ -527,6 +535,22 @@ static inline int of_count_phandle_with_args(struct 
device_node *np,
return -ENOSYS;
 }
 
+static inline const __be32 *of_phandle_iter_init(const struct device_node *np,
+const char *list_name,
+const __be32 **end)
+{
+   return NULL;
+}
+
+static inline const __be32 *of_phandle_iter_next(const char *cells_name,
+int cell_count,
+const __be32 *cur,
+const __be32 *end,
+struct of_phandle_args *out_args)
+{
+   return NULL;
+}
+
 static inline int of_alias_get_id(struct device_node *np, const char *stem)
 {
return -ENOSYS;
@@ -613,6 +637,14 @@ static inline int of_property_read_u32(const struct 
device_node *np,
s;  \
s = of_prop_next_string(prop, s))
 
+#define of_property_for_each_phandle_with_args(node, list_name, cells_name, \
+

[PATCHv7 00/12] Unifying SMMU driver among Tegra SoCs

2013-12-11 Thread Hiroshi Doyu

Hi,

This series provides:

(0) IOMMU standard DT binding ("iommus")
(1) Unified IOMMU (SMMU) driver among Tegra SoCs
(2) Multiple Address Space support (MASID) in IOMMU (SMMU)
(3) Tegra IOMMU'able devices; most platform devices are IOMMU'able.

There's been some discussion[1] about device population order. Some
devices need to be populated earlier than other devices regardless of
their bus topology. As a solution, I implemented an IOMMU hook in the
driver core:

  [PATCHv7 04/13] driver/core: populate devices in order for IOMMUs

which is based on:
  http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006933.html

The main problem here is:

IOMMU devices on the bus need to be populated first; IOMMU master
devices are populated later.

With CONFIG_OF_IOMMU, the "iommus=" DT binding is used to identify
whether a device can be an IOMMU master or not. If it can, we defer
populating that device until an IOMMU device is populated. Then those
deferred IOMMU master devices are populated and configured with the
help of the already populated IOMMU device via a new IOMMU API,
iommu_ops->driver_bound().
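
The ordering rule above can be sketched in user-space C. This is an
illustration only, not the driver core hook: struct dev, its iommu index
field and populate_in_order() are hypothetical names, and the sketch assumes
every non-negative iommu index is a valid index into the same array. A
device whose IOMMU is not populated yet is skipped and retried on a later
pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device record; iommu is the index of its IOMMU, or -1. */
struct dev {
	const char *name;
	int iommu;
	bool populated;
};

/*
 * Populate devices in dependency order: a master whose IOMMU has not been
 * populated yet is deferred and retried on a later pass. order[] receives
 * device indices in population order; the return value is how many devices
 * were populated.
 */
static size_t populate_in_order(struct dev *devs, size_t n, size_t *order)
{
	size_t done = 0;
	bool progress = true;

	assert(devs != NULL && order != NULL);
	while (done < n && progress) {
		progress = false;
		for (size_t i = 0; i < n; i++) {
			struct dev *d = &devs[i];

			if (d->populated)
				continue;
			/* Defer masters until their IOMMU is populated. */
			if (d->iommu >= 0 && !devs[d->iommu].populated)
				continue;
			d->populated = true;
			order[done++] = i;
			progress = true;
		}
	}
	return done;
}
```

With an SD/MMC controller listed before the SMMU it depends on, the SMMU
still ends up first in order[]. A device that can never be satisfied is
left unpopulated instead of looping forever.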

This "iommus=" binding is expected to be used as the global/standard binding.

Tested IOMMU functionality with T30 SD/MMC. Any further testing with
T114 and/or other devices would be really appreciated.

v6:
Minor fixes.
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/213082.html

v5:
Use "iommus=" DT bindings as a standard IOMMU binding.
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/212331.html

v4:
Add a hook in driver core to control device population order.
Introduced arm,smmu "mmu-master" binding instead of Tegra's own.
Removed DT patches from this series.
  http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006931.html

v3:
Updated based on Stephen Warren's feedback
  http://lists.linuxfoundation.org/pipermail/iommu/2013-October/006724.html

v2:
Updated based on Thierry Reding's and Stephen Warren's feedback
  http://lists.infradead.org/pipermail/linux-arm-kernel/2013-July/181888.html

v1:
  http://lists.infradead.org/pipermail/linux-arm-kernel/2013-June/180267.html

Available in the git repository at:

  git://g...@nv-tegra.nvidia.com/user/hdoyu/linux.git smmu-upstreaming@20131212

Hiroshi Doyu (12):
  of: introduce of_property_for_each_phandle_with_args()
  iommu/of: introduce a global iommu device list
  iommu/of: check if dependee iommu is ready or not
  driver/core: populate devices in order for IOMMUs
  iommu/core: add ops->{bound,unbind}_driver()
  ARM: tegra: create a DT header defining SWGROUP ID
  iommu/tegra: smmu: register device to iommu dynamically
  iommu/tegra: smmu: calculate ASID register offset by ID
  iommu/tegra: smmu: get swgroups from DT "iommus="
  iommu/tegra: smmu: allow duplicate ASID wirte
  iommu/tegra: smmu: Rename hwgrp -> swgroups
  iommu/tegra: smmu: add SMMU to an global iommu list

 .../bindings/iommu/nvidia,tegra30-smmu.txt |  30 +-
 drivers/base/dd.c  |   5 +
 drivers/iommu/Kconfig  |   1 +
 drivers/iommu/iommu.c  |  13 +-
 drivers/iommu/of_iommu.c   |  51 +++
 drivers/iommu/tegra-smmu.c | 383 +
 drivers/of/base.c  |  46 +++
 include/dt-bindings/memory/tegra-swgroup.h |  50 +++
 include/linux/iommu.h  |   4 +
 include/linux/of.h |  32 ++
 include/linux/of_iommu.h   |  22 ++
 11 files changed, 487 insertions(+), 150 deletions(-)
 create mode 100644 include/dt-bindings/memory/tegra-swgroup.h

-- 
1.8.1.5

[1]
  "[RFC] early init and DT platform devices allocation/registration"

https://lists.ozlabs.org/pipermail/devicetree-discuss/2013-June/thread.html#36542
  "Report from 2013 ARM kernel summit"

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/210426.html
  "[RFC PATCH] Documentation: devicetree: add description for generic bus 
properties"

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/215042.html

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH v11] PPC: POWERNV: move iommu_add_device earlier

2013-12-11 Thread Alexey Kardashevskiy

The current implementation of IOMMU on sPAPR does not use iommu_ops
and therefore does not call IOMMU API's bus_set_iommu() which
1) sets iommu_ops for a bus
2) registers a bus notifier
Instead, PCI devices are added to IOMMU groups from
subsys_initcall_sync(tce_iommu_init) which does basically the same
thing without using iommu_ops callbacks.

However, the Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
implements iommu_ops, and by the time tce_iommu_init is called every PCI
device has already been added to some group, so there is a conflict.

This patch does 2 things:
1. removes the loop in which PCI devices were added to groups and
adds explicit iommu_add_device() calls to add devices as soon as they get
the iommu_table pointer assigned to them.
2. moves a bus notifier to powernv code in order to avoid conflict with
the notifier from Freescale driver.

iommu_add_device() and iommu_del_device() are public now.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v11:
* rebased on upstream

v10:
* fixed linker error when IOMMU_API is not enabled

v9:
* removed "KVM" from the subject as it is not really a KVM patch so
PPC maintainer (hi Ben!) can review/include it into his tree

v8:
* added the check for iommu_group!=NULL before removing device from a group
as suggested by Wei Yang 

v2:
* added a helper - set_iommu_table_base_and_group - which does
set_iommu_table_base() and iommu_add_device()
---
 arch/powerpc/include/asm/iommu.h| 26 
 arch/powerpc/kernel/iommu.c | 11 --
 arch/powerpc/platforms/powernv/pci-ioda.c   |  8 
 arch/powerpc/platforms/powernv/pci-p5ioc2.c |  2 +-
 arch/powerpc/platforms/powernv/pci.c| 31 -
 arch/powerpc/platforms/pseries/iommu.c  |  8 +---
 6 files changed, 70 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index c34656a..774fa27 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -101,8 +101,34 @@ extern void iommu_free_table(struct iommu_table *tbl, 
const char *node_name);
  */
 extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
int nid);
+#ifdef CONFIG_IOMMU_API
 extern void iommu_register_group(struct iommu_table *tbl,
 int pci_domain_number, unsigned long pe_num);
+extern int iommu_add_device(struct device *dev);
+extern void iommu_del_device(struct device *dev);
+#else
+static inline void iommu_register_group(struct iommu_table *tbl,
+   int pci_domain_number,
+   unsigned long pe_num)
+{
+}
+
+static inline int iommu_add_device(struct device *dev)
+{
+   return 0;
+}
+
+static inline void iommu_del_device(struct device *dev)
+{
+}
+#endif /* !CONFIG_IOMMU_API */
+
+static inline void set_iommu_table_base_and_group(struct device *dev,
+ void *base)
+{
+   set_iommu_table_base(dev, base);
+   iommu_add_device(dev);
+}
 
 extern int iommu_map_sg(struct device *dev, struct iommu_table *tbl,
struct scatterlist *sglist, int nelems,
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 572bb5b..818a092 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1105,7 +1105,7 @@ void iommu_release_ownership(struct iommu_table *tbl)
 }
 EXPORT_SYMBOL_GPL(iommu_release_ownership);
 
-static int iommu_add_device(struct device *dev)
+int iommu_add_device(struct device *dev)
 {
struct iommu_table *tbl;
int ret = 0;
@@ -1134,11 +1134,13 @@ static int iommu_add_device(struct device *dev)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(iommu_add_device);
 
-static void iommu_del_device(struct device *dev)
+void iommu_del_device(struct device *dev)
 {
iommu_group_remove_device(dev);
 }
+EXPORT_SYMBOL_GPL(iommu_del_device);
 
 static int iommu_bus_notifier(struct notifier_block *nb,
  unsigned long action, void *data)
@@ -1162,13 +1164,8 @@ static struct notifier_block tce_iommu_bus_nb = {
 
 static int __init tce_iommu_init(void)
 {
-   struct pci_dev *pdev = NULL;
-
BUILD_BUG_ON(PAGE_SIZE < IOMMU_PAGE_SIZE);
 
-   for_each_pci_dev(pdev)
-   iommu_add_device(&pdev->dev);
-
bus_register_notifier(&pci_bus_type, &tce_iommu_bus_nb);
return 0;
 }
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2c6d173..f0e6871 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -460,7 +460,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, 
struct pci_dev *pdev
return;
 
pe = &phb->ioda.pe_array[pdn->pe_number];
-   set_iommu_table_base(&pdev->dev, &pe->tce32_table);
+

[PATCH 3/6] perf tools: Fix perf list --raw-dump option bug.

2013-12-11 Thread Dongsheng Yang

Perf completion uses 'perf list --raw-dump' to get the events available for
'-e', but currently this does not work.

Example:
# perf stat -e kvmm[TAB]  Error: unknown option `raw-dump'

usage: perf list [hw|sw|cache|tracepoint|pmu|event_glob]

perf-completion.sh uses 'perf list --raw-dump' to get all the events, but
since perf list now parses its arguments with parse_options(), we get an
error when we use the --raw-dump option.

This patch adds a hidden option, raw-dump, to perf list. With it, --raw-dump
works again and returns all the event names, without adding noise to the
output of perf list -h.
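
The idea of a hidden option can be sketched in standalone C. This is an
illustration under assumptions, not perf's parse-options code: struct opt,
OPT_HIDDEN, parse_long_opt() and format_usage() are hypothetical names. The
parser accepts the hidden option like any other, while the usage text skips
it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define OPT_HIDDEN 0x1	/* hypothetical flag, modelled on PARSE_OPT_HIDDEN */

struct opt {
	const char *long_name;
	bool *value;
	int flags;
	const char *help;
};

/* Set the matching boolean; hidden options are still accepted here. */
static int parse_long_opt(const struct opt *opts, const char *arg)
{
	if (strncmp(arg, "--", 2) != 0)
		return -1;
	for (; opts->long_name; opts++) {
		if (strcmp(arg + 2, opts->long_name) == 0) {
			*opts->value = true;
			return 0;
		}
	}
	return -1;	/* unknown option */
}

/* Write the usage text into buf, skipping hidden options; return the
 * number of option lines written. */
static int format_usage(const struct opt *opts, char *buf, size_t len)
{
	int lines = 0;
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (; opts->long_name; opts++) {
		int n;

		if (opts->flags & OPT_HIDDEN)
			continue;
		n = snprintf(buf + used, len - used, "    --%s  %s\n",
			     opts->long_name, opts->help ? opts->help : "");
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output would not fit; stop here */
		used += (size_t)n;
		lines++;
	}
	return lines;
}
```

A completion script can then keep calling the hidden option while the help
output stays unchanged for users.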

Verification:
# ./perf stat -e kvmmmu:[TAB]
fast_page_fault   kvm_mmu_get_page  
kvm_mmu_paging_elementkvm_mmu_set_accessed_bit  kvm_mmu_sync_page 
kvm_mmu_walker_error
handle_mmio_page_faultkvm_mmu_pagetable_walk
kvm_mmu_prepare_zap_page  kvm_mmu_set_dirty_bit kvm_mmu_unsync_page   
mark_mmio_spte

Signed-off-by: Dongsheng Yang 
---
 tools/perf/builtin-list.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 011195e..82d54b6 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -19,7 +19,9 @@
 int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 {
int i;
+   bool raw_dump = false;
const struct option list_options[] = {
+   OPT_BOOLEAN_HIDEN(0, "raw-dump", &raw_dump, NULL),
OPT_END()
};
const char * const list_usage[] = {
@@ -30,6 +32,11 @@ int cmd_list(int argc, const char **argv, const char *prefix 
__maybe_unused)
argc = parse_options(argc, argv, list_options, list_usage,
 PARSE_OPT_STOP_AT_NON_OPTION);
 
+   if (raw_dump) {
+   print_events(NULL, true);
+   return 0;
+   }
+
setup_pager();
 
if (argc == 0) {
@@ -53,8 +60,6 @@ int cmd_list(int argc, const char **argv, const char *prefix 
__maybe_unused)
print_hwcache_events(NULL, false);
else if (strcmp(argv[i], "pmu") == 0)
print_pmu_events(NULL, false);
-   else if (strcmp(argv[i], "--raw-dump") == 0)
-   print_events(NULL, true);
else {
char *sep = strchr(argv[i], ':'), *s;
int sep_idx;
-- 
1.8.2.1


[PATCH 6/6] perf tools: Enhancement for perf list for unexpected input.

2013-12-11 Thread Dongsheng Yang

Example:
# perf list test

List of pre-defined events (to be used in -e):
# echo $?
0

Verification:
# perf list test

Error: No event for test.
Usage:
perf list [hw|sw|cache|tracepoint|pmu|event_glob]
# echo $?
255
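
The behaviour shown above can be sketched in standalone C: the print helper
returns how many events it printed, and the caller turns a count of zero into
an error. This is only an illustration; the events[] table, print_matching()
and list_events() are hypothetical, and matching is reduced to comparing the
subsystem name before the ':'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical event table standing in for the real tracepoint list. */
static const char *const events[] = {
	"kvmmmu:fast_page_fault",
	"kvmmmu:mark_mmio_spte",
	"sched:sched_switch",
};

/* Print events whose subsystem (text before ':') equals sys; return the
 * number of events printed. */
static unsigned int print_matching(const char *sys, FILE *out)
{
	unsigned int count = 0;
	size_t len = strlen(sys);

	for (size_t i = 0; i < sizeof(events) / sizeof(events[0]); i++) {
		if (strncmp(events[i], sys, len) == 0 && events[i][len] == ':') {
			fprintf(out, "  %s\n", events[i]);
			count++;
		}
	}
	return count;
}

/* Return 0 on success, or -1 with an error message when nothing matched. */
static int list_events(const char *sys, FILE *out)
{
	assert(sys != NULL && out != NULL);
	if (print_matching(sys, out) == 0) {
		fprintf(out, "\nError: No event for %s.\n", sys);
		return -1;
	}
	return 0;
}
```

Returning the count from the print helper keeps the "nothing matched"
decision in one place, in the caller.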

Signed-off-by: Dongsheng Yang 
---
 tools/perf/builtin-list.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 82d54b6..ba23f65 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -63,9 +63,11 @@ int cmd_list(int argc, const char **argv, const char *prefix 
__maybe_unused)
else {
char *sep = strchr(argv[i], ':'), *s;
int sep_idx;
+   unsigned int count;
 
if (sep == NULL) {
-   print_events(argv[i], false);
+   if(!(count = print_events(argv[i], false)))
+   goto err_out;
continue;
}
sep_idx = sep - argv[i];
@@ -74,9 +76,16 @@ int cmd_list(int argc, const char **argv, const char *prefix 
__maybe_unused)
return -1;
 
s[sep_idx] = '\0';
-   print_tracepoint_events(s, s + sep_idx + 1, false);
+   if (!(count = print_tracepoint_events(s, s + sep_idx + 
1, false)))
+   goto err_out;
free(s);
}
}
+
return 0;
+
+err_out:
+   printf("\nError: No event for %s.\n", argv[i]);
+   printf("Usage:\n\t%s\n", list_usage[0]);
+   return -1;
 }
-- 
1.8.2.1


[PATCH 2/6] perf tools: Introduce an OPT_BOOLEAN_HIDEN in to parse-options.h.

2013-12-11 Thread Dongsheng Yang

Signed-off-by: Dongsheng Yang 
---
 tools/perf/util/parse-options.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-options.h b/tools/perf/util/parse-options.h
index cbf0149..f395a21a 100644
--- a/tools/perf/util/parse-options.h
+++ b/tools/perf/util/parse-options.h
@@ -84,7 +84,7 @@ typedef int parse_opt_cb(const struct option *, const char 
*arg, int unset);
  *   CALLBACKS can use it like they want.
  *
  * `set`::
- *   whether an option was set by the user
+ *   whether an option was set by the user.
  */
 struct option {
enum parse_opt_type type;
@@ -111,6 +111,8 @@ struct option {
{ .type = OPTION_BOOLEAN, .short_name = (s), .long_name = (l), \
.value = check_vtype(v, bool *), .help = (h), \
.set = check_vtype(os, bool *)}
+#define OPT_BOOLEAN_HIDEN(s, l, v, h) \
+   { .type = OPTION_BOOLEAN, .short_name = (s), .long_name = (l), .value = 
check_vtype(v, bool *), .flags = PARSE_OPT_HIDDEN, .help = (h)}
 #define OPT_INCR(s, l, v, h){ .type = OPTION_INCR, .short_name = (s), 
.long_name = (l), .value = check_vtype(v, int *), .help = (h) }
 #define OPT_SET_UINT(s, l, v, h, i)  { .type = OPTION_SET_UINT, .short_name = 
(s), .long_name = (l), .value = check_vtype(v, unsigned int *), .help = (h), 
.defval = (i) }
 #define OPT_SET_PTR(s, l, v, h, p)  { .type = OPTION_SET_PTR, .short_name = 
(s), .long_name = (l), .value = (v), .help = (h), .defval = (p) }
-- 
1.8.2.1


[PATCH 1/6] perf tools: Add long_name for call-graph option.

2013-12-11 Thread Dongsheng Yang

I am not sure why the -g option in record and top has no long_name
for it.

Example:
# perf record --[TAB]
--all-cpus  --count --freq  --no-delay  
--period--realtime  --uid
--branch-any--cpu   --group 
--no-inherit--per-thread--stat  --verbose
--branch-filter --data  --mmap-pages
--no-samples--pid   --tid   --weight
--call-graph--event --no-buildid--(null)
--quiet --timestamp
--cgroup--filter--no-buildid-cache  --output
--raw-samples   --transaction

There is a --(null) entry here, which is not clear to the user.

This patch adds a "call-graph" long_name for it.

Verification:
# perf record --[TAB]
--all-cpus  --count --freq  --no-delay  
--per-thread--stat  --verbose
--branch-any--cpu   --group 
--no-inherit--pid   --tid   --weight
--branch-filter --data  --mmap-pages
--no-samples--quiet --timestamp
--call-graph--event --no-buildid--output
--raw-samples   --transaction
--cgroup--filter--no-buildid-cache  --period
--realtime  --uid

Signed-off-by: Dongsheng Yang 
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/builtin-top.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index c1c1200..7460c8f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -844,7 +844,7 @@ const struct option record_options[] = {
 perf_evlist__parse_mmap_pages),
OPT_BOOLEAN(0, "group", ,
"put the counters into a counter group"),
-   OPT_CALLBACK_NOOPT('g', NULL, ,
+   OPT_CALLBACK_NOOPT('g', "call-graph", ,
   NULL, "enables call-graph recording" ,
   _callchain_opt),
OPT_CALLBACK(0, "call-graph", ,
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 03d37a7..a734a1b 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1084,7 +1084,7 @@ int cmd_top(int argc, const char **argv, const char 
*prefix __maybe_unused)
   " abort, in_tx, transaction"),
OPT_BOOLEAN('n', "show-nr-samples", _conf.show_nr_samples,
"Show a column with the number of samples"),
-   OPT_CALLBACK_NOOPT('g', NULL, _opts,
+   OPT_CALLBACK_NOOPT('g', "call-graph", _opts,
   NULL, "enables call-graph recording",
   _opt),
OPT_CALLBACK(0, "call-graph", _opts,
-- 
1.8.2.1


[PATCH 4/6] perf tools: Fix bug in 'perf list event_glob'.

2013-12-11 Thread Dongsheng Yang

Example:
# perf list kvmmmu

List of pre-defined events (to be used in -e):

Verification:
# perf list kvmmmu

List of pre-defined events (to be used in -e):
  kvmmmu:kvm_mmu_pagetable_walk  [Tracepoint event]
  kvmmmu:kvm_mmu_paging_element  [Tracepoint event]
  kvmmmu:kvm_mmu_set_accessed_bit[Tracepoint event]
  kvmmmu:kvm_mmu_set_dirty_bit   [Tracepoint event]
  kvmmmu:kvm_mmu_walker_error[Tracepoint event]
  kvmmmu:kvm_mmu_get_page[Tracepoint event]
  kvmmmu:kvm_mmu_sync_page   [Tracepoint event]
  kvmmmu:kvm_mmu_unsync_page [Tracepoint event]
  kvmmmu:kvm_mmu_prepare_zap_page[Tracepoint event]
  kvmmmu:mark_mmio_spte  [Tracepoint event]
  kvmmmu:handle_mmio_page_fault  [Tracepoint event]
  kvmmmu:fast_page_fault [Tracepoint event]

Signed-off-by: Dongsheng Yang 
---
 tools/perf/util/parse-events.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 969cb8f..8acfa71 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1236,6 +1236,8 @@ void print_events(const char *event_glob, bool name_only)
 
print_pmu_events(event_glob, name_only);
 
+   print_tracepoint_events(event_glob, NULL, name_only);
+
if (event_glob != NULL)
return;
 
@@ -1254,8 +1256,6 @@ void print_events(const char *event_glob, bool name_only)
event_type_descriptors[PERF_TYPE_BREAKPOINT]);
printf("\n");
}
-
-   print_tracepoint_events(NULL, NULL, name_only);
 }
 
 int parse_events__is_hardcoded_term(struct parse_events_term *term)
-- 
1.8.2.1


Re: [PATCH] MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers

2013-12-11 Thread Gregory CLEMENT

Hi Bjorn,

On 11/12/2013 19:32, Bjorn Helgaas wrote:
> If this looks reasonable, I'll merge it via the PCI tree for v3.13.
> 
> Bjorn
> 
> 
> MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers
> 
> Add entries for PCI host controller drivers in drivers/pci/host/.
> 
> Signed-off-by: Bjorn Helgaas 
> ---
>  MAINTAINERS |   31 +++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8285ed4676b6..826c722d92ba 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6449,16 +6449,47 @@ F:drivers/pci/
>  F:   include/linux/pci*
>  F:   arch/x86/pci/
>  
> +PCI DRIVER FOR DESIGNWARE
> +M:   Jingoo Han 
> +L:   linux-...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/pci/host/*designware*
> +
> +PCI DRIVER FOR IMX6
> +M:   Shawn Guo 
> +L:   linux-...@vger.kernel.org
> +L:   linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
> +S:   Maintained
> +F:   drivers/pci/host/*imx6*
> +
> +PCI DRIVER FOR MVEBU (Marvell Armada 370 and Armada XP SOC support)
> +M:   Jason Cooper 
> +L:   linux-...@vger.kernel.org
> +L:   linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
> +S:   Maintained
> +F:   drivers/pci/host/*mvebu*

I think that Thomas Petazzoni would be more appropriate: he has worked
on the mvebu PCIe driver for 6 months and now knows the subject very
well. Until now, all the mvebu PCIe related questions have been
handled by Thomas.

Regards,

Gregory



-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

[PATCHSET 00/14] tools lib traceevent: Get rid of *die() calls from parse-filter.c (v2)

2013-12-11 Thread Namhyung Kim

Hello,

This patchset tries to remove all die() calls in the event filter parsing
code.  I changed the two main functions, pevent_filter_add_filter_str()
and pevent_filter_match(), to return a proper error code (pevent_errno).

The actual error message might be saved in a static buffer in
pevent_filter, and it can be accessed with the new
pevent_filter_strerror() function.  The old pevent_strerror() still works
for them too.

The only remaining bits are in trace-seq.c, which implements the print
functions, and I want to hear what's the best way to handle the
error case during printing.

I also put these patches on the libtraceevent/die-removal-v2 branch in my
tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks
Namhyung


Namhyung Kim (14):
  tools lib traceevent: Get rid of malloc_or_die() in show_error()
  tools lib traceevent: Get rid of die in add_filter_type()
  tools lib traceevent: Get rid of malloc_or_die() allocate_arg()
  tools lib traceevent: Get rid of malloc_or_die() in read_token()
  tools lib traceevent: Get rid of malloc_or_die() in find_event()
  tools lib traceevent: Get rid of die() in add_right()
  tools lib traceevent: Make add_left() return pevent_errno
  tools lib traceevent: Get rid of die() in reparent_op_arg()
  tools lib traceevent: Refactor create_arg_item()
  tools lib traceevent: Refactor process_filter()
  tools lib traceevent: Make pevent_filter_add_filter_str() return
pevent_errno
  tools lib traceevent: Refactor pevent_filter_match() to get rid of
die()
  tools lib traceevent: Get rid of die() in some string conversion
funcitons
  tools lib traceevent: Introduce pevent_filter_strerror()

 tools/lib/traceevent/event-parse.c  |  17 +-
 tools/lib/traceevent/event-parse.h  |  48 ++-
 tools/lib/traceevent/parse-filter.c | 615 ++--
 3 files changed, 417 insertions(+), 263 deletions(-)

-- 
1.7.11.7


[PATCH 03/14] tools lib traceevent: Get rid of malloc_or_die() allocate_arg()

2013-12-11 Thread Namhyung Kim

Also check return value and handle it.
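
The pattern can be sketched in standalone C: the allocator returns NULL on
failure instead of exiting, and each caller checks the result and passes the
failure up. The struct arg type and make_boolean() are hypothetical,
trimmed-down stand-ins, not the library's definitions.

```c
#include <assert.h>
#include <stdlib.h>

enum arg_type { ARG_NONE = 0, ARG_BOOLEAN };

/* Hypothetical, trimmed-down stand-in for struct filter_arg. */
struct arg {
	enum arg_type type;
	int value;
};

/* calloc() zeroes the memory and returns NULL on failure instead of
 * terminating the program. */
static struct arg *allocate_arg(void)
{
	return calloc(1, sizeof(struct arg));
}

/* Callers check for NULL and report the failure to their own caller. */
static int make_boolean(struct arg **parg, int value)
{
	struct arg *arg;

	assert(parg != NULL);
	arg = allocate_arg();
	if (arg == NULL)
		return -1;
	arg->type = ARG_BOOLEAN;
	arg->value = value;
	*parg = arg;
	return 0;
}
```

Using calloc() also replaces the separate memset() call, since the returned
memory is already zeroed.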

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 48 ++---
 1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c 
b/tools/lib/traceevent/parse-filter.c
index 767de4f1e8ee..ab9cefe320b4 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -211,12 +211,7 @@ struct event_filter *pevent_filter_alloc(struct pevent 
*pevent)
 
 static struct filter_arg *allocate_arg(void)
 {
-   struct filter_arg *arg;
-
-   arg = malloc_or_die(sizeof(*arg));
-   memset(arg, 0, sizeof(*arg));
-
-   return arg;
+   return calloc(1, sizeof(struct filter_arg));
 }
 
 static void free_arg(struct filter_arg *arg)
@@ -369,6 +364,10 @@ create_arg_item(struct event_format *event, const char 
*token,
struct filter_arg *arg;
 
arg = allocate_arg();
+   if (arg == NULL) {
+   show_error(error_str, "failed to allocate filter arg");
+   return NULL;
+   }
 
switch (type) {
 
@@ -422,6 +421,9 @@ create_arg_op(enum filter_op_type btype)
struct filter_arg *arg;
 
arg = allocate_arg();
+   if (!arg)
+   return NULL;
+
arg->type = FILTER_ARG_OP;
arg->op.type = btype;
 
@@ -434,6 +436,9 @@ create_arg_exp(enum filter_exp_type etype)
struct filter_arg *arg;
 
arg = allocate_arg();
+   if (!arg)
+   return NULL;
+
arg->type = FILTER_ARG_EXP;
arg->op.type = etype;
 
@@ -446,6 +451,9 @@ create_arg_cmp(enum filter_exp_type etype)
struct filter_arg *arg;
 
arg = allocate_arg();
+   if (!arg)
+   return NULL;
+
/* Use NUM and change if necessary */
arg->type = FILTER_ARG_NUM;
arg->op.type = etype;
@@ -909,8 +917,10 @@ static struct filter_arg *collapse_tree(struct filter_arg 
*arg)
case FILTER_VAL_FALSE:
free_arg(arg);
arg = allocate_arg();
-   arg->type = FILTER_ARG_BOOLEAN;
-   arg->boolean.value = ret == FILTER_VAL_TRUE;
+   if (arg) {
+   arg->type = FILTER_ARG_BOOLEAN;
+   arg->boolean.value = ret == FILTER_VAL_TRUE;
+   }
}
 
return arg;
@@ -1057,6 +1067,8 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
switch (op_type) {
case OP_BOOL:
arg = create_arg_op(btype);
+   if (arg == NULL)
+   goto fail_alloc;
if (current_op)
ret = add_left(arg, current_op);
else
@@ -1067,6 +1079,8 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
 
case OP_NOT:
arg = create_arg_op(btype);
+   if (arg == NULL)
+   goto fail_alloc;
if (current_op)
ret = add_right(current_op, arg, 
error_str);
if (ret < 0)
@@ -1086,6 +1100,8 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
arg = create_arg_exp(etype);
else
arg = create_arg_cmp(ctype);
+   if (arg == NULL)
+   goto fail_alloc;
 
if (current_op)
ret = add_right(current_op, arg, 
error_str);
@@ -1119,11 +1135,16 @@ process_filter(struct event_format *event, struct 
filter_arg **parg,
current_op = current_exp;
 
current_op = collapse_tree(current_op);
+   if (current_op == NULL)
+   goto fail_alloc;
 
*parg = current_op;
 
return 0;
 
+ fail_alloc:
+   show_error(error_str, "failed to allocate filter arg");
+   goto fail;
  fail_print:
show_error(error_str, "Syntax error");
  fail:
@@ -1154,6 +1175,10 @@ process_event(struct event_format *event, const char 
*filter_str,
/* If parg is NULL, then make it into FALSE */
if (!*parg) {
*parg = allocate_arg();
+   if (*parg == NULL) {
+   show_error(error_str, "failed to allocate filter arg");
+   return -1;
+   }
(*parg)->type = FILTER_ARG_BOOLEAN;
(*parg)->boolean.value = FILTER_FALSE;
}
@@ -1177,6 +1202,10 @@ static int filter_event(struct event_filter *filter,
} else {
/* just add a TRUE arg */
arg = allocate_arg();
+

[PATCH 14/14] tools lib traceevent: Introduce pevent_filter_strerror()

2013-12-11 Thread Namhyung Kim

From: Namhyung Kim 

The pevent_filter_strerror() function is for retrieving the actual error
message for a pevent_errno value.  To do that, add a static buffer to
event_filter for saving the internal error message.

If a failed function saved other information in the static buffer, it
returns that information; otherwise it returns the generic error message.
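
The lookup order described above can be sketched in standalone C. This is an
illustration under assumptions, not the library code: enum filter_errno,
struct filter, filter_add_str() and filter_strerror() are hypothetical names,
and the "invalid filter" check is a toy rule chosen only to produce an error.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FILTER_ERROR_BUFSZ 1024

/* Hypothetical error codes, standing in for enum pevent_errno. */
enum filter_errno {
	FILTER_OK = 0,
	FILTER_ERR_MEM_ALLOC,
	FILTER_ERR_INVALID_FILTER,
	FILTER_ERR_MAX,
};

static const char *const generic_msg[] = {
	"success",
	"failed to allocate memory",
	"invalid filter string",
};

struct filter {
	char error_buffer[FILTER_ERROR_BUFSZ];
};

/* A failing function records details in the filter and returns a code. */
static enum filter_errno filter_add_str(struct filter *f, const char *str)
{
	f->error_buffer[0] = '\0';
	if (strchr(str, '=') == NULL) {
		snprintf(f->error_buffer, sizeof(f->error_buffer),
			 "no comparison in '%s'", str);
		return FILTER_ERR_INVALID_FILTER;
	}
	return FILTER_OK;
}

/* Prefer the saved detail; fall back to the generic message for the code. */
static int filter_strerror(const struct filter *f, enum filter_errno err,
			   char *buf, size_t buflen)
{
	assert(f != NULL && buf != NULL);
	if ((unsigned int)err >= FILTER_ERR_MAX || buflen == 0)
		return -1;
	snprintf(buf, buflen, "%s",
		 f->error_buffer[0] ? f->error_buffer : generic_msg[err]);
	return 0;
}
```

Keeping the buffer inside the filter object means the detail message belongs
to the filter that failed, and the caller supplies the output buffer.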

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.c  | 17 +--
 tools/lib/traceevent/event-parse.h  |  7 ++-
 tools/lib/traceevent/parse-filter.c | 98 -
 3 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index 22566c271275..2ce565a73dd5 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -5230,22 +5230,7 @@ int pevent_strerror(struct pevent *pevent __maybe_unused,
 
idx = errnum - __PEVENT_ERRNO__START - 1;
msg = pevent_error_str[idx];
-
-   switch (errnum) {
-   case PEVENT_ERRNO__MEM_ALLOC_FAILED:
-   case PEVENT_ERRNO__PARSE_EVENT_FAILED:
-   case PEVENT_ERRNO__READ_ID_FAILED:
-   case PEVENT_ERRNO__READ_FORMAT_FAILED:
-   case PEVENT_ERRNO__READ_PRINT_FAILED:
-   case PEVENT_ERRNO__OLD_FTRACE_ARG_FAILED:
-   case PEVENT_ERRNO__INVALID_ARG_TYPE:
-   snprintf(buf, buflen, "%s", msg);
-   break;
-
-   default:
-   /* cannot reach here */
-   break;
-   }
+   snprintf(buf, buflen, "%s", msg);
 
return 0;
 }
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 3ad784f5f647..cf5db9013f2c 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -851,10 +851,13 @@ struct filter_type {
struct filter_arg   *filter;
 };
 
+#define PEVENT_FILTER_ERROR_BUFSZ  1024
+
 struct event_filter {
struct pevent   *pevent;
int filters;
struct filter_type  *event_filters;
+   char                error_buffer[PEVENT_FILTER_ERROR_BUFSZ];
 };
 
 struct event_filter *pevent_filter_alloc(struct pevent *pevent);
@@ -874,10 +877,12 @@ enum filter_trivial_type {
 enum pevent_errno pevent_filter_add_filter_str(struct event_filter *filter,
   const char *filter_str);
 
-
 enum pevent_errno pevent_filter_match(struct event_filter *filter,
  struct pevent_record *record);
 
+int pevent_filter_strerror(struct event_filter *filter, enum pevent_errno err,
+  char *buf, size_t buflen);
+
 int pevent_event_filtered(struct event_filter *filter,
  int event_id);
 
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 32ab4396653c..c28b1a912a0c 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -38,55 +38,31 @@ struct event_list {
struct event_format *event;
 };
 
-#define MAX_ERR_STR_SIZE 256
-
-static void show_error(char **error_str, const char *fmt, ...)
+static void show_error(char *error_buf, const char *fmt, ...)
 {
unsigned long long index;
const char *input;
-   char *error;
va_list ap;
int len;
int i;
 
-   if (!error_str)
-   return;
-
input = pevent_get_input_buf();
index = pevent_get_input_buf_ptr();
len = input ? strlen(input) : 0;
 
-   error = malloc(MAX_ERR_STR_SIZE + (len*2) + 3);
-   if (error == NULL) {
-   /*
-* Maybe it's due to len is too long.
-* Retry without the input buffer part.
-*/
-   len = 0;
-
-   error = malloc(MAX_ERR_STR_SIZE);
-   if (error == NULL) {
-   /* no memory */
-   *error_str = NULL;
-   return;
-   }
-   }
-
if (len) {
-   strcpy(error, input);
-   error[len] = '\n';
+   strcpy(error_buf, input);
+   error_buf[len] = '\n';
for (i = 1; i < len && i < index; i++)
-   error[len+i] = ' ';
-   error[len + i] = '^';
-   error[len + i + 1] = '\n';
+   error_buf[len+i] = ' ';
+   error_buf[len + i] = '^';
+   error_buf[len + i + 1] = '\n';
len += i+2;
}
 
va_start(ap, fmt);
-   vsnprintf(error + len, MAX_ERR_STR_SIZE, fmt, ap);
+   vsnprintf(error_buf + len, PEVENT_FILTER_ERROR_BUFSZ - len, fmt, ap);
va_end(ap);
-
-   *error_str = error;
 }
 
 static void free_token(char *token)
@@ -370,7 +346,7 @@ static void free_events(struct event_list *events)
 
 static enum pevent_errno
 create_arg_item(struct event_format *event, const char *token,
-

[PATCH 13/14] tools lib traceevent: Get rid of die() in some string conversion functions

2013-12-11 Thread Namhyung Kim

These functions stringify filter arguments.  Since their callers
already handle a NULL string properly, it is enough to return NULL
rather than call die().
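
The convention can be sketched as follows. This is an illustration only:
demo_val_to_str() is loosely modeled on val_to_str() from the diff, and
demo_describe() is a hypothetical caller that shows the NULL check:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format a number into a freshly allocated string.
 * Returns NULL if malloc() fails instead of aborting the program.
 */
static char *demo_val_to_str(long long val)
{
	char *str = malloc(30);

	if (str)
		snprintf(str, 30, "%lld", val);
	return str;
}

/*
 * Hypothetical caller: checks for NULL before using the string.
 * Returns the number of characters written to out, or -1 on failure.
 */
static int demo_describe(long long val, char *out, size_t outlen)
{
	char *str;
	int n;

	assert(out != NULL && outlen > 0);
	str = demo_val_to_str(val);
	if (!str)
		return -1;	/* propagate the failure to our own caller */

	n = snprintf(out, outlen, "value=%s", str);
	free(str);
	return n;
}
```

The library no longer decides that an allocation failure is fatal; that
decision moves to the application.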

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 58 +++--
 1 file changed, 36 insertions(+), 22 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 9303c55128db..32ab4396653c 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1361,8 +1361,10 @@ enum pevent_errno pevent_filter_add_filter_str(struct event_filter *filter,
if (ret >= 0 && pevent->test_filters) {
char *test;
test = pevent_filter_make_string(filter, event->event->id);
-   printf(" '%s: %s'\n", event->event->name, test);
-   free(test);
+   if (test) {
+   printf(" '%s: %s'\n", event->event->name, test);
+   free(test);
+   }
}
}
 
@@ -2097,7 +2099,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg)
default:
break;
}
-   str = malloc_or_die(6);
+   str = malloc(6);
+   if (str == NULL)
+   break;
if (val)
strcpy(str, "TRUE");
else
@@ -2120,7 +2124,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg)
}
 
len = strlen(left) + strlen(right) + strlen(op) + 10;
-   str = malloc_or_die(len);
+   str = malloc(len);
+   if (str == NULL)
+   break;
snprintf(str, len, "(%s) %s (%s)",
 left, op, right);
break;
@@ -2138,7 +2144,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg)
right_val = 0;
if (right_val >= 0) {
/* just return the opposite */
-   str = malloc_or_die(6);
+   str = malloc(6);
+   if (str == NULL)
+   break;
if (right_val)
strcpy(str, "FALSE");
else
@@ -2146,8 +2154,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg)
break;
}
len = strlen(right) + strlen(op) + 3;
-   str = malloc_or_die(len);
-   snprintf(str, len, "%s(%s)", op, right);
+   str = malloc(len);
+   if (str)
+   snprintf(str, len, "%s(%s)", op, right);
break;
 
default:
@@ -2163,9 +2172,9 @@ static char *val_to_str(struct event_filter *filter, struct filter_arg *arg)
 {
char *str;
 
-   str = malloc_or_die(30);
-
-   snprintf(str, 30, "%lld", arg->value.val);
+   str = malloc(30);
+   if (str)
+   snprintf(str, 30, "%lld", arg->value.val);
 
return str;
 }
@@ -2220,12 +2229,14 @@ static char *exp_to_str(struct event_filter *filter, struct filter_arg *arg)
op = "^";
break;
default:
-   die("oops in exp");
+   op = "[ERROR IN EXPRESSION TYPE]";
+   break;
}
 
len = strlen(op) + strlen(lstr) + strlen(rstr) + 4;
-   str = malloc_or_die(len);
-   snprintf(str, len, "%s %s %s", lstr, op, rstr);
+   str = malloc(len);
+   if (str)
+   snprintf(str, len, "%s %s %s", lstr, op, rstr);
 out:
free(lstr);
free(rstr);
@@ -2271,9 +2282,9 @@ static char *num_to_str(struct event_filter *filter, struct filter_arg *arg)
op = "<=";
 
len = strlen(lstr) + strlen(op) + strlen(rstr) + 4;
-   str = malloc_or_die(len);
-   sprintf(str, "%s %s %s", lstr, op, rstr);
-
+   str = malloc(len);
+   if (str)
+   sprintf(str, "%s %s %s", lstr, op, rstr);
break;
 
default:
@@ -2311,10 +2322,11 @@ static char *str_to_str(struct event_filter *filter, struct filter_arg *arg)
 
len = strlen(arg->str.field->name) + strlen(op) +
strlen(arg->str.val) + 6;
-   str = malloc_or_die(len);
-   snprintf(str, len, "%s %s \"%s\"",
-arg->str.field->name,
-op, arg->str.val);
+   str =

[PATCH 02/14] tools lib traceevent: Get rid of die in add_filter_type()

2013-12-11 Thread Namhyung Kim

The return value of realloc() should be checked, and the previous
pointer must not be overwritten in case of error.
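
A minimal sketch of that realloc() pattern, using a hypothetical
demo_vec in place of filter->event_filters. The result goes into a
temporary first, so the old block is neither leaked nor lost when the
call fails:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array, standing in for the filter type array. */
struct demo_vec {
	int *items;
	size_t count;
};

/*
 * Append one element.  On allocation failure the old array is left
 * intact and -1 is returned; vec->items is only updated on success.
 */
static int demo_vec_push(struct demo_vec *vec, int value)
{
	int *tmp;

	assert(vec != NULL);
	tmp = realloc(vec->items, sizeof(*vec->items) * (vec->count + 1));
	if (!tmp)
		return -1;	/* vec->items is still valid here */

	vec->items = tmp;
	vec->items[vec->count++] = value;
	return 0;
}
```

Writing the result straight back into vec->items would replace the only
pointer to the old block with NULL on failure.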

Reviewed-by: Steven Rostedt 
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index d4b0bac80dc8..767de4f1e8ee 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -161,11 +161,13 @@ add_filter_type(struct event_filter *filter, int id)
if (filter_type)
return filter_type;
 
-   filter->event_filters = realloc(filter->event_filters,
-   sizeof(*filter->event_filters) *
-   (filter->filters + 1));
-   if (!filter->event_filters)
-   die("Could not allocate filter");
+   filter_type = realloc(filter->event_filters,
+ sizeof(*filter->event_filters) *
+ (filter->filters + 1));
+   if (!filter_type)
+   return NULL;
+
+   filter->event_filters = filter_type;
 
for (i = 0; i < filter->filters; i++) {
if (filter->event_filters[i].event_id > id)
@@ -1180,6 +1182,12 @@ static int filter_event(struct event_filter *filter,
}
 
filter_type = add_filter_type(filter, event->id);
+   if (filter_type == NULL) {
+   show_error(error_str, "failed to add a new filter: %s",
+  filter_str ? filter_str : "true");
+   return -1;
+   }
+
if (filter_type->filter)
free_arg(filter_type->filter);
filter_type->filter = arg;
@@ -1417,6 +1425,9 @@ static int copy_filter_type(struct event_filter *filter,
arg->boolean.value = 0;
 
filter_type = add_filter_type(filter, event->id);
+   if (filter_type == NULL)
+   return -1;
+
filter_type->filter = arg;
 
free(str);
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 10/14] tools lib traceevent: Refactor process_filter()

2013-12-11 Thread Namhyung Kim

From: Namhyung Kim 

Refactor process_filter() so that it can return a proper pevent_errno
value.
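
The new codes are added to the _PE() list in event-parse.h. As a rough
sketch of how one such list can feed both an enum and a message table
(an X-macro), under the assumption that this is how the list is
consumed: DEMO_E, DEMO_ERRORS, demo_errno and demo_strerror() are
hypothetical names, and the -100000 base is an arbitrary choice made
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error list: each entry pairs a code with its message. */
#define DEMO_ERRORS							\
	DEMO_E(MEM_ALLOC_FAILED, "failed to allocate memory"),		\
	DEMO_E(ILLEGAL_TOKEN,    "illegal token"),			\
	DEMO_E(UNKNOWN_TOKEN,    "unknown token")

/* First expansion: each entry becomes an enum constant. */
#define DEMO_E(code, str) DEMO_ERRNO__ ## code
enum demo_errno {
	DEMO_ERRNO__SUCCESS = 0,
	DEMO_ERRNO__START = -100000,	/* assumed base; keeps codes negative */
	DEMO_ERRORS,
	DEMO_ERRNO__END
};
#undef DEMO_E

/* Second expansion: each entry becomes its message string. */
#define DEMO_E(code, str) str
static const char *const demo_error_str[] = { DEMO_ERRORS };
#undef DEMO_E

/* Map an error code to its message, or NULL if it is not in the list. */
static const char *demo_strerror(enum demo_errno err)
{
	/* the enum and the table come from one list, so sizes must agree */
	assert(sizeof(demo_error_str) / sizeof(demo_error_str[0]) ==
	       (size_t)(DEMO_ERRNO__END - DEMO_ERRNO__START - 1));

	if (err <= DEMO_ERRNO__START || err >= DEMO_ERRNO__END)
		return NULL;
	return demo_error_str[err - DEMO_ERRNO__START - 1];
}
```

Because every code in the list is negative, existing callers that test
"ret < 0" keep working when a function switches from -1 to such a code.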

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |  6 +++-
 tools/lib/traceevent/parse-filter.c | 64 +
 2 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 57b66aed8122..da942d59cc3a 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -368,7 +368,11 @@ enum pevent_flag {
_PE(REPARENT_NOT_OP,"cannot reparent other than OP"), \
_PE(REPARENT_FAILED,"failed to reparent filter OP"),  \
_PE(BAD_FILTER_ARG, "bad arg in filter tree"),\
-   _PE(UNEXPECTED_TYPE,"unexpected type (not a value)")
+   _PE(UNEXPECTED_TYPE,"unexpected type (not a value)"), \
+   _PE(ILLEGAL_TOKEN,  "illegal token"), \
+   _PE(INVALID_PAREN,  "open parenthesis cannot come here"), \
+   _PE(UNBALANCED_PAREN,   "unbalanced number of parenthesis"),  \
+   _PE(UNKNOWN_TOKEN,  "unknown token")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 8d71208f0131..5aa5012a17ee 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -937,9 +937,10 @@ static int test_arg(struct filter_arg *parent, struct filter_arg *arg,
 }
 
 /* Remove any unknown event fields */
-static struct filter_arg *collapse_tree(struct filter_arg *arg, char **error_str)
+static int collapse_tree(struct filter_arg *arg,
+struct filter_arg **arg_collapsed, char **error_str)
 {
-   enum filter_vals ret;
+   int ret;
 
ret = test_arg(arg, arg, error_str);
switch (ret) {
@@ -955,6 +956,7 @@ static struct filter_arg *collapse_tree(struct filter_arg *arg, char **error_str
arg->boolean.value = ret == FILTER_VAL_TRUE;
} else {
show_error(error_str, "Failed to allocate filter arg");
+   ret = PEVENT_ERRNO__MEM_ALLOC_FAILED;
}
break;
 
@@ -965,10 +967,11 @@ static struct filter_arg *collapse_tree(struct filter_arg *arg, char **error_str
break;
}
 
-   return arg;
+   *arg_collapsed = arg;
+   return ret;
 }
 
-static int
+static enum pevent_errno
 process_filter(struct event_format *event, struct filter_arg **parg,
   char **error_str, int not)
 {
@@ -982,7 +985,7 @@ process_filter(struct event_format *event, struct filter_arg **parg,
enum filter_op_type btype;
enum filter_exp_type etype;
enum filter_cmp_type ctype;
-   int ret;
+   enum pevent_errno ret;
 
*parg = NULL;
 
@@ -1007,20 +1010,20 @@ process_filter(struct event_format *event, struct filter_arg **parg,
if (not) {
arg = NULL;
if (current_op)
-   goto fail_print;
+   goto fail_syntax;
free(token);
*parg = current_exp;
return 0;
}
} else
-   goto fail_print;
+   goto fail_syntax;
arg = NULL;
break;
 
case EVENT_DELIM:
if (*token == ',') {
-   show_error(error_str,
-  "Illegal token ','");
+   show_error(error_str, "Illegal token ','");
+   ret = PEVENT_ERRNO__ILLEGAL_TOKEN;
goto fail;
}
 
@@ -1028,19 +1031,23 @@ process_filter(struct event_format *event, struct filter_arg **parg,
if (left_item) {
show_error(error_str,
   "Open paren can not come after item");
+   ret = PEVENT_ERRNO__INVALID_PAREN;
goto fail;
}
if (current_exp) {
show_error(error_str,
   "Open paren can not come after expression");
+   ret = PEVENT_ERRNO__INVALID_PAREN;
goto fail;
}

[PATCH 04/14] tools lib traceevent: Get rid of malloc_or_die() in read_token()

2013-12-11 Thread Namhyung Kim

Reviewed-by: Steven Rostedt 
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index ab9cefe320b4..246ee81e1f93 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -109,7 +109,11 @@ static enum event_type read_token(char **tok)
(strcmp(token, "=") == 0 || strcmp(token, "!") == 0) &&
pevent_peek_char() == '~') {
/* append it */
-   *tok = malloc_or_die(3);
+   *tok = malloc(3);
+   if (*tok == NULL) {
+   free_token(token);
+   return EVENT_ERROR;
+   }
sprintf(*tok, "%c%c", *token, '~');
free_token(token);
/* Now remove the '~' from the buffer */
@@ -1123,6 +1127,8 @@ process_filter(struct event_format *event, struct filter_arg **parg,
break;
case EVENT_NONE:
break;
+   case EVENT_ERROR:
+   goto fail_alloc;
default:
goto fail_print;
}
-- 
1.7.11.7


[PATCH 01/14] tools lib traceevent: Get rid of malloc_or_die() in show_error()

2013-12-11 Thread Namhyung Kim

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index ab402fb2dcf7..d4b0bac80dc8 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -56,7 +56,21 @@ static void show_error(char **error_str, const char *fmt, ...)
index = pevent_get_input_buf_ptr();
len = input ? strlen(input) : 0;
 
-   error = malloc_or_die(MAX_ERR_STR_SIZE + (len*2) + 3);
+   error = malloc(MAX_ERR_STR_SIZE + (len*2) + 3);
+   if (error == NULL) {
+   /*
+* Maybe it's due to len is too long.
+* Retry without the input buffer part.
+*/
+   len = 0;
+
+   error = malloc(MAX_ERR_STR_SIZE);
+   if (error == NULL) {
+   /* no memory */
+   *error_str = NULL;
+   return;
+   }
+   }
 
if (len) {
strcpy(error, input);
-- 
1.7.11.7


[PATCH 06/14] tools lib traceevent: Get rid of die() in add_right()

2013-12-11 Thread Namhyung Kim

Refactor add_right() to return an appropriate pevent_errno value.

Reviewed-by: Steven Rostedt 
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |  8 +++-
 tools/lib/traceevent/parse-filter.c | 34 +++---
 2 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index abdfd3c606ed..89e4dfd40db6 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -358,7 +358,13 @@ enum pevent_flag {
_PE(OLD_FTRACE_ARG_FAILED,"failed to allocate field name for ftrace"),\
_PE(INVALID_ARG_TYPE,   "invalid argument type"), \
_PE(INVALID_EVENT_NAME, "invalid event name"),\
-   _PE(EVENT_NOT_FOUND,"No event found")
+   _PE(EVENT_NOT_FOUND,"no event found"),\
+   _PE(SYNTAX_ERROR,   "syntax error"),  \
+   _PE(ILLEGAL_RVALUE, "illegal rvalue"),\
+   _PE(ILLEGAL_LVALUE, "illegal lvalue for string comparison"),  \
+   _PE(INVALID_REGEX,  "regex did not compute"), \
+   _PE(ILLEGAL_STRING_CMP, "illegal comparison for string"), \
+   _PE(ILLEGAL_INTEGER_CMP,"illegal comparison for integer")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index a0ab040e8f71..c08ce594cabe 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -473,8 +473,8 @@ create_arg_cmp(enum filter_exp_type etype)
return arg;
 }
 
-static int add_right(struct filter_arg *op, struct filter_arg *arg,
-char **error_str)
+static enum pevent_errno
+add_right(struct filter_arg *op, struct filter_arg *arg, char **error_str)
 {
struct filter_arg *left;
char *str;
@@ -505,9 +505,8 @@ static int add_right(struct filter_arg *op, struct filter_arg *arg,
case FILTER_ARG_FIELD:
break;
default:
-   show_error(error_str,
-  "Illegal rvalue");
-   return -1;
+   show_error(error_str, "Illegal rvalue");
+   return PEVENT_ERRNO__ILLEGAL_RVALUE;
}
 
/*
@@ -554,7 +553,7 @@ static int add_right(struct filter_arg *op, struct filter_arg *arg,
if (left->type != FILTER_ARG_FIELD) {
show_error(error_str,
   "Illegal lvalue for string comparison");
-   return -1;
+   return PEVENT_ERRNO__ILLEGAL_LVALUE;
}
 
/* Make sure this is a valid string compare */
@@ -573,25 +572,31 @@ static int add_right(struct filter_arg *op, struct filter_arg *arg,
show_error(error_str,
   "RegEx '%s' did not compute",
   str);
-   return -1;
+   return PEVENT_ERRNO__INVALID_REGEX;
}
break;
default:
show_error(error_str,
   "Illegal comparison for string");
-   return -1;
+   return PEVENT_ERRNO__ILLEGAL_STRING_CMP;
}
 
op->type = FILTER_ARG_STR;
op->str.type = op_type;
op->str.field = left->field.field;
op->str.val = strdup(str);
-   if (!op->str.val)
-   die("malloc string");
+   if (!op->str.val) {
+   show_error(error_str, "Failed to allocate string filter");
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
+   }
/*
 * Need a buffer to copy data for tests
 */
-   op->str.buffer = malloc_or_die(op->str.field->size + 1);
+   op->str.buffer = malloc(op->str.field->size + 1);
+   if (!op->str.buffer) {
+   show_error(error_str, "Failed to allocate string filter");
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
+   }
/* Null terminate this buffer */
op->str.buffer[op->str.field->size] = 0;
 
@@ -609,7 +614,7 @@ static

[PATCH 08/14] tools lib traceevent: Get rid of die() in reparent_op_arg()

2013-12-11 Thread Namhyung Kim

To do that, make the function return an error code.  Also pass
error_str so that it can set a proper error message when an error occurs.
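
A rough sketch of that shape: an error code as the return value plus an
optional message through a char ** argument. All names here (demo_err,
demo_node, demo_show_error(), demo_detach()) are hypothetical, and the
128-byte message limit is an assumption of this example, not something
taken from the patch:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error codes for this sketch. */
enum demo_err { DEMO_OK = 0, DEMO_NOT_OP = -2, DEMO_NO_CHILD = -3 };

/* Hypothetical tree node with two children. */
struct demo_node {
	struct demo_node *left, *right;
	int is_op;
};

/*
 * Store a formatted, heap-allocated message in *error_str.
 * *error_str becomes NULL if the allocation itself fails.
 */
static void demo_show_error(char **error_str, const char *fmt, ...)
{
	va_list ap;
	char *msg;

	if (!error_str)
		return;		/* caller is not interested in the text */

	msg = malloc(128);
	if (msg) {
		va_start(ap, fmt);
		vsnprintf(msg, 128, fmt, ap);
		va_end(ap);
	}
	*error_str = msg;
}

/* Detach child from parent; report problems instead of aborting. */
static enum demo_err demo_detach(struct demo_node *parent,
				 struct demo_node *child, char **error_str)
{
	assert(parent != NULL && child != NULL);

	if (!parent->is_op) {
		demo_show_error(error_str, "can not detach from a non-OP node");
		return DEMO_NOT_OP;
	}

	if (parent->left == child)
		parent->left = NULL;
	else if (parent->right == child)
		parent->right = NULL;
	else {
		demo_show_error(error_str, "node is not a child of parent");
		return DEMO_NO_CHILD;
	}
	return DEMO_OK;
}
```

The caller owns the message and frees it; on success *error_str is left
untouched.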

Reviewed-by: Steven Rostedt 
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |  5 +-
 tools/lib/traceevent/parse-filter.c | 94 +++--
 2 files changed, 64 insertions(+), 35 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 89e4dfd40db6..5e4392d8e2d4 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -364,7 +364,10 @@ enum pevent_flag {
_PE(ILLEGAL_LVALUE, "illegal lvalue for string comparison"),  \
_PE(INVALID_REGEX,  "regex did not compute"), \
_PE(ILLEGAL_STRING_CMP, "illegal comparison for string"), \
-   _PE(ILLEGAL_INTEGER_CMP,"illegal comparison for integer")
+   _PE(ILLEGAL_INTEGER_CMP,"illegal comparison for integer"),\
+   _PE(REPARENT_NOT_OP,"cannot reparent other than OP"), \
+   _PE(REPARENT_FAILED,"failed to reparent filter OP"),  \
+   _PE(BAD_FILTER_ARG, "bad arg in filter tree")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 774c3e4c1d9f..9b05892566e0 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -784,15 +784,18 @@ enum filter_vals {
FILTER_VAL_TRUE,
 };
 
-void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child,
- struct filter_arg *arg)
+static enum pevent_errno
+reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child,
+   struct filter_arg *arg, char **error_str)
 {
struct filter_arg *other_child;
struct filter_arg **ptr;
 
if (parent->type != FILTER_ARG_OP &&
-   arg->type != FILTER_ARG_OP)
-   die("can not reparent other than OP");
+   arg->type != FILTER_ARG_OP) {
+   show_error(error_str, "can not reparent other than OP");
+   return PEVENT_ERRNO__REPARENT_NOT_OP;
+   }
 
/* Get the sibling */
if (old_child->op.right == arg) {
@@ -801,8 +804,10 @@ void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child,
} else if (old_child->op.left == arg) {
ptr = &old_child->op.left;
other_child = old_child->op.right;
-   } else
-   die("Error in reparent op, find other child");
+   } else {
+   show_error(error_str, "Error in reparent op, find other child");
+   return PEVENT_ERRNO__REPARENT_FAILED;
+   }
 
/* Detach arg from old_child */
*ptr = NULL;
@@ -813,23 +818,29 @@ void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child,
*parent = *arg;
/* Free arg without recussion */
free(arg);
-   return;
+   return 0;
}
 
if (parent->op.right == old_child)
ptr = &parent->op.right;
else if (parent->op.left == old_child)
ptr = &parent->op.left;
-   else
-   die("Error in reparent op");
+   else {
+   show_error(error_str, "Error in reparent op");
+   return PEVENT_ERRNO__REPARENT_FAILED;
+   }
+
*ptr = arg;
 
free_arg(old_child);
+   return 0;
 }
 
-enum filter_vals test_arg(struct filter_arg *parent, struct filter_arg *arg)
+/* Returns either filter_vals (success) or pevent_errno (failure) */
+static int test_arg(struct filter_arg *parent, struct filter_arg *arg,
+   char **error_str)
 {
-   enum filter_vals lval, rval;
+   int lval, rval;
 
switch (arg->type) {
 
@@ -844,63 +855,68 @@ enum filter_vals test_arg(struct filter_arg *parent, struct filter_arg *arg)
return FILTER_VAL_NORM;
 
case FILTER_ARG_EXP:
-   lval = test_arg(arg, arg->exp.left);
+   lval = test_arg(arg, arg->exp.left, error_str);
if (lval != FILTER_VAL_NORM)
return lval;
-   rval = test_arg(arg, arg->exp.right);
+   rval = test_arg(arg, arg->exp.right, error_str);
if (rval != FILTER_VAL_NORM)
return rval;
return FILTER_VAL_NORM;
 
case FILTER_ARG_NUM:
-   lval = test_arg(arg, arg->num.left);
+   lval = test_arg(arg, arg->num.left, error_str);
if (lval != FILTER_VAL_NORM)
return lval;
-   rval = test_arg(arg, arg->num.right);
+   rval = test_arg(arg, arg->num.right, error_str);
if (rval != FILTER_VAL_NORM)
return rval;
return

[PATCH 12/14] tools lib traceevent: Refactor pevent_filter_match() to get rid of die()

2013-12-11 Thread Namhyung Kim

The test_filter() function tests whether a given filter matches a
given record.  However, it does not handle error cases properly, so add
a new argument, err, to save error information during the test, and
pass it to the internal test functions as well.

The return value of pevent_filter_match() is also converted to
pevent_errno to indicate the exact error case.
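
A simplified sketch of threading an err argument through recursive
evaluation. The expression tree, demo_eval() and demo_match() are
hypothetical; the real code walks filter_arg trees and uses pevent_errno
values. The idea shown is that inner helpers keep their value-returning
signatures, and the top level turns err into a match, miss, or error
status:

```c
#include <assert.h>
#include <stddef.h>

/* DEMO_BOGUS stands for a corrupted or unknown node type. */
enum demo_node_type { DEMO_NUM, DEMO_ADD, DEMO_BOGUS };

/* Hypothetical status codes: 0 is a match, negative values are not. */
enum demo_status { DEMO_MATCH = 0, DEMO_MISS = -1, DEMO_BAD_EXPR = -2 };

struct demo_expr {
	enum demo_node_type type;
	long long value;			/* used by DEMO_NUM */
	const struct demo_expr *left, *right;	/* used by operators */
};

/* Evaluate the tree.  On failure set *err and return 0. */
static long long demo_eval(const struct demo_expr *e, enum demo_status *err)
{
	long long l, r;

	assert(e != NULL && err != NULL);
	if (e->type == DEMO_NUM)
		return e->value;

	l = demo_eval(e->left, err);
	r = demo_eval(e->right, err);
	if (*err)
		return 0;	/* an inner call already failed: just unwind */

	switch (e->type) {
	case DEMO_ADD:
		return l + r;
	default:
		*err = DEMO_BAD_EXPR;	/* previously a die() would go here */
		return 0;
	}
}

/* Top level: the status tells a match, a miss and an error apart. */
static enum demo_status demo_match(const struct demo_expr *e, long long want)
{
	enum demo_status err = DEMO_MATCH;
	long long got = demo_eval(e, &err);

	if (err)
		return err;
	return got == want ? DEMO_MATCH : DEMO_MISS;
}
```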

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |  21 --
 tools/lib/traceevent/parse-filter.c | 135 +++-
 2 files changed, 99 insertions(+), 57 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 089964e56ed4..3ad784f5f647 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -357,6 +357,8 @@ enum pevent_flag {
_PE(READ_PRINT_FAILED,  "failed to read event print fmt"),\
_PE(OLD_FTRACE_ARG_FAILED,"failed to allocate field name for ftrace"),\
_PE(INVALID_ARG_TYPE,   "invalid argument type"), \
+   _PE(INVALID_EXP_TYPE,   "invalid expression type"),   \
+   _PE(INVALID_OP_TYPE,"invalid operator type"), \
_PE(INVALID_EVENT_NAME, "invalid event name"),\
_PE(EVENT_NOT_FOUND,"no event found"),\
_PE(SYNTAX_ERROR,   "syntax error"),  \
@@ -373,12 +375,16 @@ enum pevent_flag {
_PE(INVALID_PAREN,  "open parenthesis cannot come here"), \
_PE(UNBALANCED_PAREN,   "unbalanced number of parenthesis"),  \
_PE(UNKNOWN_TOKEN,  "unknown token"), \
-   _PE(FILTER_NOT_FOUND,   "no filter found")
+   _PE(FILTER_NOT_FOUND,   "no filter found"),   \
+   _PE(NOT_A_NUMBER,   "must have number field"),\
+   _PE(NO_FILTER,  "no filters exists"), \
+   _PE(FILTER_MISS,"record does not match to filter")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
 enum pevent_errno {
PEVENT_ERRNO__SUCCESS   = 0,
+   PEVENT_ERRNO__FILTER_MATCH  = PEVENT_ERRNO__SUCCESS,
 
/*
 * Choose an arbitrary negative big number not to clash with standard
@@ -853,10 +859,11 @@ struct event_filter {
 
 struct event_filter *pevent_filter_alloc(struct pevent *pevent);
 
-#define FILTER_NONE     -2
-#define FILTER_NOEXIST  -1
-#define FILTER_MISS     0
-#define FILTER_MATCH    1
+/* for backward compatibility */
+#define FILTER_NONE     PEVENT_ERRNO__FILTER_NOT_FOUND
+#define FILTER_NOEXIST  PEVENT_ERRNO__NO_FILTER
+#define FILTER_MISS     PEVENT_ERRNO__FILTER_MISS
+#define FILTER_MATCH    PEVENT_ERRNO__FILTER_MATCH
 
 enum filter_trivial_type {
FILTER_TRIVIAL_FALSE,
@@ -868,8 +875,8 @@ enum pevent_errno pevent_filter_add_filter_str(struct event_filter *filter,
   const char *filter_str);
 
 
-int pevent_filter_match(struct event_filter *filter,
-   struct pevent_record *record);
+enum pevent_errno pevent_filter_match(struct event_filter *filter,
+ struct pevent_record *record);
 
 int pevent_event_filtered(struct event_filter *filter,
  int event_id);
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 78440d73e0ad..9303c55128db 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1678,8 +1678,8 @@ int pevent_filter_event_has_trivial(struct event_filter *filter,
}
 }
 
-static int test_filter(struct event_format *event,
-  struct filter_arg *arg, struct pevent_record *record);
+static int test_filter(struct event_format *event, struct filter_arg *arg,
+  struct pevent_record *record, enum pevent_errno *err);
 
 static const char *
 get_comm(struct event_format *event, struct pevent_record *record)
@@ -1725,15 +1725,24 @@ get_value(struct event_format *event,
 }
 
 static unsigned long long
-get_arg_value(struct event_format *event, struct filter_arg *arg, struct pevent_record *record);
+get_arg_value(struct event_format *event, struct filter_arg *arg,
+ struct pevent_record *record, enum pevent_errno *err);
 
 static unsigned long long
-get_exp_value(struct event_format *event, struct filter_arg *arg, struct pevent_record *record)
+get_exp_value(struct event_format *event, struct filter_arg *arg,
+ struct pevent_record *record, enum pevent_errno *err)
 {
unsigned long long lval, rval;
 
-   lval = get_arg_value(event, arg->exp.left, record);
-   rval = get_arg_value(event, arg->exp.right, record);
+   lval = get_arg_value(event, arg->exp.left, record,

[PATCH 11/14] tools lib traceevent: Make pevent_filter_add_filter_str() return pevent_errno

2013-12-11 Thread Namhyung Kim

From: Namhyung Kim 

Refactor pevent_filter_add_filter_str() to return a proper error code
and drop its third argument, error_str.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |  8 ++--
 tools/lib/traceevent/parse-filter.c | 78 +++--
 2 files changed, 27 insertions(+), 59 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index da942d59cc3a..089964e56ed4 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -372,7 +372,8 @@ enum pevent_flag {
_PE(ILLEGAL_TOKEN,  "illegal token"), \
_PE(INVALID_PAREN,  "open parenthesis cannot come here"), \
_PE(UNBALANCED_PAREN,   "unbalanced number of parenthesis"),  \
-   _PE(UNKNOWN_TOKEN,  "unknown token")
+   _PE(UNKNOWN_TOKEN,  "unknown token"), \
+   _PE(FILTER_NOT_FOUND,   "no filter found")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
@@ -863,9 +864,8 @@ enum filter_trivial_type {
FILTER_TRIVIAL_BOTH,
 };
 
-int pevent_filter_add_filter_str(struct event_filter *filter,
-const char *filter_str,
-char **error_str);
+enum pevent_errno pevent_filter_add_filter_str(struct event_filter *filter,
+  const char *filter_str);
 
 
 int pevent_filter_match(struct event_filter *filter,
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 5aa5012a17ee..78440d73e0ad 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1209,7 +1209,7 @@ process_filter(struct event_format *event, struct filter_arg **parg,
return ret;
 }
 
-static int
+static enum pevent_errno
 process_event(struct event_format *event, const char *filter_str,
  struct filter_arg **parg, char **error_str)
 {
@@ -1218,21 +1218,15 @@ process_event(struct event_format *event, const char *filter_str,
pevent_buffer_init(filter_str, strlen(filter_str));
 
ret = process_filter(event, parg, error_str, 0);
-   if (ret == 1) {
-   show_error(error_str,
-  "Unbalanced number of ')'");
-   return -1;
-   }
if (ret < 0)
return ret;
 
/* If parg is NULL, then make it into FALSE */
if (!*parg) {
*parg = allocate_arg();
-   if (*parg == NULL) {
-   show_error(error_str, "failed to allocate filter arg");
-   return -1;
-   }
+   if (*parg == NULL)
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
+
(*parg)->type = FILTER_ARG_BOOLEAN;
(*parg)->boolean.value = FILTER_FALSE;
}
@@ -1240,13 +1234,13 @@ process_event(struct event_format *event, const char *filter_str,
return 0;
 }
 
-static int filter_event(struct event_filter *filter,
-   struct event_format *event,
-   const char *filter_str, char **error_str)
+static enum pevent_errno
+filter_event(struct event_filter *filter, struct event_format *event,
+const char *filter_str, char **error_str)
 {
struct filter_type *filter_type;
struct filter_arg *arg;
-   int ret;
+   enum pevent_errno ret;
 
if (filter_str) {
ret = process_event(event, filter_str, &arg, error_str);
@@ -1256,20 +1250,16 @@ static int filter_event(struct event_filter *filter,
} else {
/* just add a TRUE arg */
arg = allocate_arg();
-   if (arg == NULL) {
-   show_error(error_str, "failed to allocate filter arg");
-   return -1;
-   }
+   if (arg == NULL)
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
+
arg->type = FILTER_ARG_BOOLEAN;
arg->boolean.value = FILTER_TRUE;
}
 
filter_type = add_filter_type(filter, event->id);
-   if (filter_type == NULL) {
-   show_error(error_str, "failed to add a new filter: %s",
-  filter_str ? filter_str : "true");
-   return -1;
-   }
+   if (filter_type == NULL)
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 
if (filter_type->filter)
free_arg(filter_type->filter);
@@ -1282,18 +1272,12 @@ static int filter_event(struct event_filter *filter,
  * pevent_filter_add_filter_str - add a new filter
  * @filter: the event filter to add to
  * @filter_str: the filter string that contains the filter
- * @error_str: string containing reason for failed filter
- *
- * Returns 0 if the filter was successfully added
- *   -1 if there was an error.
  *
- * On

[PATCH 09/14] tools lib traceevent: Refactor create_arg_item()

2013-12-11 Thread Namhyung Kim

From: Namhyung Kim 

Refactor create_arg_item() so that it can return a proper pevent_errno
value.
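
The refactoring turns a pointer-returning constructor into one that
returns an error code and hands the object back through an output
parameter. A minimal sketch with hypothetical names (demo_errno,
demo_tok, demo_arg, demo_create_arg()), not the library code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical codes: 0 for success, negative values for errors. */
enum demo_errno {
	DEMO_SUCCESS = 0,
	DEMO_MEM_ALLOC_FAILED = -1,
	DEMO_UNEXPECTED_TYPE = -2
};

enum demo_tok { DEMO_TOK_STRING, DEMO_TOK_DELIM };

struct demo_arg {
	char *str;
};

/*
 * Build an argument from a token.  The result is stored in *parg only
 * on success, so the caller can tell the failure reasons apart by the
 * return value instead of seeing a bare NULL.
 */
static enum demo_errno demo_create_arg(const char *token, enum demo_tok type,
				       struct demo_arg **parg)
{
	struct demo_arg *arg;

	assert(token != NULL && parg != NULL);
	arg = calloc(1, sizeof(*arg));
	if (!arg)
		return DEMO_MEM_ALLOC_FAILED;

	switch (type) {
	case DEMO_TOK_STRING:
		arg->str = malloc(strlen(token) + 1);
		if (!arg->str) {
			free(arg);
			return DEMO_MEM_ALLOC_FAILED;
		}
		strcpy(arg->str, token);
		break;
	default:
		free(arg);
		return DEMO_UNEXPECTED_TYPE;
	}

	*parg = arg;
	return DEMO_SUCCESS;
}
```

A caller then checks "ret < 0" and can forward ret unchanged, which is
what lets the error code travel up to the public API.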

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |  3 ++-
 tools/lib/traceevent/parse-filter.c | 20 ++--
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 5e4392d8e2d4..57b66aed8122 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -367,7 +367,8 @@ enum pevent_flag {
_PE(ILLEGAL_INTEGER_CMP,"illegal comparison for integer"),\
_PE(REPARENT_NOT_OP,"cannot reparent other than OP"), \
_PE(REPARENT_FAILED,"failed to reparent filter OP"),  \
-   _PE(BAD_FILTER_ARG, "bad arg in filter tree")
+   _PE(BAD_FILTER_ARG, "bad arg in filter tree"),\
+   _PE(UNEXPECTED_TYPE,"unexpected type (not a value)")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 9b05892566e0..8d71208f0131 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -368,9 +368,9 @@ static void free_events(struct event_list *events)
}
 }
 
-static struct filter_arg *
+static enum pevent_errno
 create_arg_item(struct event_format *event, const char *token,
-   enum event_type type, char **error_str)
+   enum event_type type, struct filter_arg **parg, char **error_str)
 {
struct format_field *field;
struct filter_arg *arg;
@@ -378,7 +378,7 @@ create_arg_item(struct event_format *event, const char *token,
arg = allocate_arg();
if (arg == NULL) {
show_error(error_str, "failed to allocate filter arg");
-   return NULL;
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
}
 
switch (type) {
@@ -392,7 +392,7 @@ create_arg_item(struct event_format *event, const char *token,
if (!arg->value.str) {
free_arg(arg);
show_error(error_str, "failed to allocate string filter arg");
-   return NULL;
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
}
break;
case EVENT_ITEM:
@@ -420,11 +420,11 @@ create_arg_item(struct event_format *event, const char *token,
break;
default:
free_arg(arg);
-   show_error(error_str, "expected a value but found %s",
-  token);
-   return NULL;
+   show_error(error_str, "expected a value but found %s", token);
+   return PEVENT_ERRNO__UNEXPECTED_TYPE;
}
-   return arg;
+   *parg = arg;
+   return 0;
 }
 
 static struct filter_arg *
@@ -993,8 +993,8 @@ process_filter(struct event_format *event, struct filter_arg **parg,
case EVENT_SQUOTE:
case EVENT_DQUOTE:
case EVENT_ITEM:
-   arg = create_arg_item(event, token, type, error_str);
-   if (!arg)
+   ret = create_arg_item(event, token, type, &arg, error_str);
+   if (ret < 0)
goto fail;
if (!left_item)
left_item = arg;
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
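
The create_arg_item() change above follows a common C pattern: return a
status code and hand the new object back through an out-parameter. A minimal
user-space sketch of that pattern follows. The names `struct item`,
`enum item_errno` and `create_item()` are hypothetical and are not part of
libtraceevent; only the shape of the interface mirrors the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes; negative so callers can test "ret < 0". */
enum item_errno {
	ITEM_ERRNO__SUCCESS = 0,
	ITEM_ERRNO__MEM_ALLOC_FAILED = -1,
	ITEM_ERRNO__UNEXPECTED_TYPE = -2,
};

/* Hypothetical object standing in for a filter argument. */
struct item {
	char *str;
};

/*
 * Return a status code; on success store the new object in *pitem.
 * On failure *pitem is left untouched and nothing is leaked.
 */
static enum item_errno create_item(const char *token, int is_value,
				   struct item **pitem)
{
	struct item *it;
	size_t len;

	if (!is_value)
		return ITEM_ERRNO__UNEXPECTED_TYPE;

	it = calloc(1, sizeof(*it));
	if (it == NULL)
		return ITEM_ERRNO__MEM_ALLOC_FAILED;

	len = strlen(token) + 1;
	it->str = malloc(len);
	if (it->str == NULL) {
		free(it);
		return ITEM_ERRNO__MEM_ALLOC_FAILED;
	}
	memcpy(it->str, token, len);

	*pitem = it;
	return ITEM_ERRNO__SUCCESS;
}
```

The caller can then branch on the specific code instead of only seeing a NULL
pointer, which is what lets the error reach the user later in the series.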

[PATCH 05/14] tools lib traceevent: Get rid of malloc_or_die() in find_event()

2013-12-11 Thread Namhyung Kim

Make it return pevent_errno to distinguish malloc allocation failure.
Since it will be returned to the user later, add more error codes.

Reviewed-by: Steven Rostedt 
Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/event-parse.h  |  4 +++-
 tools/lib/traceevent/parse-filter.c | 27 +++
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 6e23f197175f..abdfd3c606ed 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -356,7 +356,9 @@ enum pevent_flag {
_PE(READ_FORMAT_FAILED, "failed to read event format"),   \
_PE(READ_PRINT_FAILED,  "failed to read event print fmt"),\
_PE(OLD_FTRACE_ARG_FAILED,"failed to allocate field name for ftrace"),\
-   _PE(INVALID_ARG_TYPE,   "invalid argument type")
+   _PE(INVALID_ARG_TYPE,   "invalid argument type"), \
+   _PE(INVALID_EVENT_NAME, "invalid event name"),\
+   _PE(EVENT_NOT_FOUND,"No event found")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 246ee81e1f93..a0ab040e8f71 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -287,7 +287,7 @@ static int event_match(struct event_format *event,
!regexec(ereg, event->name, 0, NULL, 0);
 }
 
-static int
+static enum pevent_errno
 find_event(struct pevent *pevent, struct event_list **events,
   char *sys_name, char *event_name)
 {
@@ -306,23 +306,31 @@ find_event(struct pevent *pevent, struct event_list **events,
sys_name = NULL;
}
 
-   reg = malloc_or_die(strlen(event_name) + 3);
+   reg = malloc(strlen(event_name) + 3);
+   if (reg == NULL)
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
+
sprintf(reg, "^%s$", event_name);
 
ret = regcomp(, reg, REG_ICASE|REG_NOSUB);
free(reg);
 
if (ret)
-   return -1;
+   return PEVENT_ERRNO__INVALID_EVENT_NAME;
 
if (sys_name) {
-   reg = malloc_or_die(strlen(sys_name) + 3);
+   reg = malloc(strlen(sys_name) + 3);
+   if (reg == NULL) {
+   regfree();
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
+   }
+
sprintf(reg, "^%s$", sys_name);
ret = regcomp(, reg, REG_ICASE|REG_NOSUB);
free(reg);
if (ret) {
regfree();
-   return -1;
+   return PEVENT_ERRNO__INVALID_EVENT_NAME;
}
}
 
@@ -342,9 +350,9 @@ find_event(struct pevent *pevent, struct event_list **events,
regfree();
 
if (!match)
-   return -1;
+   return PEVENT_ERRNO__EVENT_NOT_FOUND;
if (fail)
-   return -2;
+   return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 
return 0;
 }
@@ -1312,7 +1320,10 @@ int pevent_filter_add_filter_str(struct event_filter *filter,
/* Find this event */
ret = find_event(pevent, , strim(sys_name), 
strim(event_name));
if (ret < 0) {
-   if (event_name)
+   if (ret == PEVENT_ERRNO__MEM_ALLOC_FAILED)
+   show_error(error_str,
+  "Memory allocation failure");
+   else if (event_name)
show_error(error_str,
   "No event found under '%s.%s'",
   sys_name, event_name);
-- 
1.7.11.7
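
The find_event() change above replaces a die-on-failure allocation with
distinct return codes around an anchored, case-insensitive regex. A reduced
user-space sketch of the same idea, using the POSIX regex API, might look like
this. `enum match_status` and `match_name()` are hypothetical names; the real
function walks the pevent event list instead of testing a single string.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes, loosely modelled on the ones in the patch. */
enum match_status {
	MATCH_OK = 0,
	MATCH_MEM_ALLOC_FAILED = -1,
	MATCH_INVALID_NAME = -2,
	MATCH_NOT_FOUND = -3,
};

/*
 * Wrap "name" as "^name$", compile it case-insensitively and test
 * "candidate" against it.  Each failure mode gets its own code.
 */
static enum match_status match_name(const char *name, const char *candidate)
{
	regex_t re;
	char *pattern;
	size_t len = strlen(name) + 3;	/* '^' + name + '$' + NUL */
	int ret;

	pattern = malloc(len);
	if (pattern == NULL)
		return MATCH_MEM_ALLOC_FAILED;

	snprintf(pattern, len, "^%s$", name);
	ret = regcomp(&re, pattern, REG_ICASE | REG_NOSUB);
	free(pattern);
	if (ret)
		return MATCH_INVALID_NAME;

	ret = regexec(&re, candidate, 0, NULL, 0);
	regfree(&re);

	return ret == 0 ? MATCH_OK : MATCH_NOT_FOUND;
}
```

Keeping "invalid pattern", "no match" and "out of memory" apart is what allows
the caller to print a matching message, as the last hunk of the patch does.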


[PATCH 07/14] tools lib traceevent: Make add_left() return pevent_errno

2013-12-11 Thread Namhyung Kim

From: Namhyung Kim 

So that it can propagate errors properly.

Signed-off-by: Namhyung Kim 
---
 tools/lib/traceevent/parse-filter.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index c08ce594cabe..774c3e4c1d9f 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -648,7 +648,7 @@ rotate_op_right(struct filter_arg *a, struct filter_arg *b)
return arg;
 }
 
-static int add_left(struct filter_arg *op, struct filter_arg *arg)
+static enum pevent_errno add_left(struct filter_arg *op, struct filter_arg *arg)
 {
switch (op->type) {
case FILTER_ARG_EXP:
@@ -667,11 +667,11 @@ static int add_left(struct filter_arg *op, struct filter_arg *arg)
/* left arg of compares must be a field */
if (arg->type != FILTER_ARG_FIELD &&
arg->type != FILTER_ARG_BOOLEAN)
-   return -1;
+   return PEVENT_ERRNO__INVALID_ARG_TYPE;
op->num.left = arg;
break;
default:
-   return -1;
+   return PEVENT_ERRNO__INVALID_ARG_TYPE;
}
return 0;
 }
-- 
1.7.11.7
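
The PEVENT_ERRNO__* codes used throughout this series come from a single
_PE() list in event-parse.h that is expanded with different macro
definitions. A self-contained sketch of that technique (often called an
X-macro) is shown below. All names here (`MY_ERRORS`, `MY_PE`,
`enum my_errno`, `my_strerror()`) and the start value -100000 are
assumptions chosen for illustration, not the library's definitions.

```c
#include <stddef.h>

/* Hypothetical error list; each entry pairs a code with its message. */
#define MY_ERRORS						\
	MY_PE(MEM_ALLOC_FAILED,	"failed to allocate memory"),	\
	MY_PE(INVALID_ARG_TYPE,	"invalid argument type"),	\
	MY_PE(EVENT_NOT_FOUND,	"No event found")

/* First expansion: each entry becomes an enum constant. */
#define MY_PE(code, str) MY_ERRNO__ ## code
enum my_errno {
	MY_ERRNO__SUCCESS = 0,
	MY_ERRNO__START = -100000,	/* assumed base; keeps codes negative */
	MY_ERRORS,
	MY_ERRNO__END,
};

/* Second expansion: the same list becomes a table of messages. */
#undef MY_PE
#define MY_PE(code, str) str
static const char * const my_error_str[] = { MY_ERRORS };

/* Map a code back to its message; NULL for anything outside the list. */
static const char *my_strerror(enum my_errno err)
{
	if (err <= MY_ERRNO__START || err >= MY_ERRNO__END)
		return NULL;

	return my_error_str[err - MY_ERRNO__START - 1];
}
```

Because both the enum and the string table are generated from one list, adding
a code such as UNEXPECTED_TYPE only requires one new line in the list.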


Re: [PATCH] perf list: fix --raw-dump

2013-12-11 Thread Ramkumar Ramachandra

David Ahern wrote:
> Why not make raw_dump a proper argument?

Sure, that'd work too. I was thinking of a minimal way to fix the
problem myself.

> diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
> index 011195e38f21..b553d0c4ca82 100644
> --- a/tools/perf/builtin-list.c
> +++ b/tools/perf/builtin-list.c
> @@ -36,6 +38,10 @@ int cmd_list(int argc, const char
>  print_events(NULL, false);
>  return 0;
>}
> +  if (raw_dump) {
> +print_events(NULL, true);
> +return 0;
> +  }

This won't work because you've put it right below the `if (argc ==
0)`, which executes print_events(). You could move it up and get it to
work.
From 7198a494cfef43395e8683ac3a0576277b8d1d80 Mon Sep 17 00:00:00 2001
From: David Ahern 
Date: Wed, 11 Dec 2013 14:00:20 -0700
Subject: [PATCH] perf list: Fix raw-dump arg

Ramkumar reported that perf list --raw-dump was broken by 44d742e.
Fix by making raw-dump a proper argument.

Signed-off-by: David Ahern 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Ramkumar Ramachandra 
Signed-off-by: Ramkumar Ramachandra 
---
 tools/perf/builtin-list.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 011195e..2629c24 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -19,7 +19,9 @@
 int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 {
 	int i;
+	bool raw_dump = false;
 	const struct option list_options[] = {
+		OPT_BOOLEAN(0, "raw-dump", &raw_dump, "raw dump for completion"),
 		OPT_END()
 	};
 	const char * const list_usage[] = {
@@ -32,6 +34,10 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	setup_pager();
 
+	if (raw_dump) {
+		print_events(NULL, true);
+		return 0;
+	}
 	if (argc == 0) {
 		print_events(NULL, false);
 		return 0;
@@ -53,8 +59,6 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 			print_hwcache_events(NULL, false);
 		else if (strcmp(argv[i], "pmu") == 0)
 			print_pmu_events(NULL, false);
-		else if (strcmp(argv[i], "--raw-dump") == 0)
-			print_events(NULL, true);
 		else {
 			char *sep = strchr(argv[i], ':'), *s;
 			int sep_idx;
-- 
1.8.5.1.113.g8cb5bef.dirty
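
The ordering point raised in the reply above can be reduced to a small
decision function: the flag has to be tested before the "no arguments" case,
otherwise the default listing wins. This is only a sketch; `enum list_action`
and `pick_action()` are hypothetical and do not exist in perf.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the argument handling in a "list" command. */
enum list_action {
	LIST_RAW_DUMP,	/* print_events(NULL, true) in the patch */
	LIST_ALL,	/* print_events(NULL, false) in the patch */
	LIST_SELECTED,	/* walk the remaining positional arguments */
};

/*
 * nr_args is the number of positional arguments left after option
 * parsing.  The raw_dump test comes first on purpose: with "--raw-dump"
 * consumed as an option, nr_args is 0 and the default case would
 * otherwise be taken.
 */
static enum list_action pick_action(bool raw_dump, int nr_args)
{
	if (raw_dump)
		return LIST_RAW_DUMP;

	if (nr_args == 0)
		return LIST_ALL;

	return LIST_SELECTED;
}
```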

Re: [PATCH 04/10] net: stmmac: sunxi platform extensions for GMAC in Allwinner A20 SoC's

2013-12-11 Thread Chen-Yu Tsai

Hi,

On Wed, Dec 11, 2013 at 10:45 PM, srinivas kandagatla
 wrote:
> Hi Chen,
>
> On 11/12/13 12:17, Chen-Yu Tsai wrote:
>
>>>
>>> It would be good to get an actual picture of this hw setup. On ST the
>>> additional glue logic which sits on top of the GMAC is responsible for
>>> selecting the correct retime clock.
>>
>> I would have liked to look at the internal design, how the dwmac core
>> is connected to the clock control, but that is out of the question.
>> Still, based on the documents, I think our clock controller is partially
>> intertwined with the GMAC. It takes GMAC's internally generated clock
>> as one of several inputs, then sends it back to the GMAC to time tx data.
>>
> This is very similar to the ST glue: one of the selected clocks is used
> to retime the tx data lines. This selection is more board dependent.
> It totally depends on how the GMAC is wired up with the PHY.
>
>> Judging by the register definitions listed in the A20 manual,
>> the SoC glue layer clocks is something like this:
>>
>>_
>>   MII TX clock from PHY >-|____|> to GMAC core
>>   GMAC Int. RGMII TX clk >|___\__/__gate---|> to PHY
>>   Ext. 125MHz RGMII TX clk >--|__divider__/|
>>   ||
>>
>>
>> For MII mode, the glue layer should select the TX clock from the PHY.
>> The gate to the PHY should be disabled.
>>
>> For RGMII mode, either the internal clock generated by the GMAC core,
>> or the external 125MHz reference generated by the PHY can be selected.
>> And the clock gate to the PHY should be enabled.
>> If the 125MHz reference is used, the glue layer should select the proper
>> divider (/1, /5, /50) based on the link speed.
>>
>> For GMII mode, under 10/100 speeds, the operation matches MII mode.
>> For gigabit speeds, should use a 125MHz clock (internal or external)
>> and enable the output gate.
>>
>> The glue layer may indeed sit on top or around the GMAC core.
>> Nevertheless, its operational state does depend on the GMAC.
>> The current callbacks present in the stmmac driver are a good model
>> for this.
>
> Callbacks are OK with me, as they give a good level of abstraction, as you
> said.
>
> But I don't like the idea of glue drivers passing the full platform data
> to stmmac, or of the glue driver parsing the platform data, which is going
> to look like very ugly fixups.
>
> Also, currently the callbacks just take pdev, which seems to force glue
> drivers to use platform data as the only data structure to pass information.
>
> My recommendation would be to add a new parameter to these callbacks,
> which can be used to store the glue driver's private data structure; we could
> actually use the bsp_priv variable from platform data.

I agree. The original design provided .custom_data, .custom_cfg,
.bsp_priv fields in the platform data for the callbacks.

I am not aware of any users of these fields in the current kernel.
Maybe the intended users, ST platforms, have migrated to DT.

Merging the three fields would be nice, but may break some unsuspecting
user.

> So the of_data structure would have something like:
>
> struct stmmac_of_data {
> 	void *(*setup)(struct platform_device *pdev);
> 	void (*bus_setup)(struct platform_device *pdev, void *priv,
> 			  void __iomem *ioaddr);
> 	int (*init)(struct platform_device *pdev, void *priv);
> 	void (*exit)(struct platform_device *pdev, void *priv);
> 	void (*fix_mac_speed)(struct platform_device *pdev, void *priv,
> 			      unsigned int speed);
> };
>
> setup() would return a private data struct of glue driver which can be
> stored in plat->bsp_priv. Should be done at DT parsing level.

So this would be called at the end of stmmac_probe_config_dt.
And for non-DT platforms, they should provide .bsp_priv themselves.
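
To make the flow concrete, here is a user-space sketch of how a core driver
could call setup() once, keep the result in bsp_priv, and pass it to the
other callbacks. Everything in it is a stand-in: `struct fake_pdev`,
`struct fake_plat`, `struct glue_ops`, `struct my_glue` and `core_probe()`
are hypothetical and reduced from the structure quoted above, not stmmac
code.

```c
#include <stdlib.h>

/* Stand-ins for struct platform_device and the stmmac platform data. */
struct fake_pdev { int id; };
struct fake_plat { void *bsp_priv; };

/* Callback table shaped like the quoted proposal, reduced to three hooks. */
struct glue_ops {
	void *(*setup)(struct fake_pdev *pdev);
	int (*init)(struct fake_pdev *pdev, void *priv);
	void (*exit)(struct fake_pdev *pdev, void *priv);
};

/* Hypothetical glue-private state returned by setup(). */
struct my_glue {
	int pdev_id;
	int running;
};

static void *my_setup(struct fake_pdev *pdev)
{
	struct my_glue *g = calloc(1, sizeof(*g));

	if (g)
		g->pdev_id = pdev->id;
	return g;
}

static int my_init(struct fake_pdev *pdev, void *priv)
{
	struct my_glue *g = priv;

	(void)pdev;
	g->running = 1;
	return 0;
}

static void my_exit(struct fake_pdev *pdev, void *priv)
{
	struct my_glue *g = priv;

	(void)pdev;
	g->running = 0;
}

static const struct glue_ops my_glue_ops = {
	.setup = my_setup,
	.init = my_init,
	.exit = my_exit,
};

/*
 * Core side: call setup() once (the DT-parsing step in the discussion),
 * store the result in bsp_priv, then hand it to init().  A non-DT
 * platform would leave ops->setup NULL and fill bsp_priv itself.
 */
static int core_probe(struct fake_pdev *pdev, struct fake_plat *plat,
		      const struct glue_ops *ops)
{
	if (ops->setup) {
		plat->bsp_priv = ops->setup(pdev);
		if (!plat->bsp_priv)
			return -1;
	}

	return ops->init ? ops->init(pdev, plat->bsp_priv) : 0;
}
```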

> Regarding the bindings, if Peppe is happy to allow optional SoC-specific
> bindings in it, it is OK with me too.
>
> But all SoC-specific resource names and properties have to be properly
> prefixed so that they are not confused with dwmac properties.

I agree. SoC-specific bindings should have different prefixes and be
documented separately, along with the compatible strings.

> Regarding reset, I think we can add the support in stmmac driver itself.

Will do.

> Regarding clocks, on STi glue we can not represent the configuration in
> proper clock infrastructure.

I see. Could you give me a description of the 4 tx clock inputs?
I would like to learn a bit more about STi glue.

> I am happy to change the sti glue driver to this interface style if you are
> OK with this approach. Or if you have any better ideas, let's discuss.
>
> Feel free to change the APIs proposed above.

I think the original .fix_mac_speed (without *pdev) is ok.
It likely requires link speed, interface type, and any SoC data.
The interface type is buried in platform data, so .setup should take
care to copy it into .bsp_priv.
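
As a sketch of what such a .fix_mac_speed could do with only the private data
and the link speed, the function below applies the MII/GMII/RGMII rules
described earlier in this mail. It assumes the external 125MHz reference is
used for RGMII, so a divider of /1, /5 or /50 is picked from the speed. The
types `enum phy_if` and `struct glue_priv`, the int return value, and the
convention that divider 0 means "unused" are all hypothetical.

```c
/* Hypothetical interface types; .setup would copy this into the priv data. */
enum phy_if { IF_MII, IF_GMII, IF_RGMII };

/* Hypothetical glue-private state kept in .bsp_priv. */
struct glue_priv {
	enum phy_if interface;
	unsigned int divider;	/* 0 means the divider is unused */
	int tx_gate_enabled;	/* clock gate towards the PHY */
};

/* Divider for a 125MHz reference: 125, 25 or 2.5MHz TX clock. */
static int rgmii_divider(unsigned int speed)
{
	switch (speed) {
	case 1000:
		return 1;
	case 100:
		return 5;
	case 10:
		return 50;
	default:
		return -1;
	}
}

/* Returns 0 on success, -1 for a link speed that cannot be handled. */
static int glue_fix_mac_speed(void *priv, unsigned int speed)
{
	struct glue_priv *g = priv;
	int div;

	if (g->interface == IF_RGMII) {
		div = rgmii_divider(speed);
		if (div < 0)
			return -1;
		g->divider = div;
		g->tx_gate_enabled = 1;
		return 0;
	}

	if (g->interface == IF_GMII && speed == 1000) {
		/* Gigabit GMII: 125MHz clock, output gate enabled. */
		g->divider = 1;
		g->tx_gate_enabled = 1;
		return 0;
	}

	/* MII, or GMII at 10/100: TX clock comes from the PHY, gate off. */
	g->divider = 0;
	g->tx_gate_enabled = 0;
	return 0;
}
```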

About

[PATCH v8 3/4] sched/numa: use wrapper function task_faults_idx to calculate index in group_faults

2013-12-11 Thread Wanpeng Li

Use the wrapper function task_faults_idx() to calculate the index in group_faults().

Reviewed-by: Naoya Horiguchi 
Acked-by: Mel Gorman 
Acked-by: David Rientjes 
Signed-off-by: Wanpeng Li 
---
 kernel/sched/fair.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c3f6ff9..8a00879 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -935,7 +935,8 @@ static inline unsigned long group_faults(struct task_struct *p, int nid)
if (!p->numa_group)
return 0;
 
-   return p->numa_group->faults[2*nid] + p->numa_group->faults[2*nid+1];
+   return p->numa_group->faults[task_faults_idx(nid, 0)] +
+   p->numa_group->faults[task_faults_idx(nid, 1)];
 }
 
 /*
-- 
1.7.7.6
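
The diff above implies that the faults array holds two counters per node and
that the wrapper maps (nid, 0) to 2*nid and (nid, 1) to 2*nid+1. A user-space
sketch under that assumption is given below. The body of task_faults_idx()
is inferred from the replaced expression rather than quoted from the kernel,
and group_faults() here takes a plain array instead of a task_struct.

```c
/*
 * Inferred from the expression being replaced in the patch: two counters
 * per node, selected by the second argument (0 or 1).
 */
static inline int task_faults_idx(int nid, int priv)
{
	return 2 * nid + priv;
}

/* Sum both counters of one node; "faults" stands in for numa_group->faults. */
static unsigned long group_faults(const unsigned long *faults, int nid)
{
	return faults[task_faults_idx(nid, 0)] +
	       faults[task_faults_idx(nid, 1)];
}
```

Routing every access through the wrapper keeps the array layout in one place,
so a later change to the layout would not require touching each caller.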


[PATCH v8 1/4] sched/numa: drop sysctl_numa_balancing_settle_count sysctl

2013-12-11 Thread Wanpeng Li

Changelog:
 v7 -> v8:
  * remove references to it in Documentation/sysctl/kernel.txt 

Commit 887c290e (sched/numa: Decide whether to favour task or group weights
based on swap candidate relationships) dropped the check against
sysctl_numa_balancing_settle_count; this patch removes the sysctl.

Acked-by: Mel Gorman 
Reviewed-by: Rik van Riel 
Acked-by: David Rientjes 
Signed-off-by: Wanpeng Li 
---
 Documentation/sysctl/kernel.txt |5 -
 include/linux/sched/sysctl.h|1 -
 kernel/sched/fair.c |9 -
 kernel/sysctl.c |7 ---
 4 files changed, 0 insertions(+), 22 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 26b7ee4..6d48640 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -428,11 +428,6 @@ rate for each task.
 numa_balancing_scan_size_mb is how many megabytes worth of pages are
 scanned for a given scan.
 
-numa_balancing_settle_count is how many scan periods must complete before
-the schedule balancer stops pushing the task towards a preferred node. This
-gives the scheduler a chance to place the task on an alternative node if the
-preferred node is overloaded.
-
 numa_balancing_migrate_deferred is how many page migrations get skipped
 unconditionally, after a page migration is skipped because a page is shared
 with other tasks. This reduces page migration overhead, and determines
diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 41467f8..31e0193 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -48,7 +48,6 @@ extern unsigned int sysctl_numa_balancing_scan_delay;
 extern unsigned int sysctl_numa_balancing_scan_period_min;
 extern unsigned int sysctl_numa_balancing_scan_period_max;
 extern unsigned int sysctl_numa_balancing_scan_size;
-extern unsigned int sysctl_numa_balancing_settle_count;
 
 #ifdef CONFIG_SCHED_DEBUG
 extern unsigned int sysctl_sched_migration_cost;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fd773ad..acdef27 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -886,15 +886,6 @@ static unsigned int task_scan_max(struct task_struct *p)
return max(smin, smax);
 }
 
-/*
- * Once a preferred node is selected the scheduler balancer will prefer moving
- * a task to that node for sysctl_numa_balancing_settle_count number of PTE
- * scans. This will give the process the chance to accumulate more faults on
- * the preferred node but still allow the scheduler to move the task again if
- * the nodes CPUs are overloaded.
- */
-unsigned int sysctl_numa_balancing_settle_count __read_mostly = 4;
-
 static void account_numa_enqueue(struct rq *rq, struct task_struct *p)
 {
rq->nr_numa_running += (p->numa_preferred_nid != -1);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 34a6047..c8da99f 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -385,13 +385,6 @@ static struct ctl_table kern_table[] = {
.proc_handler   = proc_dointvec,
},
{
-   .procname   = "numa_balancing_settle_count",
-   .data   = &sysctl_numa_balancing_settle_count,
-   .maxlen = sizeof(unsigned int),
-   .mode   = 0644,
-   .proc_handler   = proc_dointvec,
-   },
-   {
.procname   = "numa_balancing_migrate_deferred",
.data   = &sysctl_numa_balancing_migrate_deferred,
.maxlen = sizeof(unsigned int),
-- 
1.7.7.6


[PATCH v8 4/4] sched/numa: fix period_slot recalculation

2013-12-11 Thread Wanpeng Li

Changelog:
 v3 -> v4:
  * remove period_slot recalculation

The original code is as intended and was meant to scale the difference
between the NUMA_PERIOD_THRESHOLD and local/remote ratio when adjusting
the scan period. The period_slot recalculation can be dropped.

Reviewed-by: Naoya Horiguchi 
Acked-by: Mel Gorman 
Acked-by: David Rientjes 
Signed-off-by: Wanpeng Li 
---
 kernel/sched/fair.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8a00879..e7ca79a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1356,7 +1356,6 @@ static void update_task_scan_period(struct task_struct *p,
 * scanning faster if shared accesses dominate as it may
 * simply bounce migrations uselessly
 */
-   period_slot = DIV_ROUND_UP(diff, NUMA_PERIOD_SLOTS);
ratio = DIV_ROUND_UP(private * NUMA_PERIOD_SLOTS, (private + shared));
diff = (diff * ratio) / NUMA_PERIOD_SLOTS;
}
-- 
1.7.7.6
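
The two lines that remain after this patch scale diff by the share of
private faults. A small worked sketch of that arithmetic follows. It assumes
NUMA_PERIOD_SLOTS is 10, which is not shown in the diff, and `scale_diff()`
is a hypothetical helper that isolates the calculation.

```c
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define NUMA_PERIOD_SLOTS	10	/* assumed value, not shown in the patch */

/*
 * Scale "diff" by the fraction of private faults, expressed in slots.
 * The caller must ensure that private + shared is not zero.
 */
static int scale_diff(int diff, unsigned long private, unsigned long shared)
{
	int ratio = DIV_ROUND_UP(private * NUMA_PERIOD_SLOTS,
				 (private + shared));

	return (diff * ratio) / NUMA_PERIOD_SLOTS;
}
```

With 3 private and 1 shared fault the ratio is DIV_ROUND_UP(30, 4) = 8 slots,
so a diff of 100 becomes 80. With only private faults the ratio is 10 and diff
is unchanged.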


[PATCH v8 2/4] sched/numa: use wrapper function task_node to get node which task is on

2013-12-11 Thread Wanpeng Li

Changelog:
 v2 -> v3:
  * translate cpu_to_node(task_cpu(p)) to task_node(p) in sched/debug.c

Use the wrapper function task_node() to get the node the task is on.

Acked-by: Mel Gorman 
Reviewed-by: Naoya Horiguchi 
Reviewed-by: Rik van Riel 
Acked-by: David Rientjes 
Signed-off-by: Wanpeng Li 
---
 kernel/sched/debug.c |2 +-
 kernel/sched/fair.c  |4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 5c34d18..374fe04 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -139,7 +139,7 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
0LL, 0LL, 0LL, 0L, 0LL, 0L, 0LL, 0L);
 #endif
 #ifdef CONFIG_NUMA_BALANCING
-   SEQ_printf(m, " %d", cpu_to_node(task_cpu(p)));
+   SEQ_printf(m, " %d", task_node(p));
 #endif
 #ifdef CONFIG_CGROUP_SCHED
SEQ_printf(m, " %s", task_group_path(task_group(p)));
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index acdef27..c3f6ff9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1216,7 +1216,7 @@ static int task_numa_migrate(struct task_struct *p)
 * elsewhere, so there is no point in (re)trying.
 */
if (unlikely(!sd)) {
-   p->numa_preferred_nid = cpu_to_node(task_cpu(p));
+   p->numa_preferred_nid = task_node(p);
return -EINVAL;
}
 
@@ -1283,7 +1283,7 @@ static void numa_migrate_preferred(struct task_struct *p)
p->numa_migrate_retry = jiffies + HZ;
 
/* Success if task is already running on preferred CPU */
-   if (cpu_to_node(task_cpu(p)) == p->numa_preferred_nid)
+   if (task_node(p) == p->numa_preferred_nid)
return;
 
/* Otherwise, try migrate to a CPU on the preferred node */
-- 
1.7.7.6


Re: [Patch] Read CONFIG_RD_ variables for initramfs compression

2013-12-11 Thread P J P

  Hello Simon, Andrew

+-- On Wed, 11 Dec 2013, Simon Guinot wrote --+
| IIUC this patch, the INITRAMFS_COMPRESSION_* options are now
| ignored/useless. Don't you think we should remove them from the
| usr/Kconfig file ?

  -> https://lkml.org/lkml/2013/11/25/21

I've pushed a patch from Mr Hristo to the same effect. I guess it's still in 
the queue. I haven't received any review for it yet. (...Andrew?)

| Actually, I think this patch makes the initramfs compression
| configuration quite confusing. Consider the following configuration
| for a 3.13-rc3 kernel:
| 
| CONFIG_RD_GZIP=y
| CONFIG_RD_LZMA=y
| CONFIG_INITRAMFS_COMPRESSION_LZMA=y
| 
| This now produces a gzipped initramfs_data.cpio, where it previously
| produced an lzma one.

  That is because, when multiple options are set, CONFIG_RD_GZIP is checked 
last in the usr/Makefile.
  ...
  # Gzip
  suffix_$(CONFIG_RD_GZIP)   = .gz

Hope it helps.
--
Prasad J Pandit / Red Hat Security Response Team

Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data

2013-12-11 Thread Dave Young

> > >  
> > > +void __init parse_efi_setup(u64 phys_addr)
> > > +{
> > > + struct setup_data *sd;
> > > +
> > > + if (!efi_enabled(EFI_64BIT)) {
> > > + pr_warn("SETUP_EFI not supported on 32-bit\n");
> > > + return;
> > > + }
> > 
> > Shouldn't this function be in two versions in efi_64.c and efi_32.c?
> > This way you don't need this check with cryptic printk message.
> 
> Ok, will update.

Rethinking this issue: to move them to efi_$(BITS).c I would need to change
efi_setup from a static variable to an extern. It does not look worth it.

Thanks
Dave

Re: [PATCH v5 08/14] efi: export efi runtime memory mapping to sysfs

2013-12-11 Thread Dave Young

> 
> > 
> > and the EFI_BOOT* tests can be done in save_runtime_map and also the
> > error handling can happen there. This way efi_map_regions() won't
> > need to know about anything. This way, you can later move the whole
> > save_runtime_map() function to efi-kexec.c just by taking it without any
> > need for untangling.
> > 
> > > +out_save_runtime:
> > > +   kfree(efi_runtime_map);
> > > +   nr_efi_runtime_map = 0;
> > > +   efi_runtime_map = NULL;
> > 
> > This can go there too.
> 
> This section can go into save_runtime_map(), but it looks clearer to keep it
> here.

BTW, I will restructure the whole code when I move it to efi_kexec.c,
so no need to worry about it now? If you have a strong opinion I can move it, though.

Thanks
Dave

Re: [PATCH] USB: core: Add warm reset while reset-resuming SuperSpeed HUBs

2013-12-11 Thread Julius Werner

>> ...although, the spec says that it does not wait for the port resets
>> to complete.  As far as I can see re-issuing a warm reset and waiting
>> is the only way to guarantee the core times the recovery.  Presumably
>> the portstatus debounce in hub_activate() mitigates this, but that
>> 100ms is less than a full reset timeout.

It's definitely not just a timing issue for us. I can't reproduce all
the same cases as Vikas, but when I attach a USB analyzer to the ones
I do see the host controller doesn't even start sending a reset.

>>> The xHCI spec requires that when the xHCI host is reset, a USB reset is
>>> driven down the USB 3.0 ports.  If hot reset fails, the port may migrate
>>> to warm reset.  See table 32 in the xHCI spec, in the definition of
>>> HCRST.  It sounds like this host doesn't drive a USB reset down USB 3.0
>>> ports at all on host controller reset?

Oh, interesting, I hadn't seen that yet. So I guess the spec itself is
fine if it were followed to the letter.

I did some more tests about this on my Exynos machine: when I put a
device to autosuspend (U3) and manually poke the xHC reset bit, I do
see an automatic warm reset on the analyzer and the ports manage to
retrain to U0. But after a system suspend/resume which calls
xhci_reset() in the process, there is no reset on the wire. I also
noticed that it doesn't drive a reset (even after manual poking) when
there is no device connected on the other end of the analyzer.

So this might be our problem: maybe these host controllers (Synopsys
DesignWare) issue the spec-mandated warm reset only on ports where
they think there is a device attached. But after a system
suspend/resume (where the whole IP block on the SoC was powered down),
the host controller cannot know that there is still a device with an
active power session attached, and therefore doesn't drive the reset
on its own.

Even though this is a host controller bug, we still have to deal with
it somehow. I guess we could move the code into xhci_plat_resume() and
hide it behind a quirk to lessen the impact. But since reset_resume is
not a common case for most host controllers, it's hard to say if this
is DesignWare specific or a more widespread implementation mistake.

Re: [PATCH] MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers

2013-12-11 Thread Shawn Guo

Hi Bjorn,

On Wed, Dec 11, 2013 at 11:32:37AM -0700, Bjorn Helgaas wrote:
> +PCI DRIVER FOR IMX6
> +M:   Shawn Guo 

Thanks for the nomination.  But I think a better person for this
position would be Richard Zhu (copied).  He knows
the driver and controller much better than I do, and most importantly
he is the driver owner for the Freescale kernel and has contact with the
Freescale PCIe hardware people.

Shawn

> +L:   linux-...@vger.kernel.org
> +L:   linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
> +S:   Maintained
> +F:   drivers/pci/host/*imx6*


Re: [PATCH v2 0/5] net: macb updates

2013-12-11 Thread Olof Johansson

Hi Soren,

On Tue, Dec 10, 2013 at 4:07 PM, Soren Brinkmann
 wrote:

> Soren Brinkmann (5):
>   net: macb: Adjust tx_clk when link speed changes

This patch causes build issues on some at91 platforms, namely
at91sam9263 that lacks programmable clocks. So it doesn't implement
clk_set_rate() and clk_round_rate().

I don't know if there's any reasonable config option to check for
(that wouldn't add at91-specific stuff to the driver which we don't
want). So I suspect the best way would be to implement dummy versions
for at91 when CONFIG_AT91_PROGRAMMABLE_CLOCKS isn't set.

Nicolas, you OK with that? It'd be something like the below
(copy-paste, whitespace damage, just RFC):

diff --git a/arch/arm/mach-at91/clock.c b/arch/arm/mach-at91/clock.c
index 6b2630a..17c52a7 100644
--- a/arch/arm/mach-at91/clock.c
+++ b/arch/arm/mach-at91/clock.c
@@ -459,6 +459,22 @@ static void __init init_programmable_clock(struct clk *clk)
clk->rate_hz = parent->rate_hz / pmc_prescaler_divider(pckr);
 }

+#else  /* CONFIG_AT91_PROGRAMMABLE_CLOCKS */
+
+int clk_set_rate(struct clk *clk, unsigned long rate)
+{
+   if (rate == clk_get_rate(clk))
+   return 0;
+
+   return -EINVAL;
+}
+
+long clk_round_rate(struct clk *clk, unsigned long rate)
+{
+   /* There's really nothing sane to return here. */
+   return clk_get_rate(clk);
+}
+
 #endif /* CONFIG_AT91_PROGRAMMABLE_CLOCKS */
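
To show what a caller would observe with these dummy versions, here is a
user-space mock. `struct clk`, the `dummy_clk_*()` helpers and
`adjust_tx_clk()` are stand-ins written for this sketch; in particular
adjust_tx_clk() is a hypothetical round-then-set caller and is not the macb
code.

```c
#include <errno.h>

/* Minimal mock of a fixed-rate clock. */
struct clk {
	unsigned long rate_hz;
};

static unsigned long dummy_clk_get_rate(struct clk *clk)
{
	return clk->rate_hz;
}

/* Succeeds only when asked for the rate the clock already has. */
static int dummy_clk_set_rate(struct clk *clk, unsigned long rate)
{
	if (rate == dummy_clk_get_rate(clk))
		return 0;

	return -EINVAL;
}

/* A fixed clock can only "round" to its current rate. */
static long dummy_clk_round_rate(struct clk *clk, unsigned long rate)
{
	(void)rate;
	return (long)dummy_clk_get_rate(clk);
}

/* Hypothetical caller: round the wanted rate first, then set it. */
static int adjust_tx_clk(struct clk *clk, unsigned long wanted)
{
	long rounded = dummy_clk_round_rate(clk, wanted);

	if (rounded < 0)
		return (int)rounded;

	return dummy_clk_set_rate(clk, (unsigned long)rounded);
}
```

A caller that rounds before setting therefore sees success while the clock
keeps its current rate, whereas a direct request for a different rate fails
with -EINVAL.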

Re: [PATCH v7 4/4] sched/numa: fix period_slot recalculation

2013-12-11 Thread David Rientjes

On Thu, 12 Dec 2013, Wanpeng Li wrote:

> Changelog:
>  v3 -> v4:
>   * remove period_slot recalculation
> 
> The original code is as intended and was meant to scale the difference
> between the NUMA_PERIOD_THRESHOLD and local/remote ratio when adjusting
> the scan period. The period_slot recalculation can be dropped.
> 
> Reviewed-by: Naoya Horiguchi 
> Acked-by: Mel Gorman 
> Signed-off-by: Wanpeng Li 

Acked-by: David Rientjes 

Linux 3.12.5

2013-12-11 Thread Greg KH

I'm announcing the release of the 3.12.5 kernel.

All users of the 3.12 kernel series must upgrade.

The updated 3.12.y git tree can be found at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.12.y
and can be browsed at the normal kernel.org git web browser:
	http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile|2 
 arch/arm/boot/dts/armada-370-db.dts |   28 +-
 arch/arm/boot/dts/armada-370-xp.dtsi|2 
 arch/arm/boot/dts/armada-xp-mv78230.dtsi|   24 +-
 arch/arm/boot/dts/armada-xp-mv78260.dtsi|  109 --
 arch/arm/boot/dts/omap4-panda-common.dtsi   |   20 -
 arch/arm/configs/multi_v7_defconfig |2 
 arch/arm/include/asm/pgtable.h  |2 
 arch/arm/mach-at91/sama5d3.c|6 
 arch/arm/mach-footbridge/common.c   |3 
 arch/arm/mach-footbridge/dc21285.c  |2 
 arch/arm/mach-footbridge/ebsa285.c  |   22 +-
 arch/arm/mm/mmap.c  |2 
 arch/arm/mm/pgd.c   |3 
 arch/parisc/kernel/sys_parisc.c |   25 +-
 arch/s390/crypto/aes_s390.c |   31 +--
 arch/x86/Makefile   |8 
 block/blk-cgroup.h  |8 
 crypto/algif_hash.c |3 
 crypto/algif_skcipher.c |3 
 crypto/authenc.c|7 
 crypto/ccm.c|3 
 drivers/ata/libata-scsi.c   |1 
 drivers/char/i8k.c  |7 
 drivers/cpuidle/cpuidle.c   |2 
 drivers/firewire/sbp2.c |1 
 drivers/firmware/efi/efi-pstore.c   |  163 ++--
 drivers/firmware/efi/efivars.c  |   12 -
 drivers/firmware/efi/vars.c |   12 -
 drivers/gpio/gpio-mpc8xxx.c |8 
 drivers/input/Kconfig   |2 
 drivers/input/keyboard/Kconfig  |4 
 drivers/input/serio/Kconfig |6 
 drivers/misc/enclosure.c|7 
 drivers/misc/mei/hw-me-regs.h   |6 
 drivers/misc/mei/pci-me.c   |5 
 drivers/net/can/c_can/c_can.c   |   21 +-
 drivers/net/can/flexcan.c   |2 
 drivers/net/can/sja1000/sja1000.c   |   17 -
 drivers/net/ethernet/broadcom/tg3.c |   12 -
 drivers/net/wireless/iwlwifi/dvm/tx.c   |   14 -
 drivers/pnp/driver.c|   12 +
 drivers/scsi/3w-9xxx.c  |3 
 drivers/scsi/3w-sas.c   |3 
 drivers/scsi/3w-.c  |3 
 drivers/scsi/aacraid/linit.c|1 
 drivers/scsi/arcmsr/arcmsr_hba.c|1 
 drivers/scsi/bfa/bfa_fcs.h  |1 
 drivers/scsi/bfa/bfa_fcs_lport.c|   14 +
 drivers/scsi/bfa/bfad_attr.c|7 
 drivers/scsi/gdth.c |1 
 drivers/scsi/hosts.c|1 
 drivers/scsi/hpsa.c |5 
 drivers/scsi/ipr.c  |3 
 drivers/scsi/ips.c  |1 
 drivers/scsi/libsas/sas_ata.c   |2 
 drivers/scsi/megaraid.c |1 
 drivers/scsi/megaraid/megaraid_mbox.c   |1 
 drivers/scsi/megaraid/megaraid_sas_base.c   |1 
 drivers/scsi/pmcraid.c  |1 
 drivers/scsi/sd.c   |6 
 drivers/scsi/storvsc_drv.c  |1 
 drivers/spi/spi-pxa2xx.c|2 
 drivers/tty/n_tty.c |6 
 drivers/usb/class/cdc-acm.c |2 
 drivers/usb/serial/ftdi_sio.c   |   37 ++-
 drivers/usb/serial/mos7840.c|   32 +--
 drivers/usb/serial/pl2303.c |   30 +-
 drivers/usb/serial/spcp8x5.c|   30 +-

Re: Linux 3.4.74

2013-12-11 Thread Greg KH


diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801
index 99d4e442b77d..8bb57d7c12ea 100644
--- a/Documentation/i2c/busses/i2c-i801
+++ b/Documentation/i2c/busses/i2c-i801
@@ -22,6 +22,7 @@ Supported adapters:
   * Intel Panther Point (PCH)
   * Intel Lynx Point (PCH)
   * Intel Lynx Point-LP (PCH)
+  * Intel Avoton (SOC)
 Datasheets: Publicly available at the Intel website
 
 On Intel Patsburg and later chipsets, both the normal host SMBus controller
diff --git a/Makefile b/Makefile
index 2ea579016292..ce277ff0fd72 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 4
-SUBLEVEL = 73
+SUBLEVEL = 74
 EXTRAVERSION =
 NAME = Saber-toothed Squirrel
 
diff --git a/arch/um/os-Linux/start_up.c b/arch/um/os-Linux/start_up.c
index 425162e22af5..2f53b892fd80 100644
--- a/arch/um/os-Linux/start_up.c
+++ b/arch/um/os-Linux/start_up.c
@@ -15,6 +15,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include "init.h"
 #include "os.h"
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 850246206b12..585c3b279feb 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -117,6 +117,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
 	if (flags & MSG_SENDPAGE_NOTLAST)
 		flags |= MSG_MORE;
 
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		flags |= MSG_MORE;
+
 	lock_sock(sk);
 	sg_init_table(ctx->sgl.sg, 1);
 	sg_set_page(ctx->sgl.sg, page, size, offset);
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index a19c027b29bd..918a3b4148b8 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -381,6 +381,9 @@ static ssize_t skcipher_sendpage(struct socket *sock, struct page *page,
 	if (flags & MSG_SENDPAGE_NOTLAST)
 		flags |= MSG_MORE;
 
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		flags |= MSG_MORE;
+
 	lock_sock(sk);
 	if (!ctx->more && ctx->used)
 		goto unlock;
diff --git a/crypto/authenc.c b/crypto/authenc.c
index 5ef7ba6b6a76..d21da2f0f508 100644
--- a/crypto/authenc.c
+++ b/crypto/authenc.c
@@ -368,9 +368,10 @@ static void crypto_authenc_encrypt_done(struct crypto_async_request *req,
 	if (!err) {
 		struct crypto_aead *authenc = crypto_aead_reqtfm(areq);
 		struct crypto_authenc_ctx *ctx = crypto_aead_ctx(authenc);
-		struct ablkcipher_request *abreq = aead_request_ctx(areq);
-		u8 *iv = (u8 *)(abreq + 1) +
-			 crypto_ablkcipher_reqsize(ctx->enc);
+		struct authenc_request_ctx *areq_ctx = aead_request_ctx(areq);
+		struct ablkcipher_request *abreq = (void *)(areq_ctx->tail
+							    + ctx->reqoff);
+		u8 *iv = (u8 *)abreq - crypto_ablkcipher_ivsize(ctx->enc);
 
 		err = crypto_authenc_genicv(areq, iv, 0);
 	}
diff --git a/crypto/ccm.c b/crypto/ccm.c
index 32fe1bb5decb..18d64ad0433c 100644
--- a/crypto/ccm.c
+++ b/crypto/ccm.c
@@ -271,7 +271,8 @@ static int crypto_ccm_auth(struct aead_request *req, struct scatterlist *plain,
 	}
 
 	/* compute plaintext into mac */
-	get_data_to_compute(cipher, pctx, plain, cryptlen);
+	if (cryptlen)
+		get_data_to_compute(cipher, pctx, plain, cryptlen);
 
 out:
 	return err;
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 60662545cd14..c20f1578d393 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -268,6 +268,30 @@ static const struct pci_device_id ahci_pci_tbl[] = {
 	{ PCI_VDEVICE(INTEL, 0x8c07), board_ahci }, /* Lynx Point RAID */
 	{ PCI_VDEVICE(INTEL, 0x8c0e), board_ahci }, /* Lynx Point RAID */
 	{ PCI_VDEVICE(INTEL, 0x8c0f), board_ahci }, /* Lynx Point RAID */
+	{ PCI_VDEVICE(INTEL, 0x9c02), board_ahci }, /* Lynx Point-LP AHCI */
+	{ PCI_VDEVICE(INTEL, 0x9c03), board_ahci }, /* Lynx Point-LP AHCI */
+	{ PCI_VDEVICE(INTEL, 0x9c04), board_ahci }, /* Lynx Point-LP RAID */
+	{ PCI_VDEVICE(INTEL, 0x9c05), board_ahci }, /* Lynx Point-LP RAID */
+	{ PCI_VDEVICE(INTEL, 0x9c06), board_ahci }, /* Lynx Point-LP RAID */
+	{ PCI_VDEVICE(INTEL, 0x9c07), board_ahci }, /* Lynx Point-LP RAID */
+	{ PCI_VDEVICE(INTEL, 0x9c0e), board_ahci }, /* Lynx Point-LP RAID */
+	{ PCI_VDEVICE(INTEL, 0x9c0f), board_ahci }, /* Lynx Point-LP RAID */
+	{ PCI_VDEVICE(INTEL, 0x1f22), board_ahci }, /* Avoton AHCI */
+	{ PCI_VDEVICE(INTEL, 0x1f23), board_ahci }, /* Avoton AHCI */
+	{ PCI_VDEVICE(INTEL, 0x1f24), board_ahci }, /* Avoton RAID */
+	{ PCI_VDEVICE(INTEL, 0x1f25), board_ahci }, /* Avoton RAID */
+	{ PCI_VDEVICE(INTEL, 0x1f26), board_ahci }, /* Avoton RAID */
+	{ PCI_VDEVICE(INTEL, 0x1f27), board_ahci }, /* Avoton RAID */
+	{ PCI_VDEVICE(INTEL, 0x1f2e), board_ahci }, /* Avoton RAID */
+	{

Linux 3.10.24

2013-12-11 Thread Greg KH

I'm announcing the release of the 3.10.24 kernel.

All users of the 3.10 kernel series must upgrade.

The updated 3.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.10.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile  |2 
 arch/arm/boot/dts/armada-370-xp.dtsi  |2 
 arch/arm/boot/dts/armada-xp-mv78230.dtsi  |   16 +++---
 arch/arm/boot/dts/armada-xp-mv78260.dtsi  |   78 --
 arch/arm/include/asm/pgtable.h|2 
 arch/arm/mach-at91/sama5d3.c  |6 +-
 arch/arm/mach-footbridge/common.c |3 +
 arch/arm/mach-footbridge/dc21285.c|2 
 arch/arm/mach-footbridge/ebsa285.c|   22 +---
 arch/arm/mm/mmap.c|2 
 arch/arm/mm/pgd.c |3 -
 arch/parisc/kernel/sys_parisc.c   |   25 +
 arch/s390/crypto/aes_s390.c   |   31 ++-
 arch/x86/Makefile |8 ++-
 block/blk-cgroup.h|8 +--
 crypto/algif_hash.c   |3 +
 crypto/algif_skcipher.c   |3 +
 crypto/authenc.c  |7 +-
 crypto/ccm.c  |3 -
 drivers/ata/libata-scsi.c |1 
 drivers/char/i8k.c|7 ++
 drivers/firewire/sbp2.c   |1 
 drivers/gpio/gpio-mpc8xxx.c   |8 ++-
 drivers/hid/hid-ids.h |5 +
 drivers/hid/usbhid/hid-quirks.c   |3 +
 drivers/input/Kconfig |2 
 drivers/input/keyboard/Kconfig|4 -
 drivers/input/serio/Kconfig   |6 +-
 drivers/misc/enclosure.c  |7 ++
 drivers/misc/mei/hw-me-regs.h |6 +-
 drivers/misc/mei/pci-me.c |5 +
 drivers/net/can/c_can/c_can.c |   21 +---
 drivers/net/can/sja1000/sja1000.c |   17 +++---
 drivers/net/ethernet/broadcom/tg3.c   |   12 ++--
 drivers/net/ethernet/smsc/smc91x.h|   22 +---
 drivers/net/wireless/iwlwifi/dvm/tx.c |   14 +
 drivers/scsi/3w-9xxx.c|3 -
 drivers/scsi/3w-sas.c |3 -
 drivers/scsi/3w-.c|3 -
 drivers/scsi/aacraid/linit.c  |1 
 drivers/scsi/arcmsr/arcmsr_hba.c  |1 
 drivers/scsi/bfa/bfa_fcs.h|1 
 drivers/scsi/bfa/bfa_fcs_lport.c  |   14 -
 drivers/scsi/bfa/bfad_attr.c  |7 --
 drivers/scsi/gdth.c   |1 
 drivers/scsi/hosts.c  |1 
 drivers/scsi/hpsa.c   |5 +
 drivers/scsi/ipr.c|3 -
 drivers/scsi/ips.c|1 
 drivers/scsi/libsas/sas_ata.c |2 
 drivers/scsi/megaraid.c   |1 
 drivers/scsi/megaraid/megaraid_mbox.c |1 
 drivers/scsi/megaraid/megaraid_sas_base.c |1 
 drivers/scsi/pmcraid.c|1 
 drivers/scsi/sd.c |6 ++
 drivers/scsi/storvsc_drv.c|1 
 drivers/usb/class/cdc-acm.c   |2 
 drivers/usb/serial/ftdi_sio.c |   37 +-
 drivers/usb/serial/mos7840.c  |   32 ++--
 drivers/usb/serial/pl2303.c   |   32 +---
 drivers/usb/serial/spcp8x5.c  |   30 +--
 drivers/xen/grant-table.c |6 +-
 fs/nfs/nfs4proc.c |   10 +++
 fs/pipe.c |   39 +++
 include/crypto/scatterwalk.h  |3 -
 include/linux/genalloc.h  |4 -
 include/scsi/scsi_host.h  |6 ++
 kernel/irq/pm.c   |2 
 kernel/time/timekeeping.c |2 
 lib/genalloc.c|   19 ---
 net/ipv4/udp.c|3 +
 sound/pci/hda/patch_realtek.c |   55 ++---
 sound/soc/codecs/wm8731.c |4 -
 sound/soc/codecs/wm8990.c |2 
 74 files changed, 456 insertions(+), 256 deletions(-)

AceLan Kao (2):
  HID: usbhid: quirk for Synaptics Large Touchccreen
  HID: usbhid: quirk for SiS Touchscreen

Alan Cox (1):
  drivers/char/i8k.c: add Dell XPLS L421X

Arnaud Ebalard (2):
  ARM: mvebu: fix second and third PCIe unit of Armada XP mv78260
  ARM: mvebu: second PCIe unit of Armada XP mv78230 is only x1 capable

Bo Shen (1):
  ASoC: wm8731: fix dsp mode configuration

Colin Leitner (4):
  USB: pl2303: fixed handling of CS5 setting
  USB: ftdi_sio: fixed

Linux 3.4.74

2013-12-11 Thread Greg KH

I'm announcing the release of the 3.4.74 kernel.

All users of the 3.4 kernel series must upgrade.

The updated 3.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.4.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Documentation/i2c/busses/i2c-i801  |1 +
 Makefile   |2 +-
 arch/um/os-Linux/start_up.c|2 ++
 crypto/algif_hash.c|3 +++
 crypto/algif_skcipher.c|3 +++
 crypto/authenc.c   |7 ---
 crypto/ccm.c   |3 ++-
 drivers/ata/ahci.c |   24 
 drivers/char/i8k.c |7 +++
 drivers/gpio/gpio-mpc8xxx.c|8 ++--
 drivers/i2c/busses/Kconfig |1 +
 drivers/i2c/busses/i2c-i801.c  |3 +++
 drivers/input/Kconfig  |2 +-
 drivers/input/keyboard/Kconfig |4 ++--
 drivers/input/serio/Kconfig|6 +++---
 drivers/misc/enclosure.c   |7 +++
 drivers/net/ethernet/smsc/smc91x.h |   22 --
 drivers/scsi/hpsa.c|4 ++--
 drivers/scsi/libsas/sas_ata.c  |2 +-
 drivers/usb/class/cdc-acm.c|2 ++
 drivers/usb/serial/mos7840.c   |   32 
 drivers/usb/serial/pl2303.c|   32 +++-
 drivers/usb/serial/spcp8x5.c   |   30 ++
 fs/nfs/nfs4proc.c  |   10 --
 include/crypto/scatterwalk.h   |3 ++-
 kernel/irq/pm.c|2 +-
 net/ipv4/udp.c |3 +++
 sound/soc/codecs/wm8731.c  |4 ++--
 sound/soc/codecs/wm8990.c  |2 ++
 29 files changed, 142 insertions(+), 89 deletions(-)

Alan Cox (1):
  drivers/char/i8k.c: add Dell XPLS L421X

Bo Shen (1):
  ASoC: wm8731: fix dsp mode configuration

Colin Leitner (3):
  USB: pl2303: fixed handling of CS5 setting
  USB: mos7840: correct handling of CS5 setting
  USB: spcp8x5: correct handling of CS5 setting

Dan Williams (1):
  SCSI: libsas: fix usage of ata_tf_to_fis

David Cluytens (1):
  USB: cdc-acm: Added support for the Lenovo RD02-D400 USB Modem

Greg Kroah-Hartman (1):
  Linux 3.4.74

Horia Geanta (1):
  crypto: ccm - Fix handling of zero plaintext when computing mac

James Bottomley (1):
  SCSI: enclosure: fix WARN_ON in dual path device removing

James Ralston (1):
  ahci: Add Device IDs for Intel Lynx Point-LP PCH

Laxman Dewangan (1):
  irq: Enable all irqs unconditionally in irq_resume

Linus Walleij (1):
  net: smc91: fix crash regression on the versatile

Liu Gang (1):
  powerpc/gpio: Fix the wrong GPIO input data on MPC8572/MPC8536

Mark Brown (1):
  ASoC: wm8990: Mark the register map as dirty when powering down

Sergei Trofimovich (1):
  um: add missing declaration of 'getrlimit()' and friends

Seth Heasley (2):
  ahci: AHCI-mode SATA patch for Intel Avoton DeviceIDs
  i2c: i801: SMBus patch for Intel Avoton DeviceIDs

Shawn Landden (1):
  net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST

Stephen M. Cameron (2):
  SCSI: hpsa: do not discard scsi status on aborted commands
  SCSI: hpsa: return 0 from driver probe function on success, not 1

Tom Gundersen (2):
  Input: allow deselecting serio drivers even without CONFIG_EXPERT
  Input: mousedev - allow disabling even without CONFIG_EXPERT

Tom Lendacky (3):
  crypto: scatterwalk - Set the chain pointer indication bit
  crypto: authenc - Find proper IV address in ablkcipher callback
  crypto: scatterwalk - Use sg_chain_ptr on chain entries

Trond Myklebust (1):
  NFSv4: Update list of irrecoverable errors on DELEGRETURN




Re: [PATCH v7 3/4] sched/numa: use wrapper function task_faults_idx to calculate index in group_faults

2013-12-11 Thread David Rientjes

On Thu, 12 Dec 2013, Wanpeng Li wrote:

> Use wrapper function task_faults_idx to calculate index in group_faults.
> 
> Reviewed-by: Naoya Horiguchi 
> Acked-by: Mel Gorman 
> Signed-off-by: Wanpeng Li 

Acked-by: David Rientjes 

The naming of task_faults_idx() is a little unfortunate since it is now 
used to index into both task_faults() and group_faults(), though.

Re: [PATCH v7 2/4] sched/numa: use wrapper function task_node to get node which task is on

2013-12-11 Thread David Rientjes

On Thu, 12 Dec 2013, Wanpeng Li wrote:

> Changelog:
>  v2 -> v3:
>   * translate cpu_to_node(task_cpu(p)) to task_node(p) in sched/debug.c
> 
> Use wrapper function task_node to get node which task is on.
> 
> Acked-by: Mel Gorman 
> Reviewed-by: Naoya Horiguchi 
> Reviewed-by: Rik van Riel 
> Signed-off-by: Wanpeng Li 

Acked-by: David Rientjes 

Re: [PATCH] hfsplus: Remove hfsplus_file_lookup

2013-12-11 Thread Vyacheslav Dubeyko

On Wed, 2013-12-11 at 21:08 +, Anton Altaparmakov wrote:
> Hi,
> 
> On 11 Dec 2013, at 19:11, Al Viro  wrote:
> > On Wed, Dec 11, 2013 at 10:49:29PM +0300, Vyacheslav Dubeyko wrote:
> >> This feature worked earlier under Linux. So, I suppose that some changes
> >> in the HFS+ driver or in VFS broke it. The reported issue needs to be
> >> investigated and fixed. Thank you for the report.
> > 
> > This "feature" is severely broken and yes, outright removal is what I'd
> > suggest for a fix.  HFS+ allows hardlinks to files, which means that
> > you allow multiple dentries for the same inode with ->lookup() in it,
> > which is asking for deadlocks.
> > 
> > This is fundamentally not supported.  Considering that forks are lousy
> > idea in the first place, I'd seriously suggest to remove that idiocy for
> > good.
> 
> Completely agree with Al.  If anyone really wants access to forks they can 
> implement them via the xattr interface (ok it has the 64k limitation but most 
> forks are quite small so not much of an issue).  That's how I implemented 
> access to named streams in Tuxera NTFS and it works a treat (and allows Linux 
> apps and various security modules that require xattr support to work properly 
> which is also great).
> 

Yes, after sleeping on it I came to the same conclusion about using the
xattr approach for the resource fork.

Usually, a file under HFS+ has either a valid data fork or a valid resource
fork. For example, an HFS+ compressed file has only a valid resource fork,
and so does an alias under Mac OS X. Of course, a regular file can have
both a valid data fork and a valid resource fork. Fortunately, that case
is rare now. So, we can use the xattr approach for accessing the resource
fork of such files. For example, it is possible to use the xattr name
"osx.ResourceFork". And I suppose that 64 KB is a reasonable limitation.
We already have access to the FinderInfo fields of the CatalogFile record
for a file under HFS+ by means of the "com.apple.FinderInfo" xattr.

I think that I can implement resource fork support by means of the xattr
approach. Also, I am currently implementing support for HFS+ compressed
files. So, I can clean up the old-fashioned resource fork support in the
HFS+ driver, because it needs to be reworked anyway. The suggested patch
doesn't do all of the necessary cleanup, from my viewpoint.

Any comments?

Thanks,
Vyacheslav Dubeyko.
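
From user space, reading a fork exposed this way would look like any other xattr read. The following is a minimal sketch under stated assumptions: read_xattr() is a hypothetical helper name, the 64 KB cap mirrors the limit mentioned in this thread, and the attribute name is whatever the driver ends up exposing ("osx.ResourceFork" is only the proposal above, not an existing interface). The only real API used is Linux getxattr(2).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Cap taken from the 64 KB xattr value limit discussed in the thread. */
#define XATTR_VALUE_MAX (64 * 1024)

/*
 * Hypothetical helper: read xattr `name` of `path` into a newly allocated
 * buffer. On success returns the value length and stores the buffer in
 * *out (the caller frees it). On failure returns -1 with errno set and
 * leaves *out as NULL.
 */
static ssize_t read_xattr(const char *path, const char *name, void **out)
{
	ssize_t len, got;
	void *buf;
	int saved;

	assert(path && name && out);
	*out = NULL;

	/* A zero-sized query only asks for the current value length. */
	len = getxattr(path, name, NULL, 0);
	if (len < 0)
		return -1;
	if (len > XATTR_VALUE_MAX) {
		errno = E2BIG;
		return -1;
	}

	/* Allocate at least one byte so an empty value still yields a buffer. */
	buf = malloc(len ? (size_t)len : 1);
	if (!buf)
		return -1;

	/* The value may change between the two calls; a failure is reported. */
	got = getxattr(path, name, buf, (size_t)len);
	if (got < 0) {
		saved = errno;
		free(buf);
		errno = saved;
		return -1;
	}

	*out = buf;
	return got;
}
```

A caller would use something like read_xattr("/mnt/hfs/file", "osx.ResourceFork", &buf), where both the mount point and the attribute name are placeholders.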


Re: [PATCH v7 1/4] sched/numa: drop sysctl_numa_balancing_settle_count sysctl

2013-12-11 Thread David Rientjes

On Thu, 12 Dec 2013, Wanpeng Li wrote:

> commit 887c290e (sched/numa: Decide whether to favour task or group weights
> based on swap candidate relationships) dropped the check against
> sysctl_numa_balancing_settle_count; this patch removes the sysctl.
> 

What about the references to it in Documentation/sysctl/kernel.txt?

Re: [PATCH 2/3] ARM: sunxi: Add an ahci-platform compatible AHCI driver for the Allwinner SUNXi series of SoCs

2013-12-11 Thread Shawn Guo

On Wed, Dec 11, 2013 at 03:51:51PM +0100, Olliver Schinagl wrote:
> Working on this and studying the existing
> ahci_platform/shci_platform drivers the last few days and was
> figuring out why ahci_platform only supports 1 clock. IMX handles
> this by having 3 clocks defined in the DT, the first one gets
> enabled by default via ahci_platform, the other 2 get enabled in
> IMX's probe function.
> 
> Is it an idea to extend this to support all clocks that would be
> required (via a callback)?

Not really.  We did this for the ahci_imx driver only because we do not want
to churn the generic ahci_platform driver with that imx-specific setup
code.  Note that, besides the additional two clocks, we have some PHY
parameters to set up in the IMX IOMUXC general purpose registers, and the
vendor specific register HOST_TIMER1MS needs to be set up as well.

> Or do we prefer having the clocks
> separated for other technical reasons? Or do we want to handle the
> clocks via the ahci_platform framework and extend hpriv->clk to an
> array of clocks?

The direction for the generic ahci platform driver is to turn it into
a library providing helper functions, as discussed in the thread below.

https://lkml.org/lkml/2013/12/6/153

We can ask the helper function to handle the common clocks and leave the
platform specific ones to platform driver.

Shawn


Re: [PATCH V7 2/2] arm64: perf: add support for percpu pmu interrupt

2013-12-11 Thread Vinayak Kale

Hi Will,

On Tue, Dec 10, 2013 at 1:00 PM, Vinayak Kale  wrote:
> Hi Will,
>
>
> On Mon, Dec 9, 2013 at 10:20 PM, Will Deacon  wrote:
>> Hi Vinayak,
>>
>> On Wed, Dec 04, 2013 at 10:09:51AM +, Vinayak Kale wrote:
>>> Add support for irq registration when pmu interrupt is percpu.
>>
>> Getting closer...
>>
>>> Signed-off-by: Vinayak Kale 
>>> Signed-off-by: Tuan Phan 
>>> ---
>>>  arch/arm64/kernel/perf_event.c |  108 +---
>>>  1 file changed, 78 insertions(+), 30 deletions(-)
>>>
>>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>>> index cea1594..d8e6667 100644
>>> --- a/arch/arm64/kernel/perf_event.c
>>> +++ b/arch/arm64/kernel/perf_event.c
>>> @@ -22,6 +22,7 @@
>>>
>>>  #include 
>>>  #include 
>>> +#include 
>>>  #include 
>>>  #include 
>>>  #include 
>>> @@ -363,26 +364,52 @@ validate_group(struct perf_event *event)
>>>  }
>>>
>>>  static void
>>> +armpmu_disable_percpu_irq(void *data)
>>> +{
>>> + disable_percpu_irq((long)data);
>>> +}
>>
>> Given that we wait for the CPUs to finish enabling/disabling the IRQ, I
>> actually meant pass the pointer to the IRQ, which removes the horrible
>> casts in the caller.
>>
>>> + if (irq_is_percpu(irq)) {
>>> + cpumask_clear(>active_irqs);
>>
>> Thanks for moving the mask manipulation out. It now makes it obvious that we
>> don't care about the mask at all for PPIs, so that can be removed (the code
>> you have is racy against hotplug anyway).
>>
>> I took the liberty of writing a fixup for you (see below). Can you test it
>> on your platform please?
>
> Below fixup works fine on APM platform.
> Do you want me to send this fixup as part of next revision of the
> patch or will you apply it yourself? (For later case, you have my ack)

Any comments? Do I need to send the fix-up in next revision of patch?

Thanks
-Vinayak

Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data

2013-12-11 Thread Dave Young

> > > + */
> > > +static int __init map_regions_fixed(void)
> > > +{
> > > + int i, s, ret = 0;
> > > + u64 end, systab;
> > > + unsigned long size;
> > > + efi_memory_desc_t *md;
> > > + struct efi_setup_data *data;
> > > +
> > > + s = sizeof(*data) + nr_efi_runtime_map * sizeof(data->map[0]);
> > > + data = early_memremap(efi_setup, s);
> > > + if (!data) {
> > > + ret = -ENOMEM;
> > > + goto out;
> > > + }
> > 
> > newline.
> 
> Will remove

I misread the comment; there is no newline here. It looks like you want a
blank line added here, OK.

> 
> > 
> > > + for (i = 0, md = data->map; i < nr_efi_runtime_map; i++, md++) {
> > > + efi_map_region_fixed(md); /* FIXME: add error handling */
> > > + size = md->num_pages << PAGE_SHIFT;
> > > + end = md->phys_addr + size;
> > > +
> > > + systab = (u64) (unsigned long) efi_phys.systab;
> > > + if (md->phys_addr <= systab && systab < end) {
> > > + systab += md->virt_addr - md->phys_addr;
> > > + efi.systab = (efi_system_table_t *)(unsigned long)systab;
> > > + }
> > > + ret = save_runtime_map(md, i);
> > 

Re: [PATCH V2 0/6] Memory compaction efficiency improvements

2013-12-11 Thread Joonsoo Kim

On Wed, Dec 11, 2013 at 11:24:31AM +0100, Vlastimil Babka wrote:
> Changelog since V1 (thanks to the reviewers!)
> o Included "trace compaction begin and end" patch in the series (mgorman)
> o Changed variable names and comments in patches 2 and 5(mgorman)
> o More thorough measurements, based on v3.13-rc2
> 
> The broad goal of the series is to improve allocation success rates for huge
> pages through memory compaction, while trying not to increase the compaction
> overhead. The original objective was to reintroduce capturing of high-order
> pages freed by the compaction, before they are split by concurrent activity.
> However, several bugs and opportunities for simple improvements were found in
> the current implementation, mostly through extra tracepoints (which are however
> too ugly for now to be considered for sending).
> 
> The patches mostly deal with two mechanisms that reduce compaction overhead,
> which is caching the progress of migrate and free scanners, and marking
> pageblocks where isolation failed to be skipped during further scans.
> 
> Patch 1 (from mgorman) adds tracepoints that allow calculate time spent in
> compaction and potentially debug scanner pfn values.
> 
> Patch 2 encapsulates some functionality for handling deferred compactions
> for better maintainability, without a functional change
> type is not determined without being actually needed.
> 
> Patch 3 fixes a bug where cached scanner pfn's are sometimes reset only after
> they have been read to initialize a compaction run.
> 
> Patch 4 fixes a bug where scanners meeting is sometimes not properly detected
> and can lead to multiple compaction attempts quitting early without
> doing any work.
> 
> Patch 5 improves the chances of sync compaction to process pageblocks that
> async compaction has skipped due to being !MIGRATE_MOVABLE.
> 
> Patch 6 improves the chances of sync direct compaction to actually do anything
> when called after async compaction fails during allocation slowpath.
> 
> The impact of the patches was validated using mmtests's stress-highalloc
> benchmark on an x86_64 machine with 4GB memory.
> 
> Due to instability of the results (mostly related to the bugs fixed by patches
> 2 and 3), 10 iterations were performed, taking min,mean,max values for success
> rates and mean values for time and vmstat-based metrics.
> 
> First, the default GFP_HIGHUSER_MOVABLE allocations were tested with the patches
> stacked on top of v3.13-rc2. Patch 2 is OK to serve as baseline due to no
> functional changes in 1 and 2. Comments below.
> 
> stress-highalloc
>                          3.13-rc2           3.13-rc2           3.13-rc2           3.13-rc2           3.13-rc2
>                           2-nothp            3-nothp            4-nothp            5-nothp            6-nothp
> Success 1 Min     9.00 (  0.00%)    10.00 (-11.11%)    43.00 (-377.78%)   43.00 (-377.78%)   33.00 (-266.67%)
> Success 1 Mean   27.50 (  0.00%)    25.30 (  8.00%)    45.50 (-65.45%)    45.90 (-66.91%)    46.30 (-68.36%)
> Success 1 Max    36.00 (  0.00%)    36.00 (  0.00%)    47.00 (-30.56%)    48.00 (-33.33%)    52.00 (-44.44%)
> Success 2 Min    10.00 (  0.00%)     8.00 ( 20.00%)    46.00 (-360.00%)   45.00 (-350.00%)   35.00 (-250.00%)
> Success 2 Mean   26.40 (  0.00%)    23.50 ( 10.98%)    47.30 (-79.17%)    47.60 (-80.30%)    48.10 (-82.20%)
> Success 2 Max    34.00 (  0.00%)    33.00 (  2.94%)    48.00 (-41.18%)    50.00 (-47.06%)    54.00 (-58.82%)
> Success 3 Min    65.00 (  0.00%)    63.00 (  3.08%)    85.00 (-30.77%)    84.00 (-29.23%)    85.00 (-30.77%)
> Success 3 Mean   76.70 (  0.00%)    70.50 (  8.08%)    86.20 (-12.39%)    85.50 (-11.47%)    86.00 (-12.13%)
> Success 3 Max    87.00 (  0.00%)    86.00 (  1.15%)    88.00 ( -1.15%)    87.00 (  0.00%)    87.00 (  0.00%)
> 
>             3.13-rc2    3.13-rc2    3.13-rc2    3.13-rc2    3.13-rc2
>              2-nothp     3-nothp     4-nothp     5-nothp     6-nothp
> User         6437.72     6459.76     5960.32     5974.55     6019.67
> System       1049.65     1049.09     1029.32     1031.47     1032.31
> Elapsed      1856.77     1874.48     1949.97     1994.22     1983.15
> 
>                   3.13-rc2    3.13-rc2    3.13-rc2    3.13-rc2    3.13-rc2
>                    2-nothp     3-nothp     4-nothp     5-nothp     6-nothp
> Minor Faults     253952267   254581900   250030122   250507333   250157829
> Major Faults           420         407         506         530         530
> Swap Ins                 4           9           9           6           6
> Swap Outs

linux-next: Tree for Dec 12

2013-12-11 Thread Stephen Rothwell

Hi all,

Changes since 20131211:

The powerpc tree still had its build failure for which I applied a
supplied patch.

The net-next tree gained a conflict against the net tree.

The block tree gained a conflict against the f2fs tree.

The usb-gadget tree still has its build failure so I used the version from
next-20131206.

Non-merge commits (relative to Linus' tree): 3588
 4076 files changed, 171238 insertions(+), 96653 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 209 trees (counting Linus' and 29 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell 

$ git checkout master
$ git reset --hard stable
Merging origin/master (9538e10086bd Merge git://www.linux-watchdog.org/linux-watchdog)
Merging fixes/master (8ae516aa8b81 Merge tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace)
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux)
Merging arc-current/for-curr (da990a4f2d5a ARC: [perf] Fix a few thinkos)
Merging arm-current/fixes (b31459adeab0 ARM: 7917/1: cacheflush: correctly limit range of memory region being flushed)
Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated IRQF_DISABLED)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (e641eb03ab2b powerpc: Fix up the kdump base cap to 128M)
Merging sparc/master (1de425c7b271 sparc64: Fix build regression)
Merging net/master (9508fdde4d53 Revert "8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature")
Merging ipsec/master (239c78db9c41 net: clear local_df when passing skb between namespaces)
Merging sound-current/for-linus (3690739b0135 ALSA: hda - Add static DAC/pin mapping for AD1986A codec)
Merging pci-current/for-linus (4fc9bbf98fd6 PCI: Disable Bus Master only on kexec reboot)
Merging wireless/master (bbf807bc0697 ath9k: fix duration calculation for non-aggregated packets)
Merging driver-core.current/driver-core-linus (a8b14744429f sysfs: give different locking key to regular and bin files)
Merging tty.current/tty-linus (39434abd942c n_tty: Fix missing newline echo)
Merging usb.current/usb-linus (8820784203ac phy: kconfig: add depends on "USB_PHY" to OMAP_USB2 and TWL4030_USB)
Merging staging.current/staging-linus (55ef003e4ae6 Merge tag 'iio-fixes-for-3.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus)
Merging char-misc.current/char-misc-linus (76a9635979e5 mei: add 9 series PCH mei device ids)
Merging input-current/for-linus (241ecf1ce528 Input: adxl34x - Fix bug in definition of ADXL346_2D_ORIENT)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (389a5390583a crypto: scatterwalk - Use sg_chain_ptr on chain entries)
Merging ide/master (c2f7d1e103ef ide: pmac: remove unnecessary pci_set_drvdata())
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions)
Merging devicetree-current/devicetree/merge (1931ee143b0a Revert "drivers: of: add initialization code for dma reserved memory")
Merging rr-fi

Re: 50 Watt idle power regression bisected to Linux-3.10

2013-12-11 Thread Mike Galbraith

On Thu, 2013-12-12 at 06:57 +0100, Mike Galbraith wrote: 
> On Wed, 2013-12-11 at 21:45 -0800, H. Peter Anvin wrote: 
> > As in it hangs at that point?
> 
> Nope, it's still going.
> 
> [1567.578340] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 
> 1064 MHz, 2266 MHz
> 
> Funny, continents move faster :)  Maybe missing a write or two.

When I get back it may be done booting.  I'm gonna let it try for grins
while I'm away, then take a peek, see if I can spot it.

[ 1567.578340] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 
1064 MHz, 2266 MHz
  done
Starting HAL daemon   done


Setting up (localfs) network interfaces:
lo
lo    IP address: 127.0.0.1/8   
  IP address: 127.0.0.2/8 done
eth0  device: Broadcom Corporation NetXtreme II BCM5709 Gig
  No configuration found for eth0 unused
eth1  device: Broadcom Corporation NetXtreme II BCM5709 Gig
  No configuration found for eth1 unused
eth2  device: NetXen Incorporated NX3031 Multifunction 1/10
[ 2457.114007] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready
[ 2457.114455] netxen_nic: eth2 NIC Link is up
[ 2457.223582] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
eth2  IP address: 0.0.0.0/32

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH -tip v4 0/6] kprobes: introduce NOKPROBE_SYMBOL() and fixes crash bugs

2013-12-11 Thread Masami Hiramatsu

(2013/12/11 22:34), Ingo Molnar wrote:
> 
> * Masami Hiramatsu  wrote:
> 
>>> So why are annotations needed at all? What can happen if an 
>>> annotation is missing and a piece of code is probed which is also 
>>> used by the kprobes code internally - do we crash, lock up, 
>>> misbehave or handle it safely?
>>
>> The kprobe has recursion detector, [...]
> 
> It's the 'current_kprobe' percpu variable, checked via 
> kprobe_running(), right?

Right. :)

>> [...] but it is detected in the kprobe exception(int3) handler, this 
>> means that if we put a probe before detecting the recursion, we'll 
>> do an infinite recursion.
> 
> So only the (presumably rather narrow) code path leading to the 
> recursion detection code has to be annotated, correct?

Yes, correct.

>> And also, even if we can detect the recursion, we can't stop the 
>> kernel, we need to skip the probe. This means that we need to 
>> recover to the main execution path by doing single step. As you may 
>> know, since the single stepping involves the debug exception, we 
>> have to avoid probing on that path too. Or we'll have an infinite 
>> recursion again.
> 
> I don't see why this is needed: if a "probing is disabled" recursion 
> flag is set the moment the first probe fires, and if it's only cleared 
> once all processing is finished, then any intermediate probes should 
> simply return early from int3 and not fire.

No, because the int3 has already replaced the original instruction.
This means that you cannot skip single-stepping (or emulating) the
instruction which was copied to the execution buffer (ainsn->insn),
even if you have such a flag.
So, kprobes requires the annotations on the single-step path.
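
A toy user-space model of that point, in case it helps a later reader. This
is only a sketch and not kernel code: the instruction set, struct probe,
probe_running and every function name below are made up for illustration
(probe_running loosely stands in for current_kprobe, and 'saved' for the copy
in ainsn->insn). It shows that a "probing is disabled" flag can suppress the
handler, but the displaced instruction still has to be executed somewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum op { OP_INC, OP_BREAK };        /* toy instruction set (hypothetical) */

struct probe {
	size_t addr;                 /* where the breakpoint was planted */
	enum op saved;               /* displaced original instruction */
	int hits;                    /* how often the user handler ran */
};

static bool probe_running;           /* models the recursion flag */

/* Planting a probe overwrites the original opcode, as int3 does. */
static void plant(enum op *text, struct probe *p, size_t addr)
{
	p->addr = addr;
	p->saved = text[addr];
	p->hits = 0;
	text[addr] = OP_BREAK;
}

/* Execute one original instruction; the model only knows OP_INC. */
static void single_step(enum op insn, int *acc)
{
	assert(insn == OP_INC);
	(*acc)++;
}

/* Breakpoint handler: the flag may skip the handler, never the step. */
static void on_break(struct probe *p, size_t pc, int *acc)
{
	assert(pc == p->addr);
	if (!probe_running) {
		probe_running = true;
		p->hits++;           /* user handler would run here */
		probe_running = false;
	}
	/* Nested hit or not, the saved instruction must still execute. */
	single_step(p->saved, acc);
}

/* Run the toy program and return the accumulator. */
static int run(const enum op *text, size_t len, struct probe *p)
{
	int acc = 0;

	for (size_t pc = 0; pc < len; pc++) {
		if (text[pc] == OP_BREAK)
			on_break(p, pc, &acc);
		else
			single_step(text[pc], &acc);
	}
	return acc;
}
```

Running a three-instruction program with a probe on the middle one gives the
same accumulator value whether or not probe_running is already set; only the
hit count differs. In the real kernel the equivalent of single_step() goes
through the debug exception, which is why that path needs annotating too.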

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [PATCH] cgroup: fix fail path in cgroup_load_subsys()

2013-12-11 Thread Li Zefan

> @@ -4861,10 +4861,8 @@ int __init_or_module cgroup_load_subsys(struct 
> cgroup_subsys *ss)
>*/
>   css = ss->css_alloc(cgroup_css(cgroup_dummy_top, ss));
>   if (IS_ERR(css)) {
> - /* failure case - need to deassign the cgroup_subsys[] slot. */
> - cgroup_subsys[ss->subsys_id] = NULL;
> - mutex_unlock(&cgroup_mutex);
> - return PTR_ERR(css);
> + ret = PTR_ERR(css);
> + goto out_err;
>   }
>  
>   list_add(&ss->sibling, &cgroup_dummy_root.subsys_list);
> @@ -4873,6 +4871,10 @@ int __init_or_module cgroup_load_subsys(struct 
> cgroup_subsys *ss)
>   /* our new subsystem will be attached to the dummy hierarchy. */
>   init_css(css, ss, cgroup_dummy_top);
>  
> + ret = online_css(css);
> + if (ret)
> + goto free_css;
> +
>   /*
>* Now we need to entangle the css into the existing css_sets. unlike
>* in cgroup_init_subsys, there are now multiple css_sets, so each one
> @@ -4896,18 +4898,17 @@ int __init_or_module cgroup_load_subsys(struct 
> cgroup_subsys *ss)
>   }
>   write_unlock(&css_set_lock);
>  
> - ret = online_css(css);
> - if (ret)
> - goto err_unload;
> -

Moving online_css() upwards should be fine.

Acked-by: Li Zefan 

>   /* success! */
>   mutex_unlock(&cgroup_mutex);
>   return 0;
>  
> -err_unload:
> +free_css:
> + list_del(&ss->sibling);
> + ss->css_free(css);
> +out_err:
> + /* failure case - need to deassign the cgroup_subsys[] slot. */
> + cgroup_subsys[ss->subsys_id] = NULL;
>   mutex_unlock(&cgroup_mutex);
> - /* @ss can't be mounted here as try_module_get() would fail */
> - cgroup_unload_subsys(ss);
>   return ret;
>  }
>  EXPORT_SYMBOL_GPL(cgroup_load_subsys);
> 


Re: 50 Watt idle power regression bisected to Linux-3.10

2013-12-11 Thread Mike Galbraith

On Wed, 2013-12-11 at 21:45 -0800, H. Peter Anvin wrote: 
> As in it hangs at that point?

Nope, it's still going.

[1567.578340] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1064 
MHz, 2266 MHz

Funny, continents move faster :)  Maybe missing a write or two.

-Mike


Re: Re: [PATCH 16/17] uprobes: Allocate ->utask before handler_chain() for tracing handlers

2013-12-11 Thread Masami Hiramatsu

(2013/12/12 3:11), Oleg Nesterov wrote:
> On 12/11, Masami Hiramatsu wrote:
>>
>> (2013/12/11 0:57), Oleg Nesterov wrote:
>>> On 12/10, Masami Hiramatsu wrote:

>>>> and isn't it better to increment
>>>> miss-hit counter of the uprobe?
>>>
>>> What do you mean? This is not miss-hit and ->utask == NULL is quite normal.
>>
>> But it could skip the handler_chain silently. It could confuse users
>> why their probe doesn't hit as expected.
> 
> No, we will restart the same (probed) instruction, handle_swbp()
> will be called again, get_utask() will be called again.

Hmm, in that case, how would you avoid an infinite recursive loop?
Would you repeat it until get_utask() != NULL?

> Not to mention that (in practice) if GFP_KERNEL fails the task is
> already killed.
> 
>>> For example, on ppc it can be always NULL because ppc likely emulates the
>>> probed insn.
>>
>> Hmm, in that case, should uprobes handlers never be called on ppc with
>> this change?
> 
> Why? With this change ppc will have ->utask != NULL even if it doesn't
> need it at all.

Ah, I see. This changes that.

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



RE: [PATCH v2 0/4] X86/KVM: enable Intel MPX for KVM

2013-12-11 Thread Liu, Jinsong

Paolo Bonzini wrote:
> Il 11/12/2013 09:31, Liu, Jinsong ha scritto:
>> Paolo, comments for version 2?
> 
> I think I commented that it's fine, I'm just waiting for a rebase on
> top of the generic patches.
> 
> Paolo
> 

Thanks! The common MPX definition patches have been checked into the tip tree
(both Qiaowei and I use those definitions):
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=191f57c137bcce0e3e9313acb77b2f114d15afbb
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=e7d820a5e549b3eb6c3f9467507566565646a669

Jinsong

>> 
>> Liu, Jinsong wrote:
>>> These patches are version 2 to enable Intel MPX for KVM.
>>> 
>>> Version 1:
>>>   * Add some Intel MPX definitions
>>>   * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features
>>>     enable/disable
>>>   * vmx and msr handle for MPX support at KVM
>>>   * enable MPX feature for guest
>>> 
>>> Version 2:
>>>   * remove generic MPX definitions, kernel side has added the
>>>     definitions
>>>   * add MSR_IA32_BNDCFGS to msrs_to_save
>>> 
>>> Thanks,
>>> Jinsong
>>> 
>>> Liu Jinsong (4):
>>>   KVM/X86: Fix xsave cpuid exposing bug
>>>   KVM/X86: Intel MPX vmx and msr handle
>>>   KVM/X86: add MSR_IA32_BNDCFGS to msrs_to_save
>>>   KVM/X86: Enable Intel MPX for guest.
>>> 
>>>  arch/x86/include/asm/vmx.h|4 
>>>  arch/x86/include/asm/xsave.h  |2 ++
>>>  arch/x86/include/uapi/asm/msr-index.h |1 +
>>>  arch/x86/kvm/cpuid.c  |8 
>>>  arch/x86/kvm/vmx.c|   18 --
>>>  arch/x86/kvm/x86.c|   12 +---
>>>  arch/x86/kvm/x86.h|3 ++-
>>>  7 files changed, 38 insertions(+), 10 deletions(-)


Re: 50 Watt idle power regression bisected to Linux-3.10

2013-12-11 Thread H. Peter Anvin

As in it hangs at that point?

Mike Galbraith  wrote:
>On Wed, 2013-12-11 at 20:49 -0800, H. Peter Anvin wrote: 
>> On 12/11/2013 08:25 PM, Mike Galbraith wrote:
>> >  arch/x86/include/asm/mwait.h   |4 ++--
>> >  arch/x86/kernel/cpu/common.c   |7 ---
>> >  arch/x86/kernel/setup_percpu.c |1 +
>> >  3 files changed, 7 insertions(+), 5 deletions(-)
>> > 
>> > Index: linux-2.6/arch/x86/kernel/cpu/common.c
>> > ===
>> > --- linux-2.6.orig/arch/x86/kernel/cpu/common.c
>> > +++ linux-2.6/arch/x86/kernel/cpu/common.c
>> > @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void)
>> >  }
>> >  
>> >  /* allocate percpu area for mwait doorbell */
>> > -char __percpu *mwait_doorbell;
>> > +DEFINE_PER_CPU(char *, mwait_doorbell);
>> > +EXPORT_PER_CPU_SYMBOL(mwait_doorbell);
>> >  
>> 
>> Sorry, this is wrong.  This is NOT a percpu variable, it is a pointer
>to
>> a percpu allocation, but the variable itself is not a percpu
>variable.
>> This explains your boom.
>
>With that fixed, it boots, but is not quite perfect.
>
>... 
>[  258.560079] fbcon: radeondrmfb (fb0) is primary device
>[  258.722483] Console: switching to colour frame buffer device 128x48
>[  258.847076] radeon 0000:01:03.0: fb0: radeondrmfb frame buffer
>device
>[  258.911991] radeon 0000:01:03.0: registered panic notifier
>[  258.968772] [drm] Initialized radeon 2.35.0 20080528 for
>0000:01:03.0 on minor 0
>...
>[  469.738604] netxen_nic 0000:04:00.3: using msi-x interrupts
>[  469.739078] netxen_nic 0000:04:00.3: eth5: GbE port initialized
>[  469.830512] ipmi_si 00:01: Found new BMC (man_id: 0x0b, prod_id:
>0x2000, dev_id: 0x13)
>[  469.830524] ipmi_si 00:01: IPMI kcs interface initialized
>[  473.729862] iTCO_wdt: unable to reset NO_REBOOT flag, device
>disabled by hardware/BIOS
>...
>[  711.636741] fuse init (API version 7.22)
>
>... ok box, doctor appointment is an hour away.
>
>-Mike 

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

Re: [patch 7/8] mm, memcg: allow processes handling oom notifications to access reserves

2013-12-11 Thread Tim Hockin

The immediate problem I see with setting aside reserves "off the top"
is that we don't really know a priori how much memory the kernel
itself is going to use, which could still land us in an overcommitted
state.

In other words, if I have your 128 MB machine, and I set aside 8 MB
for OOM handling, and give 120 MB for jobs, I have not accounted for
the kernel.  So I set aside 8 MB for OOM and 100 MB for jobs, leaving
20 MB for the kernel.  That should be enough, right?  Hell if I know, and
nothing ensures that.
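
Spelling the arithmetic out as a tiny C sketch (struct budget and
kernel_headroom_mb() are hypothetical names, used only to restate the numbers
above): whatever is not handed to the OOM reserve or to jobs is all the
kernel can count on, and nothing checks that it is enough.

```c
#include <assert.h>

/* All sizes in MB; the field names are made up for this example. */
struct budget {
	long total_mb;          /* physical memory in the machine */
	long oom_reserve_mb;    /* set aside for OOM handling */
	long jobs_mb;           /* handed out to jobs */
};

/* Memory left over for the kernel itself under a given split. */
static long kernel_headroom_mb(const struct budget *b)
{
	assert(b->total_mb >= 0);
	return b->total_mb - b->oom_reserve_mb - b->jobs_mb;
}
```

With 128/8/120 this returns 0, and with 128/8/100 it returns 20. The function
can only report the leftover; whether 20 MB covers what the kernel will
actually use is exactly the part that is unknown a priori.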

On Wed, Dec 11, 2013 at 4:42 AM, Tejun Heo  wrote:
> Yo,
>
> On Tue, Dec 10, 2013 at 03:55:48PM -0800, David Rientjes wrote:
>> > Well, the gotcha there is that you won't be able to do that with
>> > system level OOM handler either unless you create a separately
>> > reserved memory, which, again, can be achieved using hierarchical
>> > memcg setup already.  Am I missing something here?
>>
>> System oom conditions would only arise when the usage of memcgs A + B
>> above cause the page allocator to not be able to allocate memory without
>> oom killing something even though the limits of both A and B may not have
>> been reached yet.  No userspace oom handler can allocate memory with
>> access to memory reserves in the page allocator in such a context; it's
>> vital that if we are to handle system oom conditions in userspace that we
>> give them access to memory that other processes can't allocate.  You
>> could attach a userspace system oom handler to any memcg in this scenario
>> with memory.oom_reserve_in_bytes and since it has PF_OOM_HANDLER it would
>> be able to allocate in reserves in the page allocator and overcharge in
>> its memcg to handle it.  This isn't possible only with a hierarchical
>> memcg setup unless you ensure the sum of the limits of the top level
>> memcgs do not equal or exceed the sum of the min watermarks of all memory
>> zones, and we exceed that.
>
> Yes, exactly.  If system memory is 128M, create top level memcgs w/
> 120M and 8M each (well, with some slack of course) and then overcommit
> the descendants of 120M while putting OOM handlers and friends under
> 8M without overcommitting.
>
> ...
>> The stronger rationale is that you can't handle system oom in userspace
>> without this functionality and we need to do so.
>
> You're giving yourself an unreasonable precondition - overcommitting
> at root level and handling system OOM from userland - and then trying
> to contort everything to fit that.  How can "overcommitting at root
> level" possibly be a goal in and of itself?  Please take a step back
> and look at and explain the *problem* you're trying to solve.  You
> haven't explained why that *need*s to be the case at all.
>
> I wrote this at the start of the thread but you're still doing the
> same thing.  You're trying to create a hidden memcg level inside a
> memcg.  At the beginning of this thread, you were trying to do that
> for !root memcgs and now you're arguing that you *need* that for root
> memcg.  Because there's no other limit we can make use of, you're
> suggesting the use of kernel reserve memory for that purpose.  It
> seems like an absurd thing to do to me.  It could be that you might
> not be able to achieve exactly the same thing that way, but the right
> thing to do would be improving memcg in general so that it can instead
> of adding yet another layer of half-baked complexity, right?
>
> Even if there are some inherent advantages of system userland OOM
> handling with a separate physical memory reserve, which AFAICS you
> haven't succeeded at showing yet, this is a very invasive change and,
> as you said before, something with an *extremely* narrow use case.
> Wouldn't it be a better idea to improve the existing mechanisms - be
> that memcg in general or kernel OOM handling - to fit the niche use
> case better?  I mean, just think about all the corner cases.  How are
> you gonna handle priority inversion through locked pages or
> allocations given out to other tasks through slab?  You're suggesting
> opening a giant can of worms for extremely narrow benefit which
> doesn't even seem like actually needing opening the said can.
>
> Thanks.
>
> --
> tejun

Re: 50 Watt idle power regression bisected to Linux-3.10

2013-12-11 Thread Mike Galbraith

On Wed, 2013-12-11 at 20:49 -0800, H. Peter Anvin wrote: 
> On 12/11/2013 08:25 PM, Mike Galbraith wrote:
> >  arch/x86/include/asm/mwait.h   |4 ++--
> >  arch/x86/kernel/cpu/common.c   |7 ---
> >  arch/x86/kernel/setup_percpu.c |1 +
> >  3 files changed, 7 insertions(+), 5 deletions(-)
> > 
> > Index: linux-2.6/arch/x86/kernel/cpu/common.c
> > ===
> > --- linux-2.6.orig/arch/x86/kernel/cpu/common.c
> > +++ linux-2.6/arch/x86/kernel/cpu/common.c
> > @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void)
> >  }
> >  
> >  /* allocate percpu area for mwait doorbell */
> > -char __percpu *mwait_doorbell;
> > +DEFINE_PER_CPU(char *, mwait_doorbell);
> > +EXPORT_PER_CPU_SYMBOL(mwait_doorbell);
> >  
> 
> Sorry, this is wrong.  This is NOT a percpu variable, it is a pointer to
> a percpu allocation, but the variable itself is not a percpu variable.
> This explains your boom.

With that fixed, it boots, but is not quite perfect.

... 
[  258.560079] fbcon: radeondrmfb (fb0) is primary device
[  258.722483] Console: switching to colour frame buffer device 128x48
[  258.847076] radeon 0000:01:03.0: fb0: radeondrmfb frame buffer device
[  258.911991] radeon 0000:01:03.0: registered panic notifier
[  258.968772] [drm] Initialized radeon 2.35.0 20080528 for 0000:01:03.0 on 
minor 0
...
[  469.738604] netxen_nic 0000:04:00.3: using msi-x interrupts
[  469.739078] netxen_nic 0000:04:00.3: eth5: GbE port initialized
[  469.830512] ipmi_si 00:01: Found new BMC (man_id: 0x0b, prod_id: 0x2000, 
dev_id: 0x13)
[  469.830524] ipmi_si 00:01: IPMI kcs interface initialized
[  473.729862] iTCO_wdt: unable to reset NO_REBOOT flag, device disabled by 
hardware/BIOS
...
[  711.636741] fuse init (API version 7.22)

... ok box, doctor appointment is an hour away.

-Mike 



Re: process 'stuck' at exit.

2013-12-11 Thread Darren Hart

On Wed, 2013-12-11 at 23:26 -0500, Dave Jones wrote:
> On Tue, Dec 10, 2013 at 02:48:52PM -0800, Linus Torvalds wrote:
>  
>  > Dave, can you re-create that trinity run and test that patch? I think
>  > we've got this
> 
> 24 hours later, all is well.  I think we can call this one done.
> 
> Tested-by: Dave Jones 

Thank you again for a fine preemptive bug catch Dave!

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel



[PATCH CFT] ARM:S5P64X0: Enable ARM_PATCH_PHYS_VIRT and AUTO_ZRELADDR by default

2013-12-11 Thread panchaxari

ARM_PATCH_PHYS_VIRT and AUTO_ZRELADDR are selected by default for the
S5P64X0 platforms.

Selecting ARM_PATCH_PHYS_VIRT by default makes the kernel patch its
phys-to-virt and virt-to-phys translation functions at boot and module
loading time, according to where the kernel sits in memory. Selecting
AUTO_ZRELADDR enables calculation of the kernel load address at run time.
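
For readers new to the two options, a rough user-space sketch of the
arithmetic involved. This is not the kernel's implementation: the model_*
names are hypothetical, PAGE_OFFSET and TEXT_OFFSET are set to commonly used
ARM values as an assumption, and the 0xf8000000 mask follows the
AUTO_ZRELADDR Kconfig help text, which assumes the zImage sits in the first
128MB from the start of memory.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET UINT32_C(0xC0000000)  /* assumed 3G/1G split */
#define TEXT_OFFSET UINT32_C(0x00008000)  /* assumed usual ARM value */

/*
 * With ARM_PATCH_PHYS_VIRT the physical RAM base (phys_offset here) is
 * discovered at boot instead of being a build-time constant.
 */
static uint32_t model_virt_to_phys(uint32_t virt, uint32_t phys_offset)
{
	assert(virt >= PAGE_OFFSET);
	return virt - PAGE_OFFSET + phys_offset;
}

static uint32_t model_phys_to_virt(uint32_t phys, uint32_t phys_offset)
{
	assert(phys >= phys_offset);
	return phys - phys_offset + PAGE_OFFSET;
}

/*
 * AUTO_ZRELADDR: mask the address the decompressor is running at down
 * to a 128MB boundary, then add TEXT_OFFSET.
 */
static uint32_t model_auto_zreladdr(uint32_t running_at)
{
	return (running_at & UINT32_C(0xF8000000)) + TEXT_OFFSET;
}
```

For example, with a purely illustrative RAM base of 0x20000000, virtual
address 0xC0008000 maps to physical 0x20008000, and a decompressor running at
0x21000000 would place the kernel at 0x20008000.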

ARM_PATCH_PHYS_VIRT is mutually exclusive with XIP_KERNEL (XIP_KERNEL is used
in systems with NOR flash devices), and AUTO_ZRELADDR is mutually exclusive
with ZBOOT_ROM.

CFT: Call For Testing

Requesting maintainers of S5P64X0 platforms to evaluate the changes on the
board and comment, as I don't have the board for testing. Also requesting
an ACK.

Signed-off-by: panchaxari 
Cc: Kukjin Kim 
Cc: Tomasz Figa 
Cc: Sylwester Nawrocki 
Cc: Heiko Stuebner 
Cc: Russell King 
Cc: Linus Walleij 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

---
The Samsung S5P64X0 vega has an average-performing CPU with a maximum speed of
667 MHz. This SoC has two variants, S5P6440 and S5P6450. It has one core based
on the ARM1176JZF-S, with 16 KB each of data and instruction cache.

The SoC has a memory subsystem with a NAND flash interface (x8 data bus,
1/4/8/12/16-bit hardware ECC circuit, 4 KB page mode), a Mobile DDR interface
with x16 or x32 data bus, and a DDR2 interface with x16 or x32 data bus. It
also supports eMMC 4.4.

The LKML link below quotes Russell's explanation of the PHYS_VIRT and ZRELADDR
concepts:
-

https://lkml.org/lkml/2011/10/14/434

-
---
 arch/arm/Kconfig |2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 934e26c..8986335 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -759,6 +759,8 @@ config ARCH_S3C64XX
 
 config ARCH_S5P64X0
bool "Samsung S5P6440 S5P6450"
+   select ARM_PATCH_PHYS_VIRT
+   select AUTO_ZRELADDR
select CLKDEV_LOOKUP
select CLKSRC_SAMSUNG_PWM
select CPU_V6
-- 
1.7.10.4


[PATCH] tools/perf: Fix cross compilation

2013-12-11 Thread Michael Ellerman

Commit b6aa997 "Add feature check core code" added feature checking
logic in config/feature-checks/Makefile but didn't use the CROSS_COMPILE
value.

Fix it by prefixing $(CC), as is done in Makefile.perf.

Signed-off-by: Michael Ellerman 
---
 tools/perf/config/feature-checks/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/config/feature-checks/Makefile 
b/tools/perf/config/feature-checks/Makefile
index bc86462..f3946db 100644
--- a/tools/perf/config/feature-checks/Makefile
+++ b/tools/perf/config/feature-checks/Makefile
@@ -28,7 +28,7 @@ FILES=\
test-stackprotector-all \
test-timerfd
 
-CC := $(CC) -MD
+CC := $(CROSS_COMPILE)$(CC) -MD
 
 all: $(FILES)
 
-- 
1.8.3.2


RE: [PATCH] MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers

2013-12-11 Thread Mohit KUMAR DCG

> -Original Message-
> From: Jingoo Han [mailto:jg1@samsung.com]
> Sent: Thursday, December 12, 2013 3:55 AM
> To: 'Bjorn Helgaas'; linux-...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> linux-te...@vger.kernel.org; linux...@vger.kernel.org; linux-samsung-
> s...@vger.kernel.org; 'Shawn Guo'; 'Jason Cooper'; 'Thierry Reding'; 'Simon
> Horman'; 'Magnus Damm'; 'Valentine Barshak'; 'Wei Yongjun'; 'Wei Yongjun';
> 'Kuninori Morimoto'; Mohit KUMAR DCG; Pratyush ANAND; 'Jingoo Han'
> Subject: Re: [PATCH] MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car
> PCI host maintainers
> 
> On Thursday, December 12, 2013 3:43 AM, Bjorn Helgaas wrote:
> > On Wed, Dec 11, 2013 at 11:32:37AM -0700, Bjorn Helgaas wrote:
> > > If this looks reasonable, I'll merge it via the PCI tree for v3.13.
> >
> > And I see Mohit's patch [1] to update the DesignWare entry:
> >
> > +PCIE DRIVER FOR SYNOPSIS DESIGNWARE CONTROLLER
> > +M: Mohit Kumar 
> > +M: Jingoo Han 
> > +L: linux-...@vger.kernel.org
> > +S: Maintained
> > +F: drivers/pci/host/pcie-designware.c
> >
> > I can fold in that update too if Jingoo acks it.
> >
> > [1] http://patchwork.ozlabs.org/patch/299905/
> 
> Hi Bjorn,
> 
> I agree with this. :-)
> Acked-by: Jingoo Han 
> 
- Thanks Bjorn and Jingoo. I will remove this patch from my v2 patches.

Regards
Mohit

Re: kernel BUG in munlock_vma_pages_range

2013-12-11 Thread Bob Liu


On 12/12/2013 11:16 AM, Sasha Levin wrote:
> On 12/11/2013 05:59 PM, Vlastimil Babka wrote:
>> On 12/09/2013 09:26 PM, Sasha Levin wrote:
>>> On 12/09/2013 12:12 PM, Vlastimil Babka wrote:
>>>> On 12/09/2013 06:05 PM, Sasha Levin wrote:
>>>>> On 12/09/2013 04:34 AM, Vlastimil Babka wrote:
>>>>>> Hello, I will look at it, thanks.
>>>>>> Do you have specific reproduction instructions?
>>>>>
>>>>> Not really, the fuzzer hit it once and I've been unable to trigger
>>>>> it again. Looking at
>>>>> the piece of code involved it might have had something to do with
>>>>> hugetlbfs, so I'll crank
>>>>> up testing on that part.
>>>>
>>>> Thanks. Do you have trinity log and the .config file? I'm currently
>>>> unable to even boot linux-next
>>>> with my config/setup due to a GPF.
>>>> Looking at code I wouldn't expect that it could encounter a tail
>>>> page, without first encountering a
>>>> head page and skipping the whole huge page. At least in THP case, as
>>>> TLB pages should be split when
>>>> a vma is split. As for hugetlbfs, it should be skipped for
>>>> mlock/munlock operations completely. One
>>>> of these assumptions is probably failing here...
>>>
>>> If it helps, I've added a dump_page() in case we hit a tail page
>>> there and got:
>>>
>>> [  980.172299] page:ea003e5e8040 count:0 mapcount:1 mapping:  (null) index:0x0
>>> [  980.173412] page flags: 0x2f80008000(tail)
>>>
>>> I can also add anything else in there to get other debug output if
>>> you think of something else useful.
>>
>> Please try the following. Thanks in advance.
> 
> [  428.499889] page:ea003e5c0040 count:0 mapcount:4 mapping:  (null) index:0x0
> [  428.499889] page flags: 0x2f80008000(tail)
> [  428.499889] start=140117131923456 pfn=16347137 orig_start=140117130543104 page_increm=1 vm_start=140117130543104 vm_end=140117134688256 vm_flags=135266419
> [  428.499889] first_page pfn=16347136
> [  428.499889] page:ea003e5c count:204 mapcount:44 mapping:880fb5c466c1 index:0x7f6f8fe00
> [  428.499889] page flags: 0x2f80084068(uptodate|lru|active|head|swapbacked)

From this print, it looks like the page is still a huge page.
One situation I can think of is a huge page which isn't PageMlocked being
passed to munlock_vma_page(). I'm not sure whether this can happen.
Please give this patch a try.

Thanks,
-Bob

diff --git a/mm/mlock.c b/mm/mlock.c
index d480cd6..f7066d2 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -466,6 +466,22 @@ void munlock_vma_pages_range(struct vm_area_struct
*vma,
 * the page_mask here.
 */
page_mask = munlock_vma_page(page);
+
+   /*
+* There are two possibilities when munlock_vma_page() returns 0.
+* 1. The THP page was split.
+* 2. The THP page was not PageMlocked before and
+*it didn't get split.
+*
+* In case 2 we have to reset page_mask to
+* 'HPAGE_PMD_NR - 1' because this page is still a
+* huge page, else PageTransHuge may receive a
+* tail page and trigger VM_BUG_ON on next loop.
+*/
+   if (!page_mask)
+   if (PageTransHuge(page))
+   page_mask = HPAGE_PMD_NR - 1;
+
unlock_page(page);
put_page(page); /* follow_page_mask() */
} else {
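
To illustrate why a zero page_mask on a still-huge page leads to a tail page
on the next loop, here is a small user-space sketch. It is a model under
assumptions, not the kernel code: next_start() and is_tail() are hypothetical
helpers, the stepping expression is written from memory of the loop in
munlock_vma_pages_range(), HPAGE_PMD_NR is taken as 512 (2 MB huge pages with
4 KB base pages), and is_tail() simply assumes the huge page is 2 MB aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PAGE_SIZE    (UINT64_C(1) << PAGE_SHIFT)
#define HPAGE_PMD_NR UINT64_C(512)   /* assumed: 2 MB / 4 KB */

/*
 * Advance 'start' by one base page when page_mask is 0, or to the end
 * of the current huge page when page_mask is HPAGE_PMD_NR - 1.
 */
static uint64_t next_start(uint64_t start, uint64_t page_mask)
{
	uint64_t page_increm = 1 + (~(start >> PAGE_SHIFT) & page_mask);

	assert(page_mask == 0 || page_mask == HPAGE_PMD_NR - 1);
	return start + page_increm * PAGE_SIZE;
}

/* Model only: an address inside an aligned huge page but not its head. */
static bool is_tail(uint64_t addr)
{
	return ((addr >> PAGE_SHIFT) & (HPAGE_PMD_NR - 1)) != 0;
}
```

Taking the head page from the dump above (index 0x7f6f8fe00, so address
0x7f6f8fe00000), a page_mask of 0 steps to 0x7f6f8fe01000, which is
140117131923456 in decimal and matches the start= value where the tail page
was hit. With page_mask set to HPAGE_PMD_NR - 1 the same step lands on the
next 2 MB boundary instead.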


Re: 50 Watt idle power regression bisected to Linux-3.10

2013-12-11 Thread Mike Galbraith

On Wed, 2013-12-11 at 20:49 -0800, H. Peter Anvin wrote: 
> On 12/11/2013 08:25 PM, Mike Galbraith wrote:
> >  arch/x86/include/asm/mwait.h   |4 ++--
> >  arch/x86/kernel/cpu/common.c   |7 ---
> >  arch/x86/kernel/setup_percpu.c |1 +
> >  3 files changed, 7 insertions(+), 5 deletions(-)
> > 
> > Index: linux-2.6/arch/x86/kernel/cpu/common.c
> > ===
> > --- linux-2.6.orig/arch/x86/kernel/cpu/common.c
> > +++ linux-2.6/arch/x86/kernel/cpu/common.c
> > @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void)
> >  }
> >  
> >  /* allocate percpu area for mwait doorbell */
> > -char __percpu *mwait_doorbell;
> > +DEFINE_PER_CPU(char *, mwait_doorbell);
> > +EXPORT_PER_CPU_SYMBOL(mwait_doorbell);
> >  
> 
> Sorry, this is wrong.  This is NOT a percpu variable, it is a pointer to
> a percpu allocation, but the variable itself is not a percpu variable.
> This explains your boom.

Yeah, I know, I already slapped myself upside the head.

(what were you thinking mikie...la la la la la:)

-Mike


[PATCH] Driver core: Fix device_add_attrs() error code path

2013-12-11 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

If the addition of dev_attr_online fails, device_add_attrs() should
remove device attribute groups as well as type and class attribute
groups before returning an error code.  Make that happen.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/base/core.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/base/core.c
===
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -491,11 +491,13 @@ static int device_add_attrs(struct devic
if (device_supports_offline(dev) && !dev->offline_disabled) {
error = device_create_file(dev, &dev_attr_online);
if (error)
-   goto err_remove_type_groups;
+   goto err_remove_dev_groups;
}
 
return 0;
 
+ err_remove_dev_groups:
+   device_remove_groups(dev, dev->groups);
  err_remove_type_groups:
if (type)
device_remove_groups(dev, type->groups);
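
For reference, the unwinding pattern the fix restores can be sketched in
user space as below. This is only a model: struct model_dev, add_step(),
remove_step() and the step names are hypothetical stand-ins for the group
and attribute helpers, and fail_at exists purely to inject a failure. Each
label undoes one step and falls through to the labels for the earlier steps,
so a late failure has to jump to the first label.

```c
#include <assert.h>
#include <stdbool.h>

enum { CLASS_GROUPS, TYPE_GROUPS, DEV_GROUPS, ONLINE_ATTR, NR_STEPS };

struct model_dev {
	bool present[NR_STEPS];  /* which pieces are currently added */
	int fail_at;             /* step to fail at, or -1 for none */
};

static int add_step(struct model_dev *d, int step)
{
	assert(step >= 0 && step < NR_STEPS);
	if (d->fail_at == step)
		return -1;
	d->present[step] = true;
	return 0;
}

static void remove_step(struct model_dev *d, int step)
{
	d->present[step] = false;
}

/* Add four things in order; on failure remove the earlier ones. */
static int model_add_attrs(struct model_dev *d)
{
	int error;

	error = add_step(d, CLASS_GROUPS);
	if (error)
		return error;
	error = add_step(d, TYPE_GROUPS);
	if (error)
		goto err_remove_class_groups;
	error = add_step(d, DEV_GROUPS);
	if (error)
		goto err_remove_type_groups;
	error = add_step(d, ONLINE_ATTR);
	if (error)
		goto err_remove_dev_groups;  /* not the type label */
	return 0;

err_remove_dev_groups:
	remove_step(d, DEV_GROUPS);
err_remove_type_groups:
	remove_step(d, TYPE_GROUPS);
err_remove_class_groups:
	remove_step(d, CLASS_GROUPS);
	return error;
}

/* Count what is still added; 0 after a clean unwind. */
static int leftovers(const struct model_dev *d)
{
	int n = 0;

	for (int i = 0; i < NR_STEPS; i++)
		n += d->present[i];
	return n;
}
```

If the last goto targeted err_remove_type_groups instead, a failure at the
final step would leave the DEV_GROUPS entry behind, which is the leak the
patch above closes.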


Re: 50 Watt idle power regression bisected to Linux-3.10

2013-12-11 Thread H. Peter Anvin

On 12/11/2013 08:25 PM, Mike Galbraith wrote:
>  arch/x86/include/asm/mwait.h   |4 ++--
>  arch/x86/kernel/cpu/common.c   |7 ---
>  arch/x86/kernel/setup_percpu.c |1 +
>  3 files changed, 7 insertions(+), 5 deletions(-)
> 
> Index: linux-2.6/arch/x86/kernel/cpu/common.c
> ===
> --- linux-2.6.orig/arch/x86/kernel/cpu/common.c
> +++ linux-2.6/arch/x86/kernel/cpu/common.c
> @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void)
>  }
>  
>  /* allocate percpu area for mwait doorbell */
> -char __percpu *mwait_doorbell;
> +DEFINE_PER_CPU(char *, mwait_doorbell);
> +EXPORT_PER_CPU_SYMBOL(mwait_doorbell);
>  

Sorry, this is wrong.  This is NOT a percpu variable, it is a pointer to
a percpu allocation, but the variable itself is not a percpu variable.
This explains your boom.
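
To illustrate the distinction, here is a userspace toy model, not kernel code. All toy_* names, NR_CPUS and AREA_SIZE are assumptions made up for the sketch. The point it tries to show: the allocation has one copy per CPU, but the handle that names it is a single ordinary global, which the kernel would declare as "char __percpu *" and resolve through per_cpu_ptr() or this_cpu_ptr().

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   4
#define AREA_SIZE 256

/* Toy model: each CPU owns one fixed-size area. */
static char percpu_area[NR_CPUS][AREA_SIZE];
static size_t next_free;

/*
 * Toy allocator: returns one offset that is valid inside every CPU's
 * area, or (size_t)-1 when the areas are exhausted.
 * align must be a power of two.
 */
static size_t toy_alloc_percpu(size_t size, size_t align)
{
	size_t off = (next_free + align - 1) & ~(align - 1);

	if (off + size > AREA_SIZE)
		return (size_t)-1;
	next_free = off + size;
	return off;
}

/* Resolve the shared handle to one CPU's private copy. */
static char *toy_per_cpu_ptr(size_t off, int cpu)
{
	return &percpu_area[cpu][off];
}

/* ONE ordinary global: it only names the per-CPU allocation. */
static size_t doorbell_off;

static int toy_setup_doorbell(void)
{
	doorbell_off = toy_alloc_percpu(64, 64);
	return doorbell_off == (size_t)-1 ? -1 : 0;
}
```

Making the handle itself per-CPU would mean a plain assignment to it no longer refers to a normal variable, which is consistent with the oops in setup_mwait_doorbell() reported earlier in the thread.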

>  void __init setup_mwait_doorbell(void)
>  {
>   if (boot_cpu_has(X86_FEATURE_MWAIT)) {
> - mwait_doorbell = __alloc_percpu(boot_cpu_data.clflush_size,
> - boot_cpu_data.clflush_size);
> + mwait_doorbell = __alloc_percpu(boot_cpu_data.x86_clflush_size,
> + boot_cpu_data.x86_clflush_size);
>  
>   if (!mwait_doorbell) {
>   /* This should never happen... */

-hpa



Re: [PATCH -tip v4 6/6] [RFC] kprobes/x86: Call exception handlers directly from do_int3/do_debug

2013-12-11 Thread Masami Hiramatsu

(2013/12/11 22:31), Jiri Kosina wrote:
> On Tue, 3 Dec 2013, Steven Rostedt wrote:
> 
>>> To avoid a kernel crash by probing on lockdep code, call
>>> kprobe_int3_handler and kprobe_debug_handler directly
>>> from do_int3 and do_debug. Since there is a locking code
>>> in notify_die, lockdep code can be invoked. And because
>>> the lockdep involves printk() related things, theoretically,
>>> we need to prohibit probing on much more code...
>>>
>>> Anyway, most of the int3 handlers in the kernel are already
>>> called from do_int3 directly, e.g. ftrace_int3_handler,
>>> poke_int3_handler, kgdb_ll_trap. Actually only
>>> kprobe_exceptions_notify is on the notifier_call_chain.
>>>
>>> So I think this is not a crazy thing.
>>
>> What? Oh, yeah. No, using notifiers in int3 handler is the crazy
>> thing ;-)
> 
> Yeah, it's broken. Obviously, if you happen to trigger int3 before the 
> notifier has been registered, it'd cause int3 exception to be unhandled. 
> See
> 
>   commit 17f41571bb2c4a398785452ac2718a6c5d77180e
>   Author: Jiri Kosina 
>   Date:   Tue Jul 23 10:09:28 2013 +0200
> 
>   kprobes/x86: Call out into INT3 handler directly instead of using 
> notifier
> 
> for one such issue that happened with jump labels.
> 
>> Hmm, if there's no users of the int3 notifier, should we just remove it?
> 
> Hmm, there are still uprobes, right?

Right, uprobes still use it. However, since it only handles user-space
breakpoints, there is no problem.
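
The registration-order hazard Jiri describes above can be sketched with a toy model in userspace C. Every toy_* name here is hypothetical, and the "chain" is just an array, not the kernel's notifier implementation. It only shows why a handler that is reached through a registration step can miss early traps, while a direct call cannot.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*trap_handler_t)(void);	/* returns 1 if it handled the trap */

#define CHAIN_MAX 4

static trap_handler_t chain[CHAIN_MAX];	/* toy notifier chain */
static size_t chain_len;

static void toy_register_notifier(trap_handler_t h)
{
	if (chain_len < CHAIN_MAX)
		chain[chain_len++] = h;
}

static int toy_kprobe_handler(void)
{
	return 1;
}

/* Notifier style: only handlers registered so far get a chance. */
static int toy_do_int3_via_notifier(void)
{
	for (size_t i = 0; i < chain_len; i++)
		if (chain[i]())
			return 1;
	return 0;	/* nobody claimed the trap */
}

/* Direct style: the handler is called unconditionally. */
static int toy_do_int3_direct(void)
{
	return toy_kprobe_handler();
}
```

In this model a trap taken before toy_register_notifier() runs goes unhandled on the notifier path but is handled on the direct path.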

Thank you!


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [PATCH 01/10] net: stmmac: Enable stmmac main clock when probing hardware

2013-12-11 Thread Chen-Yu Tsai

Hi,

On Wed, Dec 11, 2013 at 4:05 AM, Maxime Ripard
 wrote:
> Hi,
>
> On Mon, Dec 09, 2013 at 10:43:29AM +0800, Chen-Yu Tsai wrote:
>> >> @@ -2759,15 +2760,18 @@ struct stmmac_priv *stmmac_dvr_probe(struct 
>> >> device *device,
>> >>   }
>> >>   }
>> >>
>> >> + clk_disable_unprepare(priv->stmmac_clk);
>> >> +
>> >
>> > Hu? Why do you disable the clock? don't you need it afterwards?
>>
>> The clock is enabled in *_open (when the network interface is used),
>> and disabled in *_close.
>
> Maybe it is the real issue then.
>
> Why don't you move the clk_disable to _remove then?

I wasn't sure this was the proper way. However, looking around, it
seems other drivers enable the clock in _probe and disable it in
_remove. I will modify stmmac to do so as well.
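
A minimal sketch of the probe/remove pairing agreed on here, as a userspace toy. The toy_* types and functions are assumptions for illustration and are not the stmmac driver or the clk API; the "register read" simply returns 0 when the clock is off.

```c
#include <assert.h>

struct toy_clk {
	int enable_count;
};

static int toy_clk_prepare_enable(struct toy_clk *clk)
{
	clk->enable_count++;
	return 0;
}

static void toy_clk_disable_unprepare(struct toy_clk *clk)
{
	clk->enable_count--;
}

struct toy_priv {
	struct toy_clk clk;
	unsigned int hw_version;
};

/* In this model a register read only returns real data while the clock runs. */
static unsigned int toy_read_version(const struct toy_priv *priv)
{
	return priv->clk.enable_count > 0 ? 0x37u : 0u;
}

static int toy_probe(struct toy_priv *priv)
{
	int ret = toy_clk_prepare_enable(&priv->clk);

	if (ret)
		return ret;
	priv->hw_version = toy_read_version(priv);
	return 0;	/* clock stays on until toy_remove() */
}

static void toy_remove(struct toy_priv *priv)
{
	toy_clk_disable_unprepare(&priv->clk);
}
```

The enable in toy_probe() is balanced by exactly one disable in toy_remove(), so the count returns to zero when the device goes away.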


ChenYu

Re: process 'stuck' at exit.

2013-12-11 Thread Dave Jones

On Tue, Dec 10, 2013 at 02:48:52PM -0800, Linus Torvalds wrote:
 
 > Dave, can you re-create that trinity run and test that patch? I think
 > we've got this

24 hours later, all is well.  I think we can call this one done.

Tested-by: Dave Jones 

Dave


Re: 50 Watt idle power regression bisected to Linux-3.10

2013-12-11 Thread Mike Galbraith

On Wed, 2013-12-11 at 16:52 -0800, H. Peter Anvin wrote: 
> On 12/11/2013 03:14 PM, Borislav Petkov wrote:
> > On Wed, Dec 11, 2013 at 03:08:35PM -0800, H. Peter Anvin wrote:
> >> So I would like to propose that we switch to using a percpu variable
> >> which is a single cache line of nothing at all. It would only ever
> >> be touched by MONITOR and for explicit wakeup. Hopefully that will
> >> resolve this problem without the need for the CLFLUSH.
> > 
> > Yep, makes a lot of sense to me to have an exclusive (overloaded meaning
> > here :-)) cacheline only for that. And, if it works, we'll save us the
> > penalty from the CLFLUSH too, cool.
> > 
> 
> Here is a POC patch... anyone willing to test it out?

Got it built, but it went boom on boot.  Off to rummage. 

[0.00] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:64 
nr_node_ids:8
[0.00] PERCPU: Embedded 26 pages/cpu @88027ee0 s75904 r8192 
d22400 u131072
[0.00] pcpu-alloc: s75904 r8192 d22400 u131072 alloc=1*2097152
[0.00] pcpu-alloc: [0] 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 
[0.00] pcpu-alloc: [0] 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
[0.00] pcpu-alloc: [0] 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 
[0.00] pcpu-alloc: [0] 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 
[0.00] BUG: unable to handle kernel paging request at b8a0
[0.00] IP: [] setup_mwait_doorbell+0x20/0x38
[0.00] PGD 0 
[0.00] Oops: 0002 [#1] SMP 
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 3.13.0-master #185
[0.00] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 
07/07/2010
[0.00] task: 81a10460 ti: 81a0 task.ti: 
81a0
[0.00] RIP: 0010:[]  [] 
setup_mwait_doorbell+0x20/0x38
[0.00] RSP: :81a01f28  EFLAGS: 00010002
[0.00] RAX: 00014880 RBX: 0040 RCX: 
[0.00] RDX: 0040 RSI: 0040 RDI: 81a38e60
[0.00] RBP: 81a01f28 R08: 0040 R09: 
[0.00] R10: 88027f5f4880 R11: 0001 R12: b850
[0.00] R13: b026 R14: b024 R15: b020
[0.00] FS:  () GS:88027ee0() 
knlGS:
[0.00] CS:  0010 DS:  ES:  CR0: 80050033
[0.00] CR2: b8a0 CR3: 01a0b000 CR4: 00b0
[0.00] Stack:
[0.00]  81a01f78 81aa3641 81a01f98 
cd48
[0.00]  88027ee0   

[0.00]    81a01fa8 
81a96d89
[0.00] Call Trace:
[0.00]  [] setup_per_cpu_areas+0x233/0x242
[0.00]  [] start_kernel+0x84/0x370
[0.00]  [] x86_64_start_reservations+0x1b/0x35
[0.00]  [] x86_64_start_kernel+0x12e/0x135
[0.00] Code: 40 8f a7 81 e8 f6 fe ff ff c9 c3 55 48 8b 05 0a bf fd ff 
48 89 e5 a8 08 75 02 c9 c3 0f b7 3d 84 bf fd ff 48 89 fe e8 fe dc 64 ff <48> 89 
05 27 e8 56 7e 48 85 c0 75 e3 48 c7 c7 f0 83 78 81 e8 55 
[0.00] RIP  [] setup_mwait_doorbell+0x20/0x38
[0.00]  RSP 
[0.00] CR2: b8a0
[0.00] ---[ end trace f6e32c58e0729292 ]---
[0.00] Kernel panic - not syncing: Attempted to kill the idle task!

Build delta.

---
 arch/x86/include/asm/mwait.h   |4 ++--
 arch/x86/kernel/cpu/common.c   |7 ---
 arch/x86/kernel/setup_percpu.c |1 +
 3 files changed, 7 insertions(+), 5 deletions(-)

Index: linux-2.6/arch/x86/kernel/cpu/common.c
===
--- linux-2.6.orig/arch/x86/kernel/cpu/common.c
+++ linux-2.6/arch/x86/kernel/cpu/common.c
@@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void)
 }
 
 /* allocate percpu area for mwait doorbell */
-char __percpu *mwait_doorbell;
+DEFINE_PER_CPU(char *, mwait_doorbell);
+EXPORT_PER_CPU_SYMBOL(mwait_doorbell);
 
 void __init setup_mwait_doorbell(void)
 {
if (boot_cpu_has(X86_FEATURE_MWAIT)) {
-   mwait_doorbell = __alloc_percpu(boot_cpu_data.clflush_size,
-   boot_cpu_data.clflush_size);
+   mwait_doorbell = __alloc_percpu(boot_cpu_data.x86_clflush_size,
+   boot_cpu_data.x86_clflush_size);
 
if (!mwait_doorbell) {
/* This should never happen... */
Index: linux-2.6/arch/x86/kernel/setup_percpu.c
===
--- linux-2.6.orig/arch/x86/kernel/setup_percpu.c
+++ linux-2.6/arch/x86/kernel/setup_percpu.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 DEFINE_PER_CPU_READ_MOSTLY(int,

Re: Re: [PATCH -tip v5.1 12/18] ftrace/kprobes: Use NOKPROBE_SYMBOL macro in ftrace

2013-12-11 Thread Masami Hiramatsu

(2013/12/12 10:34), Steven Rostedt wrote:
> On Tue, 10 Dec 2013 09:57:14 +
> Masami Hiramatsu  wrote:
> 
> 
>> --- a/kernel/trace/trace_kprobe.c
>> +++ b/kernel/trace/trace_kprobe.c
>> @@ -51,45 +51,45 @@ struct event_file_link {
>>  (sizeof(struct probe_arg) * (n)))
>>  
>>  
>> -static __kprobes bool trace_probe_is_return(struct trace_probe *tp)
>> +static __always_inline bool trace_probe_is_return(struct trace_probe *tp)
> 
> I wonder if we should have a comment somewhere explaining why we are
> using __always_inline.  Maybe we should add a new annotation:
> 
> #define kprobes_inline   __always_inline
> 
> ?
> 
> The above would be self documenting, and we can also include a comment
> with the define that states why it is there. Otherwise 10 years from
> now, someone is going to see these and say "WTF!" and remove them.

Hm, agreed, and I think nokprobe_inline is better since it is
similar to NOKPROBE_SYMBOL(). :)
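
A sketch of what such a self-documenting annotation could look like, written for userspace so it compiles on its own with GCC or Clang. The attribute spelling and the toy_probe helper are assumptions for illustration; in the kernel the define would presumably build on __always_inline as suggested above.

```c
#include <assert.h>

/*
 * Helpers called from kprobe handlers must not themselves be probed.
 * Forcing them inline folds them into the already protected caller,
 * so no separate function body is left to put a probe on.
 * Do not "clean this up" into a plain inline.
 */
#define nokprobe_inline inline __attribute__((always_inline))

struct toy_probe {
	int is_return;
};

static nokprobe_inline int toy_probe_is_return(const struct toy_probe *tp)
{
	return tp->is_return != 0;
}
```

The comment lives with the define, so a reader who meets nokprobe_inline at a call site has one place to look for the reason.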

[...]
>> @@ -755,8 +755,8 @@ static const struct file_operations kprobe_profile_ops = 
>> {
>>  };
>>  
>>  /* Sum up total data length for dynamic arraies (strings) */
>> -static __kprobes int __get_data_size(struct trace_probe *tp,
>> - struct pt_regs *regs)
>> +static __always_inline
>> +int __get_data_size(struct trace_probe *tp, struct pt_regs *regs)
> 
> This function is used 4 times within the file and is not that small. I
> think it's a bit too big for an inline, and qualifies for a normal
> function with a NOKPROBE_SYMBOL() attached. 

OK, I'll do so.

>> @@ -771,9 +771,9 @@ static __kprobes int __get_data_size(struct trace_probe 
>> *tp,
>>  }
>>  
>>  /* Store the value of each argument */
>> -static __kprobes void store_trace_args(int ent_size, struct trace_probe *tp,
>> -   struct pt_regs *regs,
>> -   u8 *data, int maxlen)
>> +static __always_inline
>> +void store_trace_args(int ent_size, struct trace_probe *tp,
>> +  struct pt_regs *regs, u8 *data, int maxlen)
> 
> Same here (even more so!)

OK.

>>  {
>>  int i;
>>  u32 end = tp->size;
>> @@ -803,7 +803,7 @@ static __kprobes void store_trace_args(int ent_size, 
>> struct trace_probe *tp,
>>  }
>>  
>>  /* Kprobe handler */
>> -static __kprobes void
>> +static __always_inline void
>>  __kprobe_trace_func(struct trace_probe *tp, struct pt_regs *regs,
>>  struct ftrace_event_file *ftrace_file)
> 
> OK, this one is big, but it's only used once.

Right, at least in my build binary, it is inlined.

Thank you,


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [PATCH v2 tip/core/rcu 0/4] Documentation changes for 3.14

2013-12-11 Thread Josh Triplett

On Wed, Dec 11, 2013 at 03:08:23PM -0800, Paul E. McKenney wrote:
> Hello!
> 
> This series once again attempts to improve rcu_assign_pointer()'s
> relationship with sparse.
> 
> 1.Add a comment indicating that despite appearances,
>   rcu_assign_pointer() really only evaluates its arguments once,
>   as a cpp macro should.
> 
> 2.Replace rcu_assign_pointer() of NULL with RCU_INIT_POINTER() to
>   silence a sparse warning.
> 
> 3.Apply ACCESS_ONCE() to rcu_assign_pointer()'s target to prevent
>   compiler mischief.  Also require that the source pointer be from
>   the kernel address space.  Sometimes it can be from the RCU address
>   space, which necessitates the remaining patches in this series.
>   Which, it must be admitted, apply to a very small fraction of
>   the rcu_assign_pointer() invocations in the kernel.  This commit
>   courtesy of Josh Triplett.
> 
> 4.Add an RCU_INITIALIZER() for compile-time initialization of
>   global RCU-protected pointers.

For all the patches (other than the one I wrote, for obvious reasons):
Reviewed-by: Josh Triplett 
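
Item 1 above (a macro that names its argument more than once yet evaluates it once) can be illustrated with a small GNU C sketch. This is not the kernel macro: toy_assign_pointer, next_value and shared are made-up names, and the fence only stands in for the kernel's barrier, so this is not a usable RCU primitive. It relies on the fact that __typeof__ does not evaluate its operand for non-variably-modified types.

```c
#include <assert.h>
#include <stdatomic.h>

static int evaluations;
static int value = 42;

/* Side effect lets us count how often the macro argument is evaluated. */
static int *next_value(void)
{
	evaluations++;
	return &value;
}

/*
 * v is named twice, but the first use sits inside __typeof__, which
 * only inspects the type. Only the second use is actually evaluated.
 */
#define toy_assign_pointer(p, v)				\
	do {							\
		atomic_thread_fence(memory_order_release);	\
		(p) = (__typeof__(*(v)) *)(v);			\
	} while (0)

static int *shared;
```

Calling toy_assign_pointer(shared, next_value()) bumps the counter exactly once in this sketch.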

[PATCH 2/2] drivers: net: cpsw: fix for cpsw crash when build as modules

2013-12-11 Thread Felipe Balbi

From: Mugunthan V N 

When CPSW and Davinci MDIO are built as modules, CPSW crashes when
accessing CPSW registers in CPSW probe. The built-in case works because
the CPSW clocks are already enabled by the Davinci MDIO probe. Fix this
by enabling the clocks before accessing the version register and by
moving the other register accesses to cpsw device open.

Signed-off-by: Mugunthan V N 
Signed-off-by: Felipe Balbi 
---
 drivers/net/ethernet/ti/cpsw.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index a91f0c9..5120d9c 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1151,6 +1151,12 @@ static int cpsw_ndo_open(struct net_device *ndev)
 * receive descs
 */
cpsw_info(priv, ifup, "submitted %d rx descriptors\n", i);
+
+   if (cpts_register(&priv->pdev->dev, priv->cpts,
+ priv->data.cpts_clock_mult,
+ priv->data.cpts_clock_shift))
+   dev_err(priv->dev, "error registering cpts device\n");
+
}
 
/* Enable Interrupt pacing if configured */
@@ -1197,6 +1203,7 @@ static int cpsw_ndo_stop(struct net_device *ndev)
netif_carrier_off(priv->ndev);
 
if (cpsw_common_res_usage_state(priv) <= 1) {
+   cpts_unregister(priv->cpts);
cpsw_intr_disable(priv);
cpdma_ctlr_int_ctrl(priv->dma, false);
cpdma_ctlr_stop(priv->dma);
@@ -1985,9 +1992,15 @@ static int cpsw_probe(struct platform_device *pdev)
goto clean_runtime_disable_ret;
}
priv->regs = ss_regs;
-   priv->version = __raw_readl(&priv->regs->id_ver);
priv->host_port = HOST_PORT_NUM;
 
+   /* Need to enable clocks with runtime PM api to access module
+* registers
+*/
+   pm_runtime_get_sync(&pdev->dev);
+   priv->version = readl(&priv->regs->id_ver);
+   pm_runtime_put_sync(&pdev->dev);
+
res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
priv->wr_regs = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(priv->wr_regs)) {
@@ -2157,8 +2170,6 @@ static int cpsw_remove(struct platform_device *pdev)
unregister_netdev(cpsw_get_slave_ndev(priv, 1));
unregister_netdev(ndev);
 
-   cpts_unregister(priv->cpts);
-
cpsw_ale_destroy(priv->ale);
cpdma_chan_destroy(priv->txch);
cpdma_chan_destroy(priv->rxch);
-- 
1.8.4.GIT


[PATCH 1/2] drivers: net: cpsw: fix dt probe for one port ethernet

2013-12-11 Thread Felipe Balbi

From: Mugunthan V N 

When only one of the two ports is pinned out, dt probe fails because
the second port's phy is not found. Fix this by checking the number of
slaves and breaking out of the loop once that many have been parsed.

Signed-off-by: Mugunthan V N 
Signed-off-by: Felipe Balbi 
---

both patches were taken from TI's 3.12 tree [1]
and have been tested on am335x, am437x and
dra7xx.

Mugunthan, I took the patches because I got bug reports
on v3.13-rc which these patches fix. Let me know if you
prefer to send another version of them for whatever
reason.

cheers

 drivers/net/ethernet/ti/cpsw.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 7536a4c..a91f0c9 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1816,6 +1816,8 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
}
 
i++;
+   if (i == data->slaves)
+   break;
}
 
return 0;
-- 
1.8.4.GIT
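
The effect of the early break can be sketched with a toy loop in userspace C. toy_platform_data, toy_probe_dt() and the child_has_phy array are hypothetical; a zero entry models a port that is not pinned out and therefore has no phy to find.

```c
#include <assert.h>

struct toy_platform_data {
	int slaves;	/* number of ports actually in use */
	int probed;	/* ports parsed so far */
};

/*
 * Walk the child nodes. A child without a phy makes the probe fail,
 * unless the loop already stopped after data->slaves entries.
 */
static int toy_probe_dt(struct toy_platform_data *data,
			const int *child_has_phy, int nchildren,
			int stop_at_slaves)
{
	int i = 0;

	for (int c = 0; c < nchildren; c++) {
		if (!child_has_phy[c])
			return -1;	/* phy not found */
		data->probed++;
		i++;
		if (stop_at_slaves && i == data->slaves)
			break;	/* the check this patch adds */
	}
	return 0;
}
```

With one slave configured and a second child that has no phy, the loop fails without the check and succeeds with it.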


Re: [PATCH tty-next 0/4] tty: Fix ^C echo

2013-12-11 Thread Peter Hurley


On 12/04/2013 07:13 PM, One Thousand Gnomes wrote:

Not so much confused as simply merged. Input processing is inherently
single-threaded; it makes sense to rely on that at the highest level
possible.


I would disagree entirely. You want to minimise the areas affected by a
given lock. You also want to lock data not code. Correctness comes before
speed. You optimise it when it's right, otherwise you end up in a nasty
mess when you discover you've optimised to assumptions that are flawed.


Sorry for the delayed reply, Alan; what little free time I had was spent
snuffing out regressions :/

Sure, I understand that ideally locks protect data, not operations.
But I think maybe you're missing my point. Almost every lock, even at
inception, is somewhat optimized; otherwise, every datum would have its
own lock. Eliminating overlapping locks is a common optimization in stable
code.

In this case, an already broken bit of code is simply still broken.
buf->lock is also fairly simple to break apart (although I don't want to
because of the performance hit) which is not characteristic of locks
which protect operations.



Firewire, which is capable of sustained throughput in excess of 40MB/sec,
struggles to get over 5MB/sec through the tty layer. [And drm output
is orders-of-magnitude slower than that, which is just sad...]


And what protocols do you care about 5MB/second - n_tty - no ? For the
high speed protocols you are trying to fix a lost cause. By the time
we've gone piddling around with tty buffers and serialized tty queues
firing bytes through tasks and the like you already lost.

For drm I assume you mean the framebuffer console logic ? Last time I
benched that except for the Poulsbo it was bottlenecked on the GPU - not
that I can type at 5MB/second anyway. Not that fixing the performance of
the various bits wouldn't be a good thing too especially on the output
end.


For drm, I actually mean GEM object deletion, which is typically fenced
and thus appears to be GPU-bound. What's really needed there is deferred
deletion, like kfree_rcu(), with partial synchronization on allocation
failures only.

I mostly care about output speed; unfortunately, that's the input side
at the other end :)


While that would work, it's expensive extra locking in a path that 99.999%
of the time doesn't need it. I'd rather explore other solutions.


How about getting the high speed paths out of the whole tty buffer
layer ? Almost every line discipline can be a fastpath directly to the
network layer. If optimisation is the new obsession then we can cut the
crap entirely by optimising for networking not making it a slave of n_tty.

Starting at the beginning

we have locks on rx because
- we want serialized rx
- we have buffer lifetimes
- we have buffer queues
- we have loads of flow control parameters

Only n_tty needs the buffers (maybe some of irda but irda hasn't worked
for years afaik). IRQ receive paths are serialized (and as a bonus can be
pinned to a CPU). Flow control is n_tty stuff, everyone else simply fires
it at their network layer as fast as possible and net already does the
work.

Keep a single tty_buf in the tty for batching at any given time, and
private so no locks at all

Have a wrapper via
ld->receive(tty, buf)

which fires the tty_buf at the ldisc and allocates a new empty one

tty_queue_bytes(tty, buf, flags, len)

which adds to the buffer, and if full calls ld->queue and then carries on
the copying cycle

and

ld->receive_direct(tty, buf, flags, len)

which allows block mode devices to blast bytes directly at the queue (ie
all the USB 3G stuff, firewire, etc) without going via any additional
copies.

For almost all ldiscs

ld->receive would be

ld->receive_direct(tty, buf->buf, buf->flags, buf->len);
free buffer

For n_tty type stuff

ld->receive is basically much of tty_flip_buffer_push

ld->receive_direct allocates tty_buffers and copies into it

We may even be able to optimise some of the n_tty cases into the
fastpath afterwards (notably raw, no echo)

For anything receiving in blocks that puts us close to (but not quite at)
ethernet kinds of cleanness for network buffer delivery.

Worth me looking into ?
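
The receive path proposed above can be sketched as a userspace toy. This is only one reading of the proposal, under stated assumptions: the struct layouts, BATCH_SIZE and the counting ldisc are invented, the flags argument is left out for brevity, and the "buffer full" hand-off is modeled as a call to receive_direct. It shows a single private batch buffer per tty, tty_queue_bytes() filling it and flushing when full, and block-mode callers bypassing the batch entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH_SIZE 8

struct tty;

struct tty_buf {
	unsigned char data[BATCH_SIZE];
	size_t len;
};

struct ldisc {
	/* Block-mode entry point: bytes go straight to the ldisc. */
	void (*receive_direct)(struct tty *tty, const unsigned char *buf,
			       size_t len);
};

struct tty {
	struct tty_buf batch;		/* single private batch buffer */
	struct ldisc *ld;
	size_t delivered;		/* bytes the ldisc has seen */
	unsigned int deliveries;	/* number of receive_direct calls */
};

/* A trivial ldisc that only counts what it is given. */
static void count_receive_direct(struct tty *tty, const unsigned char *buf,
				 size_t len)
{
	(void)buf;
	tty->delivered += len;
	tty->deliveries++;
}

/* Hand the current batch to the ldisc and start an empty one. */
static void tty_flush_batch(struct tty *tty)
{
	if (tty->batch.len) {
		tty->ld->receive_direct(tty, tty->batch.data, tty->batch.len);
		tty->batch.len = 0;
	}
}

/* Byte-at-a-time drivers: append to the batch, flush whenever it fills. */
static void tty_queue_bytes(struct tty *tty, const unsigned char *buf,
			    size_t len)
{
	while (len) {
		size_t room = BATCH_SIZE - tty->batch.len;
		size_t n = len < room ? len : room;

		memcpy(tty->batch.data + tty->batch.len, buf, n);
		tty->batch.len += n;
		buf += n;
		len -= n;
		if (tty->batch.len == BATCH_SIZE)
			tty_flush_batch(tty);
	}
}
```

Because the batch is private to one tty and only touched from its serialized receive path, the sketch needs no lock around it, which is the property the proposal is after.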


I have to give this a lot more thought.

The universality of n_tty is important, and costs real cycles on servers and
such. It's not just about typing speed.


The clock/generation method seems like it might yield a lockless solution
for this problem, but maybe creates another one because the driver-side
would need to stamp the buffer (in essence, a flush could affect data
that has not yet been copied from the driver).


But it has arrived in the driver so might not matter. That requires a
little thought!


This is my next experiment.

Regards,
Peter Hurley

[PATCH -next 1/2] seq_file: Rename static bool seq_overflow to public bool seq_is_buf_full

2013-12-11 Thread Joe Perches

The return values of seq_printf/puts/putc are frequently misused.

Start down a path to remove all the return value uses of these
functions.

Make the static bool seq_overflow public along with a rename of
the function to seq_is_buf_full.  Rename the still static
seq_set_overflow to seq_set_buf_full.

Update the documentation to not show return types for seq_printf
et al.  Add a description of seq_is_buf_full.

Signed-off-by: Joe Perches 
---
 Documentation/filesystems/seq_file.txt | 28 
 fs/seq_file.c  | 28 ++--
 include/linux/seq_file.h   |  8 
 3 files changed, 38 insertions(+), 26 deletions(-)

diff --git a/Documentation/filesystems/seq_file.txt 
b/Documentation/filesystems/seq_file.txt
index a1e2e0d..794dbde 100644
--- a/Documentation/filesystems/seq_file.txt
+++ b/Documentation/filesystems/seq_file.txt
@@ -171,27 +171,23 @@ output must be passed to the seq_file code. Some utility 
functions have
 been defined which make this task easy.
 
 Most code will simply use seq_printf(), which works pretty much like
-printk(), but which requires the seq_file pointer as an argument. It is
-common to ignore the return value from seq_printf(), but a function
-producing complicated output may want to check that value and quit if
-something non-zero is returned; an error return means that the seq_file
-buffer has been filled and further output will be discarded.
+printk(), but which requires the seq_file pointer as an argument.
 
 For straight character output, the following functions may be used:
 
-   int seq_putc(struct seq_file *m, char c);
-   int seq_puts(struct seq_file *m, const char *s);
-   int seq_escape(struct seq_file *m, const char *s, const char *esc);
+   seq_putc(struct seq_file *m, char c);
+   seq_puts(struct seq_file *m, const char *s);
+   seq_escape(struct seq_file *m, const char *s, const char *esc);
 
 The first two output a single character and a string, just like one would
 expect. seq_escape() is like seq_puts(), except that any character in s
 which is in the string esc will be represented in octal form in the output.
 
-There is also a pair of functions for printing filenames:
+There are also a pair of functions for printing filenames:
 
-   int seq_path(struct seq_file *m, struct path *path, char *esc);
-   int seq_path_root(struct seq_file *m, struct path *path,
- struct path *root, char *esc)
+   seq_path(struct seq_file *m, struct path *path, char *esc);
+   seq_path_root(struct seq_file *m, struct path *path,
+ struct path *root, char *esc)
 
 Here, path indicates the file of interest, and esc is a set of characters
 which should be escaped in the output.  A call to seq_path() will output
@@ -200,6 +196,14 @@ root is desired, it can be used with seq_path_root().  
Note that, if it
 turns out that path cannot be reached from root, the value of root will be
 changed in seq_file_root() to a root which *does* work.
 
+A function producing complicated output may want to check
+   bool seq_is_buf_full(struct seq_file *m);
+and avoid further seq_ calls if true is returned.
+
+A true return from seq_is_buf_full means that the seq_file buffer is full
+and further output will be discarded.  The seq_show function will attempt
+to allocate a larger buffer and retry printing.
+
 
 Making it all work
 
diff --git a/fs/seq_file.c b/fs/seq_file.c
index 1d641bb..2fda3a1 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -14,18 +14,18 @@
 #include 
 #include 
 
-
 /*
- * seq_files have a buffer which can may overflow. When this happens a larger
+ * seq_files have a buffer which may overflow. When this happens a larger
  * buffer is reallocated and all the data will be printed again.
  * The overflow state is true when m->count == m->size.
  */
-static bool seq_overflow(struct seq_file *m)
+bool seq_is_buf_full(struct seq_file *m)
 {
return m->count == m->size;
 }
+EXPORT_SYMBOL(seq_is_buf_full);
 
-static void seq_set_overflow(struct seq_file *m)
+static void seq_set_buf_full(struct seq_file *m)
 {
m->count = m->size;
 }
@@ -112,7 +112,7 @@ static int traverse(struct seq_file *m, loff_t offset)
error = 0;
m->count = 0;
}
-   if (seq_overflow(m))
+   if (seq_is_buf_full(m))
goto Eoverflow;
if (pos + m->count > offset) {
m->from = offset - pos;
@@ -255,7 +255,7 @@ Fill:
break;
}
err = m->op->show(m, p);
-   if (seq_overflow(m) || err) {
+   if (seq_is_buf_full(m) || err) {
m->count = offs;
if (likely(err <= 0))
break;
@@ -384,7 +384,7 @@ int seq_escape(struct seq_file *m, const char *s, const 
char
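
How a caller might use seq_is_buf_full() in place of return values can be sketched with a userspace toy. toy_seq and the toy_* functions are stand-ins, not the fs/seq_file.c code; the overflow rule (mark the buffer full by setting count to size) follows the hunks above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_seq {
	char *buf;
	size_t size;
	size_t count;
};

static bool toy_seq_is_buf_full(const struct toy_seq *m)
{
	return m->count == m->size;
}

/* Returns nothing: on overflow it just marks the buffer full. */
static void toy_seq_puts(struct toy_seq *m, const char *s)
{
	size_t len = strlen(s);

	if (m->count + len < m->size) {
		memcpy(m->buf + m->count, s, len);
		m->count += len;
		return;
	}
	m->count = m->size;	/* same idea as seq_set_buf_full() */
}

/*
 * A show routine that stops early once the buffer is full instead of
 * checking each return value. Returns the number of items attempted.
 */
static int toy_show(struct toy_seq *m, const char *const *items, int n)
{
	int i;

	for (i = 0; i < n && !toy_seq_is_buf_full(m); i++)
		toy_seq_puts(m, items[i]);
	return i;
}
```

When toy_show() leaves the buffer full, the caller would retry with a larger buffer, which mirrors what the documentation hunk says seq_show does.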

[PATCH -next 2/2] netfilter: Convert print_tuple functions to return void

2013-12-11 Thread Joe Perches

Now that seq_file has a new function (seq_is_buf_full), there is no
value in having functions called from seq_show return anything.
Remove the int returns of the various print_tuple/_print_tuple
functions.

Signed-off-by: Joe Perches 
---
 include/net/netfilter/nf_conntrack_core.h|  2 +-
 include/net/netfilter/nf_conntrack_l3proto.h |  4 ++--
 include/net/netfilter/nf_conntrack_l4proto.h |  4 ++--
 net/netfilter/nf_conntrack_l3proto_generic.c |  5 ++---
 net/netfilter/nf_conntrack_proto_dccp.c  | 10 +-
 net/netfilter/nf_conntrack_proto_generic.c   |  5 ++---
 net/netfilter/nf_conntrack_proto_gre.c   | 10 +-
 net/netfilter/nf_conntrack_proto_sctp.c  | 10 +-
 net/netfilter/nf_conntrack_proto_tcp.c   | 10 +-
 net/netfilter/nf_conntrack_proto_udp.c   | 10 +-
 net/netfilter/nf_conntrack_proto_udplite.c   | 10 +-
 net/netfilter/nf_conntrack_standalone.c  | 15 +++
 12 files changed, 46 insertions(+), 49 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_core.h 
b/include/net/netfilter/nf_conntrack_core.h
index 15308b8..7b8f18c 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -72,7 +72,7 @@ static inline int nf_conntrack_confirm(struct sk_buff *skb)
return ret;
 }
 
-int
+void
 print_tuple(struct seq_file *s, const struct nf_conntrack_tuple *tuple,
 const struct nf_conntrack_l3proto *l3proto,
 const struct nf_conntrack_l4proto *proto);
diff --git a/include/net/netfilter/nf_conntrack_l3proto.h 
b/include/net/netfilter/nf_conntrack_l3proto.h
index 3efab70..9e349db 100644
--- a/include/net/netfilter/nf_conntrack_l3proto.h
+++ b/include/net/netfilter/nf_conntrack_l3proto.h
@@ -38,8 +38,8 @@ struct nf_conntrack_l3proto {
 const struct nf_conntrack_tuple *orig);
 
/* Print out the per-protocol part of the tuple. */
-   int (*print_tuple)(struct seq_file *s,
-  const struct nf_conntrack_tuple *);
+   void (*print_tuple)(struct seq_file *s,
+   const struct nf_conntrack_tuple *);
 
/*
 * Called before tracking. 
diff --git a/include/net/netfilter/nf_conntrack_l4proto.h 
b/include/net/netfilter/nf_conntrack_l4proto.h
index 4c8d573..fead8ee 100644
--- a/include/net/netfilter/nf_conntrack_l4proto.h
+++ b/include/net/netfilter/nf_conntrack_l4proto.h
@@ -56,8 +56,8 @@ struct nf_conntrack_l4proto {
 u_int8_t pf, unsigned int hooknum);
 
/* Print out the per-protocol part of the tuple. Return like seq_* */
-   int (*print_tuple)(struct seq_file *s,
-  const struct nf_conntrack_tuple *);
+   void (*print_tuple)(struct seq_file *s,
+   const struct nf_conntrack_tuple *);
 
/* Print out the private part of the conntrack. */
int (*print_conntrack)(struct seq_file *s, struct nf_conn *);
diff --git a/net/netfilter/nf_conntrack_l3proto_generic.c 
b/net/netfilter/nf_conntrack_l3proto_generic.c
index e7eb807..cf9ace7 100644
--- a/net/netfilter/nf_conntrack_l3proto_generic.c
+++ b/net/netfilter/nf_conntrack_l3proto_generic.c
@@ -49,10 +49,9 @@ static bool generic_invert_tuple(struct nf_conntrack_tuple 
*tuple,
return true;
 }
 
-static int generic_print_tuple(struct seq_file *s,
-   const struct nf_conntrack_tuple *tuple)
+static void generic_print_tuple(struct seq_file *s,
+   const struct nf_conntrack_tuple *tuple)
 {
-   return 0;
 }
 
 static int generic_get_l4proto(const struct sk_buff *skb, unsigned int nhoff,
diff --git a/net/netfilter/nf_conntrack_proto_dccp.c 
b/net/netfilter/nf_conntrack_proto_dccp.c
index a99b6c3..d357f11 100644
--- a/net/netfilter/nf_conntrack_proto_dccp.c
+++ b/net/netfilter/nf_conntrack_proto_dccp.c
@@ -618,12 +618,12 @@ out_invalid:
return -NF_ACCEPT;
 }
 
-static int dccp_print_tuple(struct seq_file *s,
-   const struct nf_conntrack_tuple *tuple)
+static void dccp_print_tuple(struct seq_file *s,
+const struct nf_conntrack_tuple *tuple)
 {
-   return seq_printf(s, "sport=%hu dport=%hu ",
- ntohs(tuple->src.u.dccp.port),
- ntohs(tuple->dst.u.dccp.port));
+   seq_printf(s, "sport=%hu dport=%hu ",
+  ntohs(tuple->src.u.dccp.port),
+  ntohs(tuple->dst.u.dccp.port));
 }
 
 static int dccp_print_conntrack(struct seq_file *s, struct nf_conn *ct)
diff --git a/net/netfilter/nf_conntrack_proto_generic.c 
b/net/netfilter/nf_conntrack_proto_generic.c
index d25f2937..0a3ded1 100644
--- a/net/netfilter/nf_conntrack_proto_generic.c
+++ b/net/netfilter/nf_conntrack_proto_generic.c
@@ -39,10 +39,9 @@ static bool generic_invert_tuple(struct nf_conntrack_tuple 
*tuple,
 }
 
 /* Print out the per-protocol part of the tuple.

[PATCH -next 0/2] seq_file/netfilter: Start removing returns from seq_

2013-12-11 Thread Joe Perches

The return values from seq_printf/puts/putc/etc. are frequently misused.
Start removing the uses of the return values.

Joe Perches (2):
  seq_file: Rename static bool seq_overflow to public bool seq_is_buf_full
  netfilter: Convert print_tuple functions to return void

 Documentation/filesystems/seq_file.txt   | 28 
 fs/seq_file.c| 28 ++--
 include/linux/seq_file.h |  8 
 include/net/netfilter/nf_conntrack_core.h|  2 +-
 include/net/netfilter/nf_conntrack_l3proto.h |  4 ++--
 include/net/netfilter/nf_conntrack_l4proto.h |  4 ++--
 net/netfilter/nf_conntrack_l3proto_generic.c |  5 ++---
 net/netfilter/nf_conntrack_proto_dccp.c  | 10 +-
 net/netfilter/nf_conntrack_proto_generic.c   |  5 ++---
 net/netfilter/nf_conntrack_proto_gre.c   | 10 +-
 net/netfilter/nf_conntrack_proto_sctp.c  | 10 +-
 net/netfilter/nf_conntrack_proto_tcp.c   | 10 +-
 net/netfilter/nf_conntrack_proto_udp.c   | 10 +-
 net/netfilter/nf_conntrack_proto_udplite.c   | 10 +-
 net/netfilter/nf_conntrack_standalone.c  | 15 +++
 15 files changed, 84 insertions(+), 75 deletions(-)

-- 
1.8.1.2.459.gbcd45b4.dirty

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
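
A minimal userspace sketch of the pattern this series moves toward, not
kernel code. The names demo_seq, demo_seq_puts(), demo_seq_is_full() and
demo_print_tuple() are hypothetical stand-ins: the print helper returns
void, and a caller that cares about overflow asks the buffer afterwards
instead of interpreting a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a seq_file-style buffer. */
struct demo_seq {
	char buf[16];
	size_t count;	/* bytes used; count == sizeof(buf) marks overflow */
};

/* Append a string; on overflow, mark the buffer full and drop the data. */
static void demo_seq_puts(struct demo_seq *m, const char *s)
{
	size_t len = strlen(s);

	if (m->count + len < sizeof(m->buf)) {
		memcpy(m->buf + m->count, s, len);
		m->count += len;
		return;
	}
	m->count = sizeof(m->buf);
}

/* Callers query overflow explicitly rather than via a return code. */
static bool demo_seq_is_full(const struct demo_seq *m)
{
	return m->count == sizeof(m->buf);
}

/* A print_tuple-style helper that returns void. */
static void demo_print_tuple(struct demo_seq *m)
{
	demo_seq_puts(m, "sport=1 ");
	demo_seq_puts(m, "dport=2 ");
}
```

With this shape, only the top-level show routine needs to look at the
overflow state; the per-protocol helpers have nothing to return.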

Re: [PATCH 3/3] rcu: use simple wait queues where possible in rcutree

2013-12-11 Thread Paul E. McKenney

On Wed, Dec 11, 2013 at 08:06:39PM -0500, Paul Gortmaker wrote:
> From: Thomas Gleixner 
> 
> As of commit dae6e64d2bcfd4b06304ab864c7e3a4f6b5fedf4 ("rcu: Introduce
> proper blocking to no-CBs kthreads GP waits") the rcu subsystem started
> making use of wait queues.
> 
> Here we convert all additions of rcu wait queues to use simple wait queues,
> since they don't need the extra overhead of the full wait queue features.
> 
> Originally this was done for RT kernels, since we would get things like...
> 
>   BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
>   in_atomic(): 1, irqs_disabled(): 1, pid: 8, name: rcu_preempt
>   Pid: 8, comm: rcu_preempt Not tainted
>   Call Trace:
>[] __might_sleep+0xd0/0xf0
>[] rt_spin_lock+0x24/0x50
>[] __wake_up+0x36/0x70
>[] rcu_gp_kthread+0x4d2/0x680
>[] ? __init_waitqueue_head+0x50/0x50
>[] ? rcu_gp_fqs+0x80/0x80
>[] kthread+0xdb/0xe0
>[] ? finish_task_switch+0x52/0x100
>[] kernel_thread_helper+0x4/0x10
>[] ? __init_kthread_worker+0x60/0x60
>[] ? gs_change+0xb/0xb
> 
> ...and hence simple wait queues were deployed on RT out of necessity
> (as simple wait uses a raw lock), but mainline might as well take
> advantage of the more streamlined support as well.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: Paul E. McKenney 
> Signed-off-by: Sebastian Andrzej Siewior 
> Signed-off-by: Steven Rostedt 
> [PG: adapt from multiple v3.10-rt patches and add a commit log.]
> Signed-off-by: Paul Gortmaker 

You got the swake_up_all() this time, so:

Reviewed-by: Paul E. McKenney 

;-)

> ---
>  kernel/rcu/tree.c| 16 
>  kernel/rcu/tree.h|  7 ---
>  kernel/rcu/tree_plugin.h | 14 +++---
>  3 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index dd08198..b35babb 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1550,9 +1550,9 @@ static int __noreturn rcu_gp_kthread(void *arg)
>   trace_rcu_grace_period(rsp->name,
>  ACCESS_ONCE(rsp->gpnum),
>  TPS("reqwait"));
> - wait_event_interruptible(rsp->gp_wq,
> -  ACCESS_ONCE(rsp->gp_flags) &
> -  RCU_GP_FLAG_INIT);
> + swait_event_interruptible(rsp->gp_wq,
> +   ACCESS_ONCE(rsp->gp_flags) &
> +   RCU_GP_FLAG_INIT);
>   if (rcu_gp_init(rsp))
>   break;
>   cond_resched();
> @@ -1576,7 +1576,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>   trace_rcu_grace_period(rsp->name,
>  ACCESS_ONCE(rsp->gpnum),
>  TPS("fqswait"));
> - ret = wait_event_interruptible_timeout(rsp->gp_wq,
> + ret = swait_event_interruptible_timeout(rsp->gp_wq,
>   ((gf = ACCESS_ONCE(rsp->gp_flags)) &
>RCU_GP_FLAG_FQS) ||
>   (!ACCESS_ONCE(rnp->qsmask) &&
> @@ -1625,7 +1625,7 @@ static void rsp_wakeup(struct irq_work *work)
>   struct rcu_state *rsp = container_of(work, struct rcu_state, wakeup_work);
> 
>   /* Wake up rcu_gp_kthread() to start the grace period. */
> - wake_up(&rsp->gp_wq);
> + swake_up(&rsp->gp_wq);
>  }
> 
>  /*
> @@ -1701,7 +1701,7 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
>  {
>   WARN_ON_ONCE(!rcu_gp_in_progress(rsp));
>   raw_spin_unlock_irqrestore(&rcu_get_root(rsp)->lock, flags);
> - wake_up(&rsp->gp_wq);  /* Memory barrier implied by wake_up() path. */
> + swake_up(&rsp->gp_wq);  /* Memory barrier implied by swake_up() path. */
>  }
> 
>  /*
> @@ -2271,7 +2271,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
>   }
>   rsp->gp_flags |= RCU_GP_FLAG_FQS;
>   raw_spin_unlock_irqrestore(&rnp_old->lock, flags);
> - wake_up(&rsp->gp_wq);  /* Memory barrier implied by wake_up() path. */
> + swake_up(&rsp->gp_wq); /* Memory barrier implied by swake_up() path. */
>  }
> 
>  /*
> @@ -3304,7 +3304,7 @@ static void __init rcu_init_one(struct rcu_state *rsp,
>   }
> 
>   rsp->rda = rda;
> - init_waitqueue_head(&rsp->gp_wq);
> + init_swaitqueue_head(&rsp->gp_wq);
>   init_irq_work(&rsp->wakeup_work, rsp_wakeup);
>   rnp = rsp->level[rcu_num_lvls - 1];
>   for_each_possible_cpu(i) {
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 52be957..01476e1 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  /*
>   * Define shape of hierarchy based on NR_CPUS, CONFIG_RCU_FANOUT,

Re: kernel BUG in munlock_vma_pages_range

2013-12-11 Thread Sasha Levin


On 12/11/2013 05:59 PM, Vlastimil Babka wrote:

On 12/09/2013 09:26 PM, Sasha Levin wrote:

On 12/09/2013 12:12 PM, Vlastimil Babka wrote:

On 12/09/2013 06:05 PM, Sasha Levin wrote:

On 12/09/2013 04:34 AM, Vlastimil Babka wrote:

Hello, I will look at it, thanks.
Do you have specific reproduction instructions?


Not really, the fuzzer hit it once and I've been unable to trigger it again.
Looking at the piece of code involved it might have had something to do with
hugetlbfs, so I'll crank up testing on that part.


Thanks. Do you have trinity log and the .config file? I'm currently unable to
even boot linux-next with my config/setup due to a GPF.
Looking at code I wouldn't expect that it could encounter a tail page, without
first encountering a head page and skipping the whole huge page. At least in
THP case, as TLB pages should be split when a vma is split. As for hugetlbfs,
it should be skipped for mlock/munlock operations completely. One of these
assumptions is probably failing here...


If it helps, I've added a dump_page() in case we hit a tail page there and got:

[  980.172299] page:ea003e5e8040 count:0 mapcount:1 mapping:          (null) index:0x0
[  980.173412] page flags: 0x2f80008000(tail)

I can also add anything else in there to get other debug output if you think of
something else useful.


Please try the following. Thanks in advance.


[  428.499889] page:ea003e5c0040 count:0 mapcount:4 mapping:          (null) index:0x0
[  428.499889] page flags: 0x2f80008000(tail)
[  428.499889] start=140117131923456 pfn=16347137 orig_start=140117130543104 page_increm=1 vm_start=140117130543104 vm_end=140117134688256 vm_flags=135266419
[  428.499889] first_page pfn=16347136
[  428.499889] page:ea003e5c count:204 mapcount:44 mapping:880fb5c466c1 index:0x7f6f8fe00
[  428.499889] page flags: 0x2f80084068(uptodate|lru|active|head|swapbacked)
[  428.499889] pc:880fcfb7 pc->flags:2 pc->mem_cgroup:c90006034000
[  428.374171]  
[  428.374171]     

[  428.374171] Call Trace:
[  428.374171]  [] exit_mmap+0x59/0x170
[  428.374171]  [] ? __khugepaged_exit+0xe0/0x150
[  428.374171]  [] ? kmem_cache_free+0x26b/0x370
[  428.374171]  [] ? __khugepaged_exit+0xe0/0x150
[  428.374171]  [] mmput+0x70/0xe0
[  428.374171]  [] exit_mm+0x18d/0x1a0
[  428.374171]  [] ? acct_collect+0x175/0x1b0
[  428.374171]  [] do_exit+0x26f/0x520
[  428.374171]  [] do_group_exit+0xa9/0xe0
[  428.374171]  [] get_signal_to_deliver+0x4e2/0x570
[  428.374171]  [] do_signal+0x4b/0x120
[  428.374171]  [] ? vtime_account_user+0x96/0xb0
[  428.374171]  [] ? _raw_spin_unlock+0x35/0x60
[  428.374171]  [] ? vtime_account_user+0x96/0xb0
[  428.374171]  [] ? context_tracking_user_exit+0xb8/0x1d0
[  428.374171]  [] ? trace_hardirqs_on+0xd/0x10
[  428.374171]  [] do_notify_resume+0x5a/0xe0
[  428.374171]  [] int_signal+0x12/0x17
[  428.374171] Code: 46 85 31 c0 e8 f9 60 12 03 48 8b 5b 30 48 c7 c7 b0 92 46 85 4a 8d 34 33 31 c0 
48 c1 fe 06 e8 df 60 12 03 48 89 df e8 97 e1 fc ff <0f> 0b 0f 1f 44 00 00 eb fe 66 0f 1f 44 00 00 48 
8b 03 66 85 c0

[  428.374171] RIP  [] munlock_vma_pages_range+0x109/0x240
[  428.374171]  RSP 

Thanks,
Sasha

Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data

2013-12-11 Thread Dave Young

On 12/11/13 at 11:20pm, Borislav Petkov wrote:
> On Mon, Dec 09, 2013 at 05:42:22PM +0800, Dave Young wrote:
> > Add a new setup_data type SETUP_EFI for kexec use.
> > Passing the saved fw_vendor, runtime, config tables and efi runtime 
> > mappings.
> > 
> > When entering virtual mode, directly map the efi runtime regions which
> > we passed in previously, and skip the step to call SetVirtualAddressMap.
> > 
> > Specifically for the HP z420 workstation we need to save the smbios physical address.
> > The kernel boot sequence proceeds in the following order.  Step 2
> > requires efi.smbios to be the physical address.  However, I found that on
> > HP z420 EFI system table has a virtual address of SMBIOS in step 1.  Hence,
> > we need to set it back to the physical address with the smbios in
> > efi_setup_data.  (When it is still the physical address, it simply sets
> > the same value.)
> > 
> > 1. efi_init() - Set efi.smbios from EFI system table
> > 2. dmi_scan_machine() - Temporarily map efi.smbios to access SMBIOS table
> > 3. efi_enter_virtual_mode() - Map EFI ranges
> > 
> > Tested on ovmf+qemu, lenovo thinkpad, a dell laptop and an
> > HP z420 workstation.
> > 
> > v2: refresh based on previous patch changes, code cleanup.
> > v3: use ioremap instead of phys_to_virt for efi_setup
> > v5: improve some code structure per comments from Matt
> > Boris: improve code structure, spell fix, etc.
> > Improve changelog from Toshi.
> > change the variable efi_setup to the physical address of efi setup_data
> > instead of the ioremapped virt address
> > 
> > Signed-off-by: Dave Young 
> > ---
> >  arch/x86/include/asm/efi.h|  11 ++
> >  arch/x86/include/uapi/asm/bootparam.h |   1 +
> >  arch/x86/kernel/setup.c   |   3 +
> >  arch/x86/platform/efi/efi.c   | 195 
> > ++
> >  4 files changed, 187 insertions(+), 23 deletions(-)
> 
> ...
> 
> > @@ -115,6 +116,25 @@ static int __init setup_storage_paranoia(char *arg)
> >  }
> >  early_param("efi_no_storage_paranoia", setup_storage_paranoia);
> >  
> > +void __init parse_efi_setup(u64 phys_addr)
> > +{
> > +   struct setup_data *sd;
> > +
> > +   if (!efi_enabled(EFI_64BIT)) {
> > +   pr_warn("SETUP_EFI not supported on 32-bit\n");
> > +   return;
> > +   }
> 
> Shouldn't this function be in two versions in efi_64.c and efi_32.c?
> This way you don't need this check with cryptic printk message.

Ok, will update.

> 
> > +
> > +   sd = early_memremap(phys_addr, sizeof(struct setup_data));
> > +   if (!sd) {
> > +   pr_warn("efi: early_memremap setup_data failed\n");
> > +   return;
> > +   }
> > +   efi_setup = phys_addr + sizeof(struct setup_data);
> > +   nr_efi_runtime_map = (sd->len - sizeof(struct efi_setup_data)) /
> > +sizeof(efi_memory_desc_t);
> > +   early_memunmap(sd, sizeof(struct setup_data));
> > +}
> >  
> >  static efi_status_t virt_efi_get_time(efi_time_t *tm, efi_time_cap_t *tc)
> >  {
> > @@ -494,18 +514,28 @@ static int __init efi_systab_init(void *phys)
> >  {
> > if (efi_enabled(EFI_64BIT)) {
> > efi_system_table_64_t *systab64;
> > +   struct efi_setup_data *data = NULL;
> > u64 tmp = 0;
> >  
> > +   if (efi_setup) {
> > +   data = early_memremap(efi_setup, sizeof(*data));
> > +   if (!data)
> > +   return -ENOMEM;
> > +   }
> > systab64 = early_memremap((unsigned long)phys,
> >  sizeof(*systab64));
> > if (systab64 == NULL) {
> > pr_err("Couldn't map the system table!\n");
> > +   if (data)
> > +   early_memunmap(data, sizeof(*data));
> > return -ENOMEM;
> > }
> >  
> > efi_systab.hdr = systab64->hdr;
> > -   efi_systab.fw_vendor = systab64->fw_vendor;
> > -   tmp |= systab64->fw_vendor;
> > +
> > +   efi_systab.fw_vendor = data ? (unsigned long)data->fw_vendor :
> > + systab64->fw_vendor;
> > +   tmp |= efi_systab.fw_vendor;
> > efi_systab.fw_revision = systab64->fw_revision;
> > efi_systab.con_in_handle = systab64->con_in_handle;
> > tmp |= systab64->con_in_handle;
> > @@ -519,15 +549,20 @@ static int __init efi_systab_init(void *phys)
> > tmp |= systab64->stderr_handle;
> > efi_systab.stderr = systab64->stderr;
> > tmp |= systab64->stderr;
> > -   efi_systab.runtime = (void *)(unsigned long)systab64->runtime;
> > -   tmp |= systab64->runtime;
> > +   efi_systab.runtime = data ?
> > +(void *)(unsigned long)data->runtime :
> > +(void *)(unsigned long)systab64->runtime;
> > +   tmp |= (unsigned long)efi_systab.runtime;
>

linux-next: manual merge of the block tree with the f2fs tree

2013-12-11 Thread Stephen Rothwell

Hi Jens,

Today's linux-next merge of the block tree got a conflict in
fs/f2fs/data.c between commit 8758e549e105 ("f2fs: add unlikely() macro
for compiler more aggressively") from the f2fs tree and commit
2c30c71bd653 ("block: Convert various code to bio_for_each_segment()")
from the block tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).
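
For readers unfamiliar with the conversion behind this conflict, here is
a minimal userspace sketch of the two iteration styles. It is not the
f2fs or block-layer code: demo_bio, demo_vec and
demo_for_each_segment_all() are hypothetical miniatures, assuming only
that a bio carries an array of segments and a count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a bio: an array of segments plus a count. */
struct demo_vec {
	int page_id;
};

struct demo_bio {
	struct demo_vec *vecs;
	int vcnt;
};

/* Forward walk over every segment, in the spirit of bio_for_each_segment_all(). */
#define demo_for_each_segment_all(bvl, bio, i) \
	for ((i) = 0, (bvl) = (bio)->vecs; (i) < (bio)->vcnt; (i)++, (bvl)++)

/* Old style: open-coded walk from the last segment down to the first. */
static int demo_sum_backward(const struct demo_bio *bio)
{
	int idx = bio->vcnt - 1;
	int sum = 0;

	while (idx >= 0) {
		sum += bio->vecs[idx].page_id;
		idx--;
	}
	return sum;
}

/* New style: the iterator macro hides the indexing. */
static int demo_sum_forward(const struct demo_bio *bio)
{
	const struct demo_vec *bvec;
	int i, sum = 0;

	demo_for_each_segment_all(bvec, bio, i)
		sum += bvec->page_id;
	return sum;
}
```

Both walks visit every segment exactly once, which is why a completion
handler that only touches each page can be converted mechanically.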

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au

diff --cc fs/f2fs/data.c
index 15956fa584de,a2c8de8ba6ce..
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@@ -25,203 -25,6 +25,199 @@@
  #include 
  
  /*
 + * Low-level block read/write IO operations.
 + */
 +static struct bio *__bio_alloc(struct block_device *bdev, int npages)
 +{
 +  struct bio *bio;
 +
 +  /* No failure on bio allocation */
 +  bio = bio_alloc(GFP_NOIO, npages);
 +  bio->bi_bdev = bdev;
 +  bio->bi_private = NULL;
 +  return bio;
 +}
 +
 +static void f2fs_read_end_io(struct bio *bio, int err)
 +{
-   const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
-   struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
++  struct bio_vec *bvec;
++  int i;
 +
-   do {
++  bio_for_each_segment_all(bvec, bio, i) {
 +  struct page *page = bvec->bv_page;
 +
-   if (--bvec >= bio->bi_io_vec)
-   prefetchw(&bvec->bv_page->flags);
- 
-   if (unlikely(!uptodate)) {
++  if (unlikely(err)) {
 +  ClearPageUptodate(page);
 +  SetPageError(page);
 +  } else {
 +  SetPageUptodate(page);
 +  }
 +  unlock_page(page);
-   } while (bvec >= bio->bi_io_vec);
++  }
 +
 +  bio_put(bio);
 +}
 +
 +static void f2fs_write_end_io(struct bio *bio, int err)
 +{
-   const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
-   struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
-   struct f2fs_sb_info *sbi = F2FS_SB(bvec->bv_page->mapping->host->i_sb);
++  struct f2fs_sb_info *sbi = NULL;
++  struct bio_vec *bvec;
++  int i;
 +
-   do {
++  bio_for_each_segment_all(bvec, bio, i) {
 +  struct page *page = bvec->bv_page;
 +
-   if (--bvec >= bio->bi_io_vec)
-   prefetchw(&bvec->bv_page->flags);
- 
-   if (unlikely(!uptodate)) {
++  if (!sbi)
++  sbi = F2FS_SB(bvec->bv_page->mapping->host->i_sb);
++  if (unlikely(err)) {
 +  SetPageError(page);
 +  set_bit(AS_EIO, &page->mapping->flags);
 +  set_ckpt_flags(sbi->ckpt, CP_ERROR_FLAG);
 +  sbi->sb->s_flags |= MS_RDONLY;
 +  }
 +  end_page_writeback(page);
 +  dec_page_count(sbi, F2FS_WRITEBACK);
-   } while (bvec >= bio->bi_io_vec);
++  }
 +
 +  if (bio->bi_private)
 +  complete(bio->bi_private);
 +
 +  if (!get_pages(sbi, F2FS_WRITEBACK) &&
 +  !list_empty(&sbi->cp_wait.task_list))
 +  wake_up(&sbi->cp_wait);
 +
 +  bio_put(bio);
 +}
 +
 +static void __submit_merged_bio(struct f2fs_bio_info *io)
 +{
 +  struct f2fs_io_info *fio = &io->fio;
 +  int rw;
 +
 +  if (!io->bio)
 +  return;
 +
 +  rw = fio->rw | fio->rw_flag;
 +
 +  if (is_read_io(rw)) {
 +  trace_f2fs_submit_read_bio(io->sbi->sb, rw, fio->type, io->bio);
 +  submit_bio(rw, io->bio);
 +  io->bio = NULL;
 +  return;
 +  }
 +  trace_f2fs_submit_write_bio(io->sbi->sb, rw, fio->type, io->bio);
 +
 +  /*
 +   * META_FLUSH is only from the checkpoint procedure, and we should wait
 +   * this metadata bio for FS consistency.
 +   */
 +  if (fio->type == META_FLUSH) {
 +  DECLARE_COMPLETION_ONSTACK(wait);
 +  io->bio->bi_private = &wait;
 +  submit_bio(rw, io->bio);
 +  wait_for_completion(&wait);
 +  } else {
 +  submit_bio(rw, io->bio);
 +  }
 +  io->bio = NULL;
 +}
 +
 +void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi,
 +  enum page_type type, int rw)
 +{
 +  enum page_type btype = PAGE_TYPE_OF_BIO(type);
 +  struct f2fs_bio_info *io;
 +
 +  io = is_read_io(rw) ? &sbi->read_io : &sbi->write_io[btype];
 +
 +  mutex_lock(&io->io_mutex);
 +
 +  /* change META to META_FLUSH in the checkpoint procedure */
 +  if (type >= META_FLUSH) {
 +  io->fio.type = META_FLUSH;
 +  io->fio.rw = WRITE_FLUSH_FUA;
 +  }
 +  __submit_merged_bio(io);
 +  mutex_unlock(&io->io_mutex);
 +}
 +
 +/*
 + * Fill the locked page with data located in the block address.
 + * Return unlocked page.
 + */
 +int f2fs_submit_page_bio(struct f2fs_sb_info *sbi, struct page *page,
 +  block_t blk_addr, int rw)
 +{
 +

Re: [RFC PATCH tip 0/5] tracing filters with BPF

2013-12-11 Thread Alexei Starovoitov

On Tue, Dec 10, 2013 at 7:35 PM, Masami Hiramatsu
 wrote:
> (2013/12/11 11:32), Alexei Starovoitov wrote:
>> On Tue, Dec 10, 2013 at 7:47 AM, Ingo Molnar  wrote:
>>>
>>> * Alexei Starovoitov  wrote:
>>>
> I'm fine if it becomes a requirement to have a vmlinux built with
> DEBUG_INFO to use BPF and have a tool like perf to translate the
> filters. But it that must not replace what the current filters do
> now. That is, it can be an add on, but not a replacement.

 Of course. tracing filters via bpf is an additional tool for kernel
 debugging. bpf by itself has use cases beyond tracing.
>>>
>>> Well, Steve has a point: forcing DEBUG_INFO is a big showstopper for
>>> most people.
>>
>> there is a misunderstanding here.
>> I was saying 'of course' to 'not replace current filter infra'.
>>
>> bpf does not depend on debug info.
>> That's the key difference between 'perf probe' approach and bpf filters.
>>
>> Masami is right that what I was trying to achieve with bpf filters
>> is similar to 'perf probe': insert a dynamic probe anywhere
>> in the kernel, walk pointers, data structures, print interesting stuff.
>>
>> 'perf probe' does it via scanning vmlinux with debug info.
>> bpf filters don't need it.
>> tools/bpf/trace/*_orig.c examples only depend on linux headers
>> in /lib/modules/../build/include/
>> Today bpf compiler struct layout is the same as x86_64.
>>
>> Tomorrow bpf compiler will have flags to adjust endianness, pointer size, etc
>> of the front-end. Similar to -m32/-m64 and -m*-endian flags.
>> Neat part is that I don't need to do any work, just enable it properly in
>> the bpf backend. From gcc/llvm point of view, bpf is yet another 'hw'
>> architecture that compiler is emitting code for.
>> So when C code of filter_ex1_orig.c does 'skb->dev', compiler determines
>> field offset by looking at /lib/modules/.../include/skbuff.h
>> whereas for 'perf probe' 'skb->dev' means walk debug info.
>
> Right, the offset of the data structure can get from the header etc.
>
> However, how would the bpf get the register or stack assignment of
> skb itself? In the tracepoint macro, it will be able to get it from
> function parameters (it needs a trick, like jprobe does).
> I doubt you can do that on kprobes/uprobes without any debuginfo
> support. :(

the 4/5 diff actually shows how it's working ;)
for kprobes it works at the function entry, since arguments are still
in the registers, and walks the pointers further down.
It cannot do func+line_number as perf-probe does, of course.
for tracepoints it's the same trick: call no-inline func with traceprobe args
and call inlined crash_setup_regs() that stores the regs.

Of course, there are limitations. Like 7th func argument goes into
stack and requires more work to get out. If struct is not defined in .h,
it would need to be redefined in filter.c
Corner cases as you said.
Today user of bpf filter needs to know that arg1 goes into %rdi and so on.
that is easy to clean up.
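
To make the register remark concrete, a small sketch of where integer
and pointer arguments live at function entry under the x86_64 System V
calling convention. demo_arg_location() is a hypothetical helper for
illustration, not something from the patch set.

```c
#include <assert.h>
#include <string.h>

/*
 * Location of the n-th (1-based) integer or pointer argument at function
 * entry under the x86_64 System V calling convention. Hypothetical helper,
 * not part of the patch set.
 */
static const char *demo_arg_location(int n)
{
	static const char *const regs[] = {
		"%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9",
	};

	if (n < 1)
		return "invalid";
	if (n <= 6)
		return regs[n - 1];
	return "stack";	/* the 7th and later arguments are passed on the stack */
}
```

This is why a probe at function entry can read the first six arguments
straight from the saved registers, while the 7th needs a stack fetch.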

>> Another use case is to optimize fetch sequences of dynamic probes
>> as Masami suggested, but backward compatibility requirement
>> would preserve two ways of doing it as well.
>
> The backward compatibility issue is only for the interface, but not
> for the implementation, I think. :) The fetch method and filter
> pred do already parse the argument into a syntax tree. IMHO, bpf
> can optimize that tree to just a simple opcode stream.

ahh. yes. that's doable.

Thanks
Alexei

[PATCH] perf symbols: symbol-minimal.c causes random fd to be closed

2013-12-11 Thread Anton Blanchard


I hit a cryptic failure when testing a recent version
of perf:

  # perf report
  write failure on standard output: Bad file descriptor

The issue is in commit b68e2f91 (perf symbols: Introduce symsrc
structure). symsrc__destroy() does a close(ss->fd) but
ss->fd is only initialised in the symbol-elf.c case and
not for symbol-minimal.c.

The issue has been around for a while; however, most people
will build with libelf, which won't use the symbol-minimal.c
code.

Cc: sta...@vger.kernel.org # v3.8+
Signed-off-by: Anton Blanchard 
---

diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c
index 2d2dd05..3528204 100644
--- a/tools/perf/util/symbol-minimal.c
+++ b/tools/perf/util/symbol-minimal.c
@@ -254,6 +254,7 @@ int symsrc__init(struct symsrc *ss, struct dso *dso __maybe_unused,
goto out_close;
 
ss->type = type;
+   ss->fd = fd;
 
return 0;
 out_close:
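
A minimal userspace sketch of the bug class fixed above, assuming only
that an init routine opens a file and a destroy routine later closes the
descriptor stored in the structure. demo_symsrc and its two helpers are
hypothetical; they are not the perf implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for a symbol source that owns a descriptor. */
struct demo_symsrc {
	int fd;
	int type;
};

/* Open path and record both the type and the descriptor. */
static int demo_symsrc_init(struct demo_symsrc *ss, const char *path, int type)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	ss->type = type;
	ss->fd = fd;	/* without this store, destroy closes a stale value */
	return 0;
}

/* Close the descriptor recorded by init. */
static void demo_symsrc_destroy(struct demo_symsrc *ss)
{
	close(ss->fd);
	ss->fd = -1;
}
```

If the store to ss->fd is missing, destroy closes whatever number the
field happens to hold, which can be standard output, matching the
"Bad file descriptor" failure in the report.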

Re: [PATCH v5 08/14] efi: export efi runtime memory mapping to sysfs

2013-12-11 Thread Dave Young

> > +++ b/Documentation/ABI/testing/sysfs-firmware-efi-runtime-map
> > @@ -0,0 +1,36 @@
> > +What:  /sys/firmware/efi/runtime-map/
> > +Date:  December 2013
> > +Contact:   Dave Young 
> > +Description:
> 
> This could start at the same line as Description

Ok.

[snip]
> > +
> > +   Above values are all hexadecimal numbers with the '0x' prefix.
> > +
> 
> Superfluous newline.

Will remove

> 
> > +Users: Kexec
> > diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> > index 3e8b760..8289e0c 100644
> > --- a/arch/x86/platform/efi/efi.c
> > +++ b/arch/x86/platform/efi/efi.c
> > @@ -76,6 +76,9 @@ static __initdata efi_config_table_type_t arch_tables[] = {
> > {NULL_GUID, NULL, NULL},
> >  };
> >  
> > +void *efi_runtime_map;
> > +int nr_efi_runtime_map;
> > +
> >  /*
> >   * Returns 1 if 'facility' is enabled, 0 otherwise.
> >   */
> > @@ -810,6 +813,19 @@ static void __init efi_merge_regions(void)
> > }
> >  }
> >  
> > +static int __init save_runtime_map(efi_memory_desc_t *md, int idx)
> > +{
> > +   void *p;
> > +   p = krealloc(efi_runtime_map, (idx + 1) * memmap.desc_size, GFP_KERNEL);
> > +   if (!p)
> > +   return -ENOMEM;
> > +
> > +   efi_runtime_map = p;
> > +   memcpy(efi_runtime_map + idx * memmap.desc_size, md, memmap.desc_size);
> > +
> > +   return 0;
> > +}
> > +
> >  /*
> >   * Map efi memory ranges for runtime serivce and update new_memmap with 
> > virtual
> >   * addresses.
> > @@ -820,6 +836,7 @@ static void * __init efi_map_regions(int *count)
> > void *p, *tmp, *new_memmap = NULL;
> > unsigned long size;
> > u64 end, systab;
> > +   int err = 0;
> >  
> > for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
> > md = p;
> > @@ -848,10 +865,21 @@ static void * __init efi_map_regions(int *count)
> > new_memmap = tmp;
> > memcpy(new_memmap + (*count * memmap.desc_size), md,
> >memmap.desc_size);
> > +   if (md->type != EFI_BOOT_SERVICES_CODE &&
> > +   md->type != EFI_BOOT_SERVICES_DATA) {
> > +   err = save_runtime_map(md, nr_efi_runtime_map);
> > +   if (err)
> > +   goto out_save_runtime;
> > +   nr_efi_runtime_map++;
> > +   }
> 
> So why don't you move that code to save_runtime_map?
> 
> 
> It would look like this:
> 
> ...
> new_memmap = tmp;
> memcpy(new_memmap + (*count * memmap.desc_size), md,
>memmap.desc_size);
> 
> save_runtime_map(md);
> (*count)++;
> 
>  [nr_efi_runtime_map is global, no need to pass it to save_runtime_map() ]

nr_efi_runtime_map is handled in a different way for the 1st kernel and the
kexec kernel.

For the 1st kernel (booted from firmware) it is increased one by one in the
above function.

But for the kexec kernel it is calculated directly from the setup_data array
length. Increasing nr_efi_runtime_map in save_runtime_map() is not ok there,
mainly because I need the value first as the loop counter's upper bound,
like below:
static int __init map_regions_fixed(void)
{
[snip]
for (i = 0, md = data->map; i < nr_efi_runtime_map; i++, md++) {
[snip]
save_runtime_map(md, i);
[snip]
}
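
A userspace sketch of why the index is passed in rather than derived
from a global counter. It mirrors the grow-by-one pattern of
save_runtime_map() with realloc() standing in for krealloc();
demo_desc, demo_map and demo_save_map() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size descriptor standing in for efi_memory_desc_t. */
struct demo_desc {
	unsigned long long phys_addr;
	unsigned long long num_pages;
};

static struct demo_desc *demo_map;	/* grows as entries are saved */

/* Store a copy of *md at slot idx, resizing the array to idx + 1 entries. */
static int demo_save_map(const struct demo_desc *md, int idx)
{
	struct demo_desc *p;

	p = realloc(demo_map, (size_t)(idx + 1) * sizeof(*p));
	if (!p)
		return -1;	/* the old array is left untouched */

	demo_map = p;
	memcpy(&demo_map[idx], md, sizeof(*md));
	return 0;
}
```

The caller owns the counter: a first boot can increment it after each
successful save, while a kexec boot that already knows the total can
simply pass the loop index, as long as indices arrive in ascending order.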

> 
> and the EFI_BOOT* tests can be done in save_runtime_map and also the
> error handling can happen there. This way efi_map_regions() won't
> need to know about anything. This way, you can later move the whole
> save_runtime_map() function to efi-kexec.c just by taking it without any
> need for untangling.
> 
> > +out_save_runtime:
> > +   kfree(efi_runtime_map);
> > +   nr_efi_runtime_map = 0;
> > +   efi_runtime_map = NULL;
> 
> This can go there too.

This section can go into save_runtime_map(), but it looks clearer to keep it here.

> 
> >  out_krealloc:
> > kfree(new_memmap);
> > return NULL;
> > diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
> > index 3150aa4..3d8d6f6 100644
> > --- a/drivers/firmware/efi/Kconfig
> > +++ b/drivers/firmware/efi/Kconfig
> > @@ -39,4 +39,15 @@ config EFI_VARS_PSTORE_DEFAULT_DISABLE
> >  config UEFI_CPER
> > def_bool n
> >  
> > +config EFI_RUNTIME_MAP
> > +   bool "Export efi runtime maps to sysfs" if EXPERT
> 
> What's with the EXPERT? It depends on KEXEC already.

EXPERT can be removed safely, will do.

> 
> > +   depends on X86 && EFI && KEXEC
> > +   default y
> > +   help
> > + Export efi runtime memory maps to /sys/firmware/efi/runtime-map.
> > + That memory map is used for example by kexec to set up efi virtual
> > + mapping the 2nd kernel, but can also be used for debugging purposes.
> > +
> > + See also Documentation/ABI/testing/sysfs-firmware-efi-runtime-map.
> > +
> >  endmenu
> 
> ...
> 
> > +static int __init efi_runtime_map_init(void)
> > +{
> > +   int i, j, ret = 0;
> > +   struct

Re: [PATCH v5 07/14] efi: export more efi table variable to sysfs

2013-12-11 Thread Dave Young

On 12/11/13 at 07:32pm, Borislav Petkov wrote:
> On Mon, Dec 09, 2013 at 05:42:20PM +0800, Dave Young wrote:
> > Export fw_vendor, runtime and config table physical addresses to
> > /sys/firmware/efi/fw_vendor, /sys/firmware/efi/runtime and
> > /sys/firmware/efi/config_table because kexec kernel will need them.
> 
> you might wanna shorten:
> 
> ... sys/firmware/efi/{fw_vendor,runtime,config_table} ...

Ok, will do.

> 
> > 
> > From EFI spec these 3 variables will be updated to
> > virtual address after entering virtual mode. But
> > kernel startup code will need the physical address.
> > 
> > changelog:
> > Greg: add standalone sysfs files instead of add lines to systab
> > Document them as testing ABI
> > Greg: use group attrs and is_visible
> > Boris: align comments lines
> > Boris: add macros for _show functions
> > Matt: Documentation fixes.
> > 
> > Signed-off-by: Dave Young 
> > ---
> >  Documentation/ABI/testing/sysfs-firmware-efi | 24 +
> >  arch/x86/platform/efi/efi.c  |  4 +++
> >  drivers/firmware/efi/efi.c   | 39 
> > 
> >  include/linux/efi.h  |  3 +++
> >  4 files changed, 70 insertions(+)
> >  create mode 100644 Documentation/ABI/testing/sysfs-firmware-efi
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-firmware-efi b/Documentation/ABI/testing/sysfs-firmware-efi
> > new file mode 100644
> > index 000..8c6e460
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-firmware-efi
> > @@ -0,0 +1,24 @@
> > +What:  /sys/firmware/efi/fw_vendor
> > +Date:  December 2013
> > +Contact:   Dave Young 
> > +Description:
> > +   It shows the physical address of firmware vendor field in the
> 
> Why doesn't this start at the same line as "Description:"?

It can. In the 1st version I copied the format from some template; I later
found this style is better and updated the Users line, but missed the Description.

Will do in next version.

> 
> > +   EFI system table.
> > +
> 
> Superfluous newline.

Will remove it.

[snip]
> > +
> >  static struct attribute *efi_subsys_attrs[] = {
> > &efi_attr_systab.attr,
> > +   &efi_attr_fw_vendor.attr,
> > +   &efi_attr_runtime.attr,
> > +   &efi_attr_config_table.attr,
> > NULL,   /* maybe more in the future? */
>   ^
> 
> Now that there's more, you can drop that wise guy comment :)

Ok.

--
Thanks for review
Dave

[PATCH 1/1] tty: Fix hang at ldsem_down_read()

2013-12-11 Thread Peter Hurley

When a controlling tty is being hung up and the hang up is
waiting for a just-signalled tty reader or writer to exit, and a new tty
reader/writer tries to acquire an ldisc reference concurrently with the
ldisc reference release from the signalled reader/writer, the hangup
can hang. The new reader/writer is sleeping in ldsem_down_read() and the
hangup is sleeping in ldsem_down_write() [1].

The new reader/writer fails to wake up the waiting hangup because the
wrong lock count value is checked (the old lock count rather than the new
lock count) to see if the lock is unowned.

Change helper function to return the new lock count if the cmpxchg was
successful; document this behavior.

[1] edited dmesg log from reporter

SysRq : Show Blocked State
  taskPC stack   pid father
systemd D 88040c4f 0 1  0 0x
 88040c49fbe0 0046 88040c4a 88040c49ffd8
 001d3980 001d3980 88040c4a 88040593d840
 88040c49fb40 810a4cc0 0006 0023
Call Trace:
 [] ? sched_clock_cpu+0x9f/0xe4
 [] ? sched_clock_cpu+0x9f/0xe4
 [] ? sched_clock_cpu+0x9f/0xe4
 [] ? sched_clock_cpu+0x9f/0xe4
 [] schedule+0x24/0x5e
 [] schedule_timeout+0x15b/0x1ec
 [] ? sched_clock_cpu+0x9f/0xe4
 [] ? _raw_spin_unlock_irq+0x24/0x26
 [] down_read_failed+0xe3/0x1b9
 [] ldsem_down_read+0x8b/0xa5
 [] ? tty_ldisc_ref_wait+0x1b/0x44
 [] tty_ldisc_ref_wait+0x1b/0x44
 [] tty_write+0x7d/0x28a
 [] redirected_tty_write+0x8d/0x98
 [] ? tty_write+0x28a/0x28a
 [] do_loop_readv_writev+0x56/0x79
 [] do_readv_writev+0x1b0/0x1ff
 [] ? do_vfs_ioctl+0x32a/0x489
 [] ? final_putname+0x1d/0x3a
 [] vfs_writev+0x2e/0x49
 [] SyS_writev+0x47/0xaa
 [] system_call_fastpath+0x16/0x1b
bashD 81c104c0 0  5469   5302 0x0082
 8800cf817ac0 0046 8804086b22a0 8800cf817fd8
 001d3980 001d3980 8804086b22a0 8800cf817a48
 b9a0 8800cf817a78 81004675 8800cf817a44
Call Trace:
 [] ? dump_trace+0x165/0x29c
 [] ? sched_clock_cpu+0x9f/0xe4
 [] ? save_stack_trace+0x26/0x41
 [] schedule+0x24/0x5e
 [] schedule_timeout+0x15b/0x1ec
 [] ? sched_clock_cpu+0x9f/0xe4
 [] ? down_write_failed+0xa3/0x1c9
 [] ? _raw_spin_unlock_irq+0x24/0x26
 [] down_write_failed+0xab/0x1c9
 [] ldsem_down_write+0x79/0xb1
 [] ? tty_ldisc_lock_pair_timeout+0xa5/0xd9
 [] tty_ldisc_lock_pair_timeout+0xa5/0xd9
 [] tty_ldisc_hangup+0xc4/0x218
 [] __tty_hangup+0x2e2/0x3ed
 [] disassociate_ctty+0x63/0x226
 [] do_exit+0x79f/0xa11
 [] ? get_signal_to_deliver+0x206/0x62f
 [] ? lock_release_holdtime.part.8+0xf/0x16e
 [] do_group_exit+0x47/0xb5
 [] get_signal_to_deliver+0x241/0x62f
 [] do_signal+0x43/0x59d
 [] ? __audit_syscall_exit+0x21a/0x2a8
 [] ? lock_release_holdtime.part.8+0xf/0x16e
 [] do_notify_resume+0x54/0x6c
 [] int_signal+0x12/0x17

Reported-by: Sami Farin 
Cc:  # 3.12.x
Signed-off-by: Peter Hurley 
---
 drivers/tty/tty_ldsem.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
index 22fad8a..d8a55e8 100644
--- a/drivers/tty/tty_ldsem.c
+++ b/drivers/tty/tty_ldsem.c
@@ -86,11 +86,21 @@ static inline long ldsem_atomic_update(long delta, struct ld_semaphore *sem)
return atomic_long_add_return(delta, (atomic_long_t *)&sem->count);
 }
 
+/*
+ * ldsem_cmpxchg() updates @*old with the last-known sem->count value.
+ * Returns 1 if count was successfully changed; @*old will have @new value.
+ * Returns 0 if count was not changed; @*old will have most recent sem->count
+ */
 static inline int ldsem_cmpxchg(long *old, long new, struct ld_semaphore *sem)
 {
-   long tmp = *old;
-   *old = atomic_long_cmpxchg(&sem->count, *old, new);
-   return *old == tmp;
+   long tmp = atomic_long_cmpxchg(&sem->count, *old, new);
+   if (tmp == *old) {
+   *old = new;
+   return 1;
+   } else {
+   *old = tmp;
+   return 0;
+   }
 }
 
 /*
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 0/1] Fix hang report

2013-12-11 Thread Peter Hurley

Greg,

I know it's late in the -rc cycle but I'd like to get this fix into 3.13.
Although it's only likely to happen at shutdown/reboot, the hang frequency
could be as often as 1 in 1.

The patch fixes the count value returned when the cmpxchg() has
successfully changed the count. Only one code path checks the
returned count when the cmpxchg() is successful: down_read_failed().
After the failed down_read attempt is reversed, but before the reader waits
for the lock, the new count is checked to ensure _someone_ has the lock:

/* if there are no active locks, wake the new lock owner(s) */
if ((count & LDSEM_ACTIVE_MASK) == 0)
__ldsem_wake(sem);

Because ldsem_cmpxchg() was returning the _old_ value on success, this
was checking the wrong count value.
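
For readers who want to see the difference outside the kernel, here is a
minimal userspace sketch using C11 atomics. It is an illustration only, not
the kernel code: demo_sem, DEMO_ACTIVE_MASK and the demo_* helpers are
hypothetical stand-ins, and the mask value is an assumption chosen for the
example. It contrasts the pre-patch helper, which leaves the old count in
*old on success, with the patched helper, which leaves the new count there.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
#define DEMO_ACTIVE_MASK 0xffffL

struct demo_sem {
	atomic_long count;
};

/* Pre-patch behaviour: on success *old still holds the old count. */
static int demo_cmpxchg_old(long *old, long new, struct demo_sem *sem)
{
	long tmp = *old;

	/* On failure C11 writes the current value into *old, like cmpxchg. */
	atomic_compare_exchange_strong(&sem->count, old, new);
	return *old == tmp;
}

/* Patched behaviour: on success *old is updated to the new count. */
static int demo_cmpxchg_new(long *old, long new, struct demo_sem *sem)
{
	long seen = *old;

	if (atomic_compare_exchange_strong(&sem->count, &seen, new)) {
		*old = new;
		return 1;
	}
	*old = seen;
	return 0;
}

/* Mirrors the check quoted above: wake only if no active locks remain. */
static int demo_should_wake(long count)
{
	return (count & DEMO_ACTIVE_MASK) == 0;
}
```

With a count of 1 being dropped to 0, the pre-patch helper reports success
but leaves 1 in the caller's variable, so the wake check is skipped; the
patched helper leaves 0 there and the check fires.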

No other code paths are impacted by the patch. The equivalent diff below
also fixes the problem; however, I feel the intent is less clear.

| diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
| index 22fad8a..29d9e7c 100644
| --- a/drivers/tty/tty_ldsem.c
| +++ b/drivers/tty/tty_ldsem.c
| @@ -222,7 +222,7 @@ down_read_failed(struct ld_semaphore *sem, long count, long timeout)
|   get_task_struct(tsk);
|  
|   /* if there are no active locks, wake the new lock owner(s) */
| - if ((count & LDSEM_ACTIVE_MASK) == 0)
| + if ((count + adjust & LDSEM_ACTIVE_MASK) == 0)
|   __ldsem_wake(sem);
|  
|   raw_spin_unlock_irq(&sem->wait_lock);


Regards,

Peter Hurley (1):
  tty: Fix hang at ldsem_down_read()

 drivers/tty/tty_ldsem.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

-- 
1.8.1.2


[PATCH] f2fs: introduce sysfs entry to control in-place-update policy

2013-12-11 Thread Jaegeuk Kim

This patch introduces new sysfs entries for users to control the policy of
in-place-updates, namely IPU, in f2fs.

Sometimes f2fs suffers from performance degradation due to its out-of-place
update policy that produces many additional node block writes.
If the storage performance is very dependent on the amount of data written
rather than on IO patterns, we'd better drop this out-of-place update policy.

This patch suggests 5 policies and their triggering conditions as follows.

[sysfs entry name = ipu_policy]

0: F2FS_IPU_FORCE     all the time,
1: F2FS_IPU_SSR       if SSR mode is activated,
2: F2FS_IPU_UTIL      if FS utilization is over threshold,
3: F2FS_IPU_SSR_UTIL  if SSR mode is activated and FS utilization is over
                      threshold,
4: F2FS_IPU_DISABLE   disable IPU. (=default option)

[sysfs entry name = min_ipu_util]

This parameter controls the threshold to trigger in-place-updates.
The number indicates the percentage of filesystem utilization, and is used
by the F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.

For more details, see need_inplace_update() in segment.h.
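
As a rough illustration of the table above, the policy dispatch can be
sketched in plain C as follows. This is not the need_inplace_update() from
the patch: the demo_* names are hypothetical, the inputs are passed in
directly rather than read from the superblock info, and reading "over
threshold" as a strict comparison is an assumption.

```c
#include <stdbool.h>

/* Hypothetical mirror of the five ipu_policy values listed above. */
enum demo_ipu_policy {
	DEMO_IPU_FORCE,		/* 0: all the time */
	DEMO_IPU_SSR,		/* 1: if SSR mode is activated */
	DEMO_IPU_UTIL,		/* 2: if FS utilization is over threshold */
	DEMO_IPU_SSR_UTIL,	/* 3: both conditions together */
	DEMO_IPU_DISABLE,	/* 4: never (default) */
};

/*
 * Decide whether a write should go in place.
 * util and min_ipu_util are percentages of filesystem utilization.
 */
static bool demo_need_ipu(enum demo_ipu_policy policy, bool ssr_active,
			  unsigned int util, unsigned int min_ipu_util)
{
	switch (policy) {
	case DEMO_IPU_FORCE:
		return true;
	case DEMO_IPU_SSR:
		return ssr_active;
	case DEMO_IPU_UTIL:
		return util > min_ipu_util;	/* assumed strict comparison */
	case DEMO_IPU_SSR_UTIL:
		return ssr_active && util > min_ipu_util;
	case DEMO_IPU_DISABLE:
	default:
		return false;
	}
}
```

For example, with min_ipu_util at 70, the UTIL policy would start updating
in place once utilization passes 70 percent, while SSR_UTIL would also
require SSR mode to be active.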

Signed-off-by: Jaegeuk Kim 
---
 Documentation/filesystems/f2fs.txt | 11 ++
 fs/f2fs/f2fs.h |  3 +++
 fs/f2fs/segment.c  |  2 ++
 fs/f2fs/segment.h  | 44 --
 fs/f2fs/super.c|  4 
 5 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt
index a3fe811..4f9b146 100644
--- a/Documentation/filesystems/f2fs.txt
+++ b/Documentation/filesystems/f2fs.txt
@@ -171,6 +171,17 @@ Files in /sys/fs/f2fs/
  conduct checkpoint to reclaim the prefree segments
  to free segments. By default, 100 segments, 200MB.
 
+ ipu_policy   This parameter controls the policy of in-place
+  updates in f2fs. There are five policies:
+   0: F2FS_IPU_FORCE, 1: F2FS_IPU_SSR,
+   2: F2FS_IPU_UTIL,  3: F2FS_IPU_SSR_UTIL,
+   4: F2FS_IPU_DISABLE.
+
+ min_ipu_util This parameter controls the threshold to trigger
+  in-place-updates. The number indicates percentage
+  of the filesystem utilization, and used by
+  F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.
+
 

 USAGE
 

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 022ce32..1b05a62 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -325,6 +325,9 @@ struct f2fs_sm_info {
struct list_head discard_list;  /* 4KB discard list */
int nr_discards;/* # of discards in the list */
int max_discards;   /* max. discards to be issued */
+
+   unsigned int ipu_policy;/* in-place-update policy */
+   unsigned int min_ipu_util;  /* in-place-update threshold */
 };
 
 /*
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 0b2e8ce..5b890ce 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1799,6 +1799,8 @@ int build_segment_manager(struct f2fs_sb_info *sbi)
sm_info->main_segments = le32_to_cpu(raw_super->segment_count_main);
sm_info->ssa_blkaddr = le32_to_cpu(raw_super->ssa_blkaddr);
sm_info->rec_prefree_segments = DEF_RECLAIM_PREFREE_SEGMENTS;
+   sm_info->ipu_policy = F2FS_IPU_DISABLE;
+   sm_info->min_ipu_util = DEF_MIN_IPU_UTIL;
 
INIT_LIST_HEAD(&sm_info->discard_list);
sm_info->nr_discards = 0;
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index ea56376..e9a10bd 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -476,19 +476,51 @@ static inline int utilization(struct f2fs_sb_info *sbi)
 
 /*
  * Sometimes f2fs may be better to drop out-of-place update policy.
- * So, if fs utilization is over MIN_IPU_UTIL, then f2fs tries to write
- * data in the original place likewise other traditional file systems.
- * But, currently set 100 in percentage, which means it is disabled.
- * See below need_inplace_update().
+ * And, users can control the policy through sysfs entries.
+ * There are five policies with triggering conditions as follows.
+ * F2FS_IPU_FORCE - all the time,
+ * F2FS_IPU_SSR - if SSR mode is activated,
+ * F2FS_IPU_UTIL - if FS utilization is over threshold,
+ * F2FS_IPU_SSR_UTIL - if SSR mode is activated and FS utilization is over
+ * threshold,
+ * F2FS_IPU_DISABLE - disable IPU. (=default option)
  */
-#define MIN_IPU_UTIL   100
+#define DEF_MIN_IPU_UTIL   70
+
+enum {
+   F2FS_IPU_FORCE,
+   F2FS_IPU_SSR,
+   F2FS_IPU_UTIL,
+   F2FS_IPU_SSR_UTIL,
+

Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data

2013-12-11 Thread Dave Young

On 12/11/13 at 12:13pm, Matt Fleming wrote:
> On Mon, 09 Dec, at 05:42:22PM, Dave Young wrote:
> > Add a new setup_data type SETUP_EFI for kexec use.
> > Passing the saved fw_vendor, runtime, config tables and efi runtime 
> > mappings.
> > 
> > When entering virtual mode, directly mapping the efi runtime regions which
> > we passed in previously. And skip the step to call SetVirtualAddressMap.
> > 
> > Specially for HP z420 workstation we need save the smbios physical address.
> > The kernel boot sequence proceeds in the following order.  Step 2
> > requires efi.smbios to be the physical address.  However, I found that on
> > HP z420 EFI system table has a virtual address of SMBIOS in step 1.  Hence,
> > we need set it back to the physical address with the smbios in
> > efi_setup_data.  (When it is still the physical address, it simply sets
> > the same value.)
> > 
> > 1. efi_init() - Set efi.smbios from EFI system table
> > 2. dmi_scan_machine() - Temporary map efi.smbios to access SMBIOS table
> > 3. efi_enter_virtual_mode() - Map EFI ranges
> > 
> > Tested on ovmf+qemu, lenovo thinkpad, a dell laptop and an
> > HP z420 workstation.
> > 
> > v2: refresh based on previous patch changes, code cleanup.
> > v3: use ioremap instead of phys_to_virt for efi_setup
> > v5: improve some code structure per comments from Matt
> > Boris: improve code structure, spell fix, etc.
> > Improve changelog from Toshi.
> > change the variable efi_setup to the physical address of efi setup_data
> > instead of the ioremapped virt address
> > 
> > Signed-off-by: Dave Young 
> > ---
> >  arch/x86/include/asm/efi.h|  11 ++
> >  arch/x86/include/uapi/asm/bootparam.h |   1 +
> >  arch/x86/kernel/setup.c   |   3 +
> >  arch/x86/platform/efi/efi.c   | 195 ++
> >  4 files changed, 187 insertions(+), 23 deletions(-)
> 
> [...]
> 
> > @@ -115,6 +116,25 @@ static int __init setup_storage_paranoia(char *arg)
> >  }
> >  early_param("efi_no_storage_paranoia", setup_storage_paranoia);
> >  
> > +void __init parse_efi_setup(u64 phys_addr)
> > +{
> > +   struct setup_data *sd;
> > +
> > +   if (!efi_enabled(EFI_64BIT)) {
> > +   pr_warn("SETUP_EFI not supported on 32-bit\n");
> > +   return;
> > +   }
> > +
> > +   sd = early_memremap(phys_addr, sizeof(struct setup_data));
> > +   if (!sd) {
> > +   pr_warn("efi: early_memremap setup_data failed\n");
> 
> You shouldn't need the "efi:" prefix in the message.

Hmm, removing the "efi:" prefix looks better; will update.

> 
> > @@ -676,6 +766,8 @@ void __init efi_init(void)
> > efi.systab->hdr.revision >> 16,
> > efi.systab->hdr.revision & 0x, vendor);
> >  
> > +   efi_reuse_config(efi.systab->tables, efi.systab->nr_tables);
> > +
> 
> Please check the return value.

I missed this one, will update.

> 
> > if (efi_config_init(arch_tables))
> > return;
> >  
> > @@ -886,6 +978,50 @@ out_krealloc:
> >  }
> >  
> >  /*
> > + * Map efi regions which was passed via setup_data. The virt_addr is a fixed
> > + * addr which was used in first kernel in case kexec boot.
> > + */
> > +static int __init map_regions_fixed(void)
> > +{
> > +   int i, s, ret = 0;
> > +   u64 end, systab;
> > +   unsigned long size;
> > +   efi_memory_desc_t *md;
> > +   struct efi_setup_data *data;
> > +
> > +   s = sizeof(*data) + nr_efi_runtime_map * sizeof(data->map[0]);
> > +   data = early_memremap(efi_setup, s);
> > +   if (!data) {
> > +   ret = -ENOMEM;
> > +   goto out;
> > +   }
> > +   for (i = 0, md = data->map; i < nr_efi_runtime_map; i++, md++) {
> > +   efi_map_region_fixed(md); /* FIXME: add error handling */
> 
> Oops. Please fix this ;-)

I have discussed this with Boris; he will take care of it after he adds
error handling to his __map_region function.

--
Thanks for review
Dave

Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data

2013-12-11 Thread Dave Young

On 12/11/13 at 03:05pm, Borislav Petkov wrote:
> On Wed, Dec 11, 2013 at 12:13:52PM +, Matt Fleming wrote:
> > > + for (i = 0, md = data->map; i < nr_efi_runtime_map; i++, md++) {
> > > + efi_map_region_fixed(md); /* FIXME: add error handling */
> > 
> > Oops. Please fix this ;-)
> 
> Yeah, this is on my TODO as it wraps around __map_region, the latter
> needing to propagate error codes.
> 
> I'll take care of it once you merge Dave's patchset.

Thanks Boris..

Re: [RFC PATCH] drivers: char: Add a dynamic clock for the trace clock

2013-12-11 Thread Steven Rostedt

On Wed, 11 Dec 2013 18:06:06 -0800
Sonny Rao  wrote:


> > ftrace has several clocks that it uses:
> >
> > o local - basically sched_clock()
> > o global - something like hpet that is monotonic across CPUs but slower
> > o counter - a simple atomic counter (no time associated to it)
> > o uptime - jiffy counter
> > o perf  - trace_clock, which is what perf uses
> > o x86_tsc - the raw tsc counter.
> >
> > # cat /sys/kernel/debug/trace_clock
> > [local] global counter uptime perf x86-tsc
> >
> 
> Ah, OK, sorry for the incorrect info there; thanks for clarifying.
> So, if I wanted to make sure everything is synced up between ftrace
> events and perf events, I should select perf here instead of local.

Correct, that's why I created that clock.

-- Steve

Re: [PATCH 1/3] wait-simple: Introduce the simple waitqueue implementation

2013-12-11 Thread Steven Rostedt


> --- /dev/null
> +++ b/kernel/swait.c
> @@ -0,0 +1,118 @@
> +/*
> + * Simple waitqueues without fancy flags and callbacks

We should probably have a more detailed description of when to use
simple wait queues versus normal wait queues. These are obviously much
lighter weight, and should be the preferred method unless you need a
feature of the more heavy weight wait queues.

-- Steve

"weight wait" Ha! Don't get to use that very often ;-)


> + *
> + * (C) 2011 Thomas Gleixner 
> + *
> + * Based on kernel/wait.c
> + *
> + * For licencing details see kernel-base/COPYING
> + */
> +#include 
> +#include 
> +#include 
> +#include 

Re: [PATCH v5 01/14] x86/mm: sparse warning fix for early_memremap

2013-12-11 Thread Dave Young

On 12/11/13 at 12:12pm, Borislav Petkov wrote:
> On Wed, Dec 11, 2013 at 10:20:25AM +, Matt Fleming wrote:
> > This needs reviewing by at least one of the x86 folks, but it
> > certainly makes sense to me.
> 
> Ingo told me yesterday, it makes sense too. I'd guess we can try it.
> FWIW, all callers of early_memremap use the memory they get remapped as
> normal memory so we should be safe.
> 
> Maybe this whole discussion should be noted down in the commit message
> so that people know.

Thanks for the info. will add.

Re: [RFC PATCH] drivers: char: Add a dynamic clock for the trace clock

2013-12-11 Thread Sonny Rao

On Wed, Dec 11, 2013 at 5:49 PM, Steven Rostedt  wrote:
> On Wed, 11 Dec 2013 17:17:30 -0800
> Sonny Rao  wrote:
>
>> On Wed, Dec 11, 2013 at 11:30 AM, Stephane Eranian wrote:
>> > Sonny,
>> >
>> > Your patch has a couple of problems for me:
>> >  - requires CONFIG_TRACING
>> >
>> > You should directly invoke getrawmonotonic()
>> >  and inline the code from trace_clock_getres().
>> >
>> > That's how I managed to compile your kernel module on my system.
>>
>> You need the changes in kernel/trace/trace.c which is why it's
>> dependent on CONFIG_TRACING.
>> If we put those functions elsewhere we could remove that dependency,
>> but it sounds like people want to just fix the clock that perf uses so
>> that it's exportable and not handle this with something like this
>> patch, which is better.
>
> I have no issue moving the trace_clock.c code into lib/ and we can add
> a CONFIG_TRACE_CLOCK option that can be set by perf and ftrace.
>

That sounds like a good idea to me, regardless of what we end up doing.

>>
>> Also, we should ensure that perf and ftrace are guaranteed to use the
>> same clock, I think it just happens to be the same right now.
>
> ftrace has several clocks that it uses:
>
> o local - basically sched_clock()
> o global - something like hpet that is monotonic across CPUs but slower
> o counter - a simple atomic counter (no time associated to it)
> o uptime - jiffy counter
> o perf  - trace_clock, which is what perf uses
> o x86_tsc - the raw tsc counter.
>
> # cat /sys/kernel/debug/trace_clock
> [local] global counter uptime perf x86-tsc
>

Ah, OK, sorry for the incorrect info there; thanks for clarifying.
So, if I wanted to make sure everything is synced up between ftrace
events and perf events, I should select perf here instead of local.

> -- Steve

Re: [PATCH v5 02/14] efi: use early_memremap and early_memunmap

2013-12-11 Thread Dave Young

On 12/11/13 at 10:39am, Matt Fleming wrote:
> (Cc'ing Leif and Mark for the ARM-side of things)
> 
> On Mon, 09 Dec, at 05:42:15PM, Dave Young wrote:
> > In arch/x86/platform/efi/efi.c and drivers/firmware/efi/efi.c turn to use
> > early_memremap/early_memunmap instead of early_ioremap/early_iounmap so 
> > sparse
> > will be happy.
> > 
> > Signed-off-by: Dave Young 
> > ---
> >  arch/x86/platform/efi/efi.c | 20 ++--
> >  drivers/firmware/efi/efi.c  |  4 ++--
> >  2 files changed, 12 insertions(+), 12 deletions(-)
>  
> This looks like a rather nice cleanup but the commit log could use a
> little bit of tweaking...
> 
>   - Please start your commit title (the part after the subsystem tag)
> with a capital letter, e.g.
> 
>   efi: Use early_memremap...
> 
>   - You need to explain in the commit title that you're fixing a sparse
> warning. Anyone reading the patch subject will have no idea _why_
> you're using early_memremap() and early_memunmap().
> 
>   - In the commit message body explain why sparse is currently unhappy.

Sure, will do.

--
Thanks for review
Dave

Re: [PATCH 1/3] wait-simple: Introduce the simple waitqueue implementation

2013-12-11 Thread Steven Rostedt

On Wed, 11 Dec 2013 20:06:37 -0500
Paul Gortmaker  wrote:

> From: Thomas Gleixner 
> 
> The wait_queue is a swiss army knife and in most of the cases the
> full complexity is not needed.  Here we provide a slim version, as
> it lowers memory consumption and runtime overhead.
> 
> The concept originated from RT, where waitqueues are a constant
> source of trouble, as we can't convert the head lock to a raw
> spinlock due to fancy and long lasting callbacks.
> 
> The smp_mb() was added (by Steven Rostedt) to fix a race condition
> with swait wakeups vs. adding items to the list.

For this part, you can also add my:

Signed-off-by: Steven Rostedt 

I'll also look at these and test them a bit against mainline.

Thanks for doing this!

-- Steve

> 
> Signed-off-by: Thomas Gleixner 
> Signed-off-by: Sebastian Andrzej Siewior 
> Cc: Steven Rostedt 
> [PG: carry forward from multiple v3.10-rt patches to mainline, align
>  function names with "normal" wait queue names, update commit log.]
> Signed-off-by: Paul Gortmaker 
> ---


Re: [PATCH] Staging: TIDSPBRIDGE: Use vm_iomap_memory for mmap-ing instead of remap_pfn_range

2013-12-11 Thread Greg KH

On Wed, Dec 11, 2013 at 01:27:17PM +0300, Dan Carpenter wrote:
> On Wed, Dec 11, 2013 at 11:57:04AM +0200, Ivaylo Dimitrov wrote:
> > On 11.12.2013 10:33, Dan Carpenter wrote:
> > >On Wed, Dec 11, 2013 at 09:45:52AM +0200, Ivajlo Dimitrov wrote:
> > >>I can pick your changes and re-send the original patch with them
> > >>incorporated if there are no objections. Are you fine with that?
> > >>
> > >Do it on top of staging-next, don't redo the original.
> > >
> > >regards,
> > >dan carpenter
> > 
> > I don't see the original patch in the staging-next tree [0], how to
> > proceed? Isn't it better to resend the original patch with Steven's
> > changes included?
> > 
> > [0] 
> > http://git.kernel.org/cgit/linux/kernel/git/gregkh/staging.git/log/drivers/staging/tidspbridge?h=staging-next
> > 
> 
> Oops.  It's in staging-linus not staging-next.  I don't know how Greg
> handles that tree.

The same way I do my others:
*-next : for the "next" kernel merge window
*-linus : for Linus's tree now before the -final release comes out.

The original patch here went to Linus, so it was in staging-linus and
it's already in Linus's tree.

thanks,

greg k-h

Re: [RFC PATCH] drivers: char: Add a dynamic clock for the trace clock

2013-12-11 Thread Steven Rostedt

On Wed, 11 Dec 2013 17:17:30 -0800
Sonny Rao  wrote:

> On Wed, Dec 11, 2013 at 11:30 AM, Stephane Eranian  wrote:
> > Sonny,
> >
> > Your patch has a couple of problems for me:
> >  - requires CONFIG_TRACING
> >
> > You should directly invoke getrawmonotonic()
> >  and inline the code from trace_clock_getres().
> >
> > That's how I managed to compile your kernel module on my system.
> 
> You need the changes in kernel/trace/trace.c which is why it's
> dependent on CONFIG_TRACING.
> If we put those functions elsewhere we could remove that dependency,
> but it sounds like people want to just fix the clock that perf uses so
> that it's exportable and not handle this with something like this
> patch, which is better.

I have no issue moving the trace_clock.c code into lib/ and we can add
a CONFIG_TRACE_CLOCK option that can be set by perf and ftrace.

> 
> Also, we should ensure that perf and ftrace are guaranteed to use the
> same clock, I think it just happens to be the same right now.

ftrace has several clocks that it uses:

o local - basically sched_clock()
o global - something like hpet that is monotonic across CPUs but slower
o counter - a simple atomic counter (no time associated to it)
o uptime - jiffy counter
o perf  - trace_clock, which is what perf uses
o x86_tsc - the raw tsc counter.

# cat /sys/kernel/debug/trace_clock
[local] global counter uptime perf x86-tsc
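
The bracketed entry in that listing is the clock currently selected. As a
small sketch of how a tool might pick it out, the helper below parses one
such line; demo_current_trace_clock is a hypothetical name, and the helper
assumes the whole listing is already in memory as a single string.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed entry of a trace_clock listing into out.
 * Returns 0 on success, -1 if there is no bracketed entry or it
 * does not fit into out (including the terminating NUL).
 */
static int demo_current_trace_clock(const char *line, char *out,
				    size_t outlen)
{
	const char *lb = strchr(line, '[');
	const char *rb = lb ? strchr(lb, ']') : NULL;
	size_t len;

	if (!rb)
		return -1;

	len = (size_t)(rb - lb - 1);
	if (len + 1 > outlen)
		return -1;

	memcpy(out, lb + 1, len);
	out[len] = '\0';
	return 0;
}
```

Fed the line shown above, the helper would yield "local"; after the perf
clock has been selected, the same listing would be expected to show perf
in brackets instead.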

-- Steve
