[PATCHv7 01/12] of: introduce of_property_for_each_phandle_with_args()
Iterating over a property containing a list of phandles with arguments
is a common operation for device drivers. This patch adds a new
of_property_for_each_phandle_with_args() macro to make the iteration
simpler.

Signed-off-by: Hiroshi Doyu
Cc: Rob Herring
Cc: Grant Likely
---
v7:
Fixed some minor issues pointed out by Rob and Stephen.

v6:
Iterate without introducing a new struct.

v6+++:
Introduced a new struct "of_phandle_iter" to keep the state when
iterating over the list.

v6++:
Optimized to avoid O(n^2), suggested by Stephen Warren.
http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007066.html
I didn't introduce any struct to hold params and state here.

v6+:
Use the description, which Grant Likely proposed, to be full enough that
a future reader can figure out why a patch was written.

v5:
New patch for v5.

Signed-off-by: Hiroshi Doyu
---
 drivers/of/base.c  | 46 ++
 include/linux/of.h | 32
 2 files changed, 78 insertions(+)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index f807d0e..cd4ab05 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1201,6 +1201,52 @@ void of_print_phandle_args(const char *msg, const struct of_phandle_args *args)
 	printk("\n");
 }
 
+const __be32 *of_phandle_iter_next(const char *cells_name, int cell_count,
+				   const __be32 *cur, const __be32 *end,
+				   struct of_phandle_args *out_args)
+{
+	struct device_node *dn;
+	int i;
+
+	if (!cells_name && !cell_count)
+		return NULL;
+
+	if (!cur || (cur >= end))
+		return NULL;
+
+	dn = of_find_node_by_phandle(be32_to_cpup(cur++));
+	if (!dn)
+		return NULL;
+
+	if (cells_name)
+		if (of_property_read_u32(dn, cells_name, &cell_count))
+			return NULL;
+
+	out_args->np = dn;
+	out_args->args_count = cell_count;
+	for (i = 0; i < cell_count; i++)
+		out_args->args[i] = be32_to_cpup(cur++);
+
+	return cur;
+}
+EXPORT_SYMBOL_GPL(of_phandle_iter_next);
+
+const __be32 *of_phandle_iter_init(const struct device_node *np,
+				   const char *list_name,
+				   const __be32 **end)
+{
+	size_t bytes;
+	const __be32 *cur;
+
+	cur = of_get_property(np, list_name, &bytes);
+	*end = cur;
+	if (bytes)
+		*end += bytes / sizeof(*cur);
+
+	return cur;
+}
+EXPORT_SYMBOL_GPL(of_phandle_iter_init);
+
 static int __of_parse_phandle_with_args(const struct device_node *np,
 					const char *list_name,
 					const char *cells_name,
diff --git a/include/linux/of.h b/include/linux/of.h
index 276c546..4345582 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -303,6 +303,14 @@ extern int of_parse_phandle_with_fixed_args(const struct device_node *np,
 extern int of_count_phandle_with_args(const struct device_node *np,
 	const char *list_name, const char *cells_name);
 
+extern const __be32 *of_phandle_iter_init(const struct device_node *np,
+					  const char *list_name,
+					  const __be32 **end);
+extern const __be32 *of_phandle_iter_next(const char *cells_name,
+					  int cell_count,
+					  const __be32 *cur, const __be32 *end,
+					  struct of_phandle_args *out_args);
+
 extern void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align));
 extern int of_alias_get_id(struct device_node *np, const char *stem);
 
@@ -527,6 +535,22 @@ static inline int of_count_phandle_with_args(struct device_node *np,
 	return -ENOSYS;
 }
 
+static inline const __be32 *of_phandle_iter_init(const struct device_node *np,
+						 const char *list_name,
+						 const __be32 **end)
+{
+	return NULL;
+}
+
+static inline const __be32 *of_phandle_iter_next(const char *cells_name,
+						 int cell_count,
+						 const __be32 *cur,
+						 const __be32 *end,
+						 struct of_phandle_args *out_args)
+{
+	return NULL;
+}
+
 static inline int of_alias_get_id(struct device_node *np, const char *stem)
 {
 	return -ENOSYS;
@@ -613,6 +637,14 @@ static inline int of_property_read_u32(const struct device_node *np,
 		s;						\
 		s = of_prop_next_string(prop, s))
 
+#define of_property_for_each_phandle_with_args(node, list_name, cells_name, \
+
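For readers who want to try the iteration pattern outside the kernel, here is a minimal user-space sketch of walking a flattened list of <phandle args...> entries. It is not the kernel implementation: `struct phandle_args`, `read_be32()`, `phandle_iter_next()` and `count_entries()` are hypothetical names, the argument count is fixed by the caller (there is no node lookup and no "#xxx-cells" property read), and a truncated entry is rejected, which the patch above does not check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ARGS 8

/* Hypothetical stand-in for struct of_phandle_args (no device_node here). */
struct phandle_args {
	uint32_t phandle;
	int args_count;
	uint32_t args[MAX_ARGS];
};

/* Decode one big-endian 32-bit cell, the job be32_to_cpup() does in the kernel. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Parse one entry starting at cur. Returns the start of the next entry,
 * or NULL at the end of the list or when the entry would run past end.
 */
static const uint8_t *phandle_iter_next(int cell_count, const uint8_t *cur,
					const uint8_t *end,
					struct phandle_args *out)
{
	int i;

	assert(out != NULL);
	if (cell_count < 0 || cell_count > MAX_ARGS)
		return NULL;
	if (!cur || cur >= end)
		return NULL;
	/* One cell for the phandle plus cell_count argument cells. */
	if ((size_t)(end - cur) < 4 * ((size_t)cell_count + 1))
		return NULL;

	out->phandle = read_be32(cur);
	cur += 4;
	out->args_count = cell_count;
	for (i = 0; i < cell_count; i++) {
		out->args[i] = read_be32(cur);
		cur += 4;
	}
	return cur;
}

/* Walk the whole property once (O(n)) and count the complete entries. */
static int count_entries(const uint8_t *prop, size_t len, int cell_count)
{
	struct phandle_args args;
	const uint8_t *end = prop + len;
	const uint8_t *cur;
	int n = 0;

	for (cur = phandle_iter_next(cell_count, prop, end, &args); cur;
	     cur = phandle_iter_next(cell_count, cur, end, &args))
		n++;
	return n;
}
```

The loop in count_entries() has the same shape as the for-loop a for_each-style macro would expand to: the cursor returned by one step is the input of the next, so no per-entry rescan from the start of the property is needed.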
[PATCHv7 00/12] Unifying SMMU driver among Tegra SoCs
Hi,

This series provides:

(0) IOMMU standard DT binding ("iommus")
(1) Unified IOMMU (SMMU) driver among Tegra SoCs
(2) Multiple Address Space support (MASID) in IOMMU (SMMU)
(3) Tegra IOMMU'able devices; most platform devices are IOMMU'able.

There's been some discussion[1] about device population order. Some
devices need to be populated earlier than other devices regardless of
their bus topology. As a solution, I implemented an IOMMU hook in the
driver core:

  [PATCHv7 04/13] driver/core: populate devices in order for IOMMUs

which is based on:

  http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006933.html

The main problem here is that IOMMU devices on the bus need to be
populated first, and IOMMU master devices later. With CONFIG_OF_IOMMU,
the "iommus=" DT binding is used to identify whether a device can be
an IOMMU master or not. If it can, we defer populating that device
until an IOMMU device is populated. Then the deferred IOMMU master
devices are populated and configured with the help of the already
populated IOMMU device via a new IOMMU API, iommu_ops->driver_bound().
This "iommus=" binding is expected to be used as the global/standard
binding.

Tested IOMMU functionality with T30 SD/MMC. Any further testing with
T114 and/or other devices would be really appreciated.

v6:
Minor fixes.
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/213082.html

v5:
Use "iommus=" DT bindings as a standard IOMMU binding.
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/212331.html

v4:
Add a hook in driver core to control device population order.
Introduced arm,smmu "mmu-master" binding instead of Tegra's own.
Removed DT patches from this series.
http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006931.html

v3:
Updated based on Stephen Warren's feedback
http://lists.linuxfoundation.org/pipermail/iommu/2013-October/006724.html

v2:
Updated based on Thierry Reding's and Stephen Warren's feedback
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-July/181888.html

v1:
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-June/180267.html

Available in the git repository at:

  git://g...@nv-tegra.nvidia.com/user/hdoyu/linux.git smmu-upstreaming@20131212

Hiroshi Doyu (12):
  of: introduce of_property_for_each_phandle_with_args()
  iommu/of: introduce a global iommu device list
  iommu/of: check if dependee iommu is ready or not
  driver/core: populate devices in order for IOMMUs
  iommu/core: add ops->{bound,unbind}_driver()
  ARM: tegra: create a DT header defining SWGROUP ID
  iommu/tegra: smmu: register device to iommu dynamically
  iommu/tegra: smmu: calculate ASID register offset by ID
  iommu/tegra: smmu: get swgroups from DT "iommus="
  iommu/tegra: smmu: allow duplicate ASID write
  iommu/tegra: smmu: Rename hwgrp -> swgroups
  iommu/tegra: smmu: add SMMU to an global iommu list

 .../bindings/iommu/nvidia,tegra30-smmu.txt |  30 +-
 drivers/base/dd.c                          |   5 +
 drivers/iommu/Kconfig                      |   1 +
 drivers/iommu/iommu.c                      |  13 +-
 drivers/iommu/of_iommu.c                   |  51 +++
 drivers/iommu/tegra-smmu.c                 | 383 +
 drivers/of/base.c                          |  46 +++
 include/dt-bindings/memory/tegra-swgroup.h |  50 +++
 include/linux/iommu.h                      |   4 +
 include/linux/of.h                         |  32 ++
 include/linux/of_iommu.h                   |  22 ++
 11 files changed, 487 insertions(+), 150 deletions(-)
 create mode 100644 include/dt-bindings/memory/tegra-swgroup.h

--
1.8.1.5

[1]
"[RFC] early init and DT platform devices allocation/registration"
https://lists.ozlabs.org/pipermail/devicetree-discuss/2013-June/thread.html#36542
"Report from 2013 ARM kernel summit"
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/210426.html
"[RFC PATCH] Documentation: devicetree: add description for generic bus properties"
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/215042.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH v11] PPC: POWERNV: move iommu_add_device earlier
The current implementation of IOMMU on sPAPR does not use iommu_ops
and therefore does not call the IOMMU API's bus_set_iommu(), which
1) sets iommu_ops for a bus
2) registers a bus notifier
Instead, PCI devices are added to IOMMU groups from
subsys_initcall_sync(tce_iommu_init), which does basically the same
thing without using iommu_ops callbacks.

However, the Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
implements iommu_ops, and by the time tce_iommu_init is called every PCI
device has already been added to some group, so there is a conflict.

This patch does 2 things:
1. removes the loop in which PCI devices were added to groups and
adds explicit iommu_add_device() calls to add devices as soon as they
get the iommu_table pointer assigned to them.
2. moves a bus notifier to powernv code in order to avoid a conflict
with the notifier from the Freescale driver.

iommu_add_device() and iommu_del_device() are public now.

Signed-off-by: Alexey Kardashevskiy
---
Changes:
v11:
* rebased on upstream

v10:
* fixed linker error when IOMMU_API is not enabled

v9:
* removed "KVM" from the subject as it is not really a KVM patch so
the PPC maintainer (hi Ben!) can review/include it into his tree

v8:
* added the check for iommu_group!=NULL before removing device from a group
as suggested by Wei Yang

v2:
* added a helper - set_iommu_table_base_and_group - which does
set_iommu_table_base() and iommu_add_device()
---
 arch/powerpc/include/asm/iommu.h            | 26
 arch/powerpc/kernel/iommu.c                 | 11 --
 arch/powerpc/platforms/powernv/pci-ioda.c   |  8
 arch/powerpc/platforms/powernv/pci-p5ioc2.c |  2 +-
 arch/powerpc/platforms/powernv/pci.c        | 31 -
 arch/powerpc/platforms/pseries/iommu.c      |  8 +---
 6 files changed, 70 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index c34656a..774fa27 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -101,8 +101,34 @@ extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
  */
 extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
 					    int nid);
+#ifdef CONFIG_IOMMU_API
 extern void iommu_register_group(struct iommu_table *tbl,
 				 int pci_domain_number, unsigned long pe_num);
+extern int iommu_add_device(struct device *dev);
+extern void iommu_del_device(struct device *dev);
+#else
+static inline void iommu_register_group(struct iommu_table *tbl,
+					int pci_domain_number,
+					unsigned long pe_num)
+{
+}
+
+static inline int iommu_add_device(struct device *dev)
+{
+	return 0;
+}
+
+static inline void iommu_del_device(struct device *dev)
+{
+}
+#endif /* !CONFIG_IOMMU_API */
+
+static inline void set_iommu_table_base_and_group(struct device *dev,
+						  void *base)
+{
+	set_iommu_table_base(dev, base);
+	iommu_add_device(dev);
+}
 
 extern int iommu_map_sg(struct device *dev, struct iommu_table *tbl,
 			struct scatterlist *sglist, int nelems,
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 572bb5b..818a092 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1105,7 +1105,7 @@ void iommu_release_ownership(struct iommu_table *tbl)
 }
 EXPORT_SYMBOL_GPL(iommu_release_ownership);
 
-static int iommu_add_device(struct device *dev)
+int iommu_add_device(struct device *dev)
 {
 	struct iommu_table *tbl;
 	int ret = 0;
@@ -1134,11 +1134,13 @@ static int iommu_add_device(struct device *dev)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(iommu_add_device);
 
-static void iommu_del_device(struct device *dev)
+void iommu_del_device(struct device *dev)
 {
 	iommu_group_remove_device(dev);
 }
+EXPORT_SYMBOL_GPL(iommu_del_device);
 
 static int iommu_bus_notifier(struct notifier_block *nb,
 			      unsigned long action, void *data)
@@ -1162,13 +1164,8 @@ static struct notifier_block tce_iommu_bus_nb = {
 
 static int __init tce_iommu_init(void)
 {
-	struct pci_dev *pdev = NULL;
-
 	BUILD_BUG_ON(PAGE_SIZE < IOMMU_PAGE_SIZE);
 
-	for_each_pci_dev(pdev)
-		iommu_add_device(&pdev->dev);
-
 	bus_register_notifier(&pci_bus_type, &tce_iommu_bus_nb);
 	return 0;
 }
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2c6d173..f0e6871 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -460,7 +460,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
 		return;
 
 	pe = &phb->ioda.pe_array[pdn->pe_number];
-	set_iommu_table_base(&pdev->dev, &pe->tce32_table);
+
[PATCH 3/6] perf tools: Fix perf list --raw-dump option bug.
Perf completion uses 'perf list --raw-dump' to get the events available
for '-e', but this currently does not work.

Example:
 # perf stat -e kvmm[TAB]
  Error: unknown option `raw-dump'

 usage: perf list [hw|sw|cache|tracepoint|pmu|event_glob]

perf-completion.sh uses 'perf list --raw-dump' to get all the events,
but since parse_options() was introduced for perf list, the --raw-dump
option is rejected with an error.

This patch adds a hidden option, raw_dump, for perf list. With it,
--raw-dump works again and returns all the event names, and it does not
add noise to the output of 'perf list -h'.

Verification:
 # ./perf stat -e kvmmmu:[TAB]
 fast_page_fault kvm_mmu_get_page kvm_mmu_paging_element
 kvm_mmu_set_accessed_bit kvm_mmu_sync_page kvm_mmu_walker_error
 handle_mmio_page_fault kvm_mmu_pagetable_walk kvm_mmu_prepare_zap_page
 kvm_mmu_set_dirty_bit kvm_mmu_unsync_page mark_mmio_spte

Signed-off-by: Dongsheng Yang
---
 tools/perf/builtin-list.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 011195e..82d54b6 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -19,7 +19,9 @@
 int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 {
 	int i;
+	bool raw_dump = false;
 	const struct option list_options[] = {
+		OPT_BOOLEAN_HIDEN(0, "raw-dump", &raw_dump, NULL),
 		OPT_END()
 	};
 	const char * const list_usage[] = {
@@ -30,6 +32,11 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 	argc = parse_options(argc, argv, list_options, list_usage,
 			     PARSE_OPT_STOP_AT_NON_OPTION);
 
+	if (raw_dump) {
+		print_events(NULL, true);
+		return 0;
+	}
+
 	setup_pager();
 
 	if (argc == 0) {
@@ -53,8 +60,6 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 			print_hwcache_events(NULL, false);
 		else if (strcmp(argv[i], "pmu") == 0)
 			print_pmu_events(NULL, false);
-		else if (strcmp(argv[i], "--raw-dump") == 0)
-			print_events(NULL, true);
 		else {
 			char *sep = strchr(argv[i], ':'), *s;
 			int sep_idx;
--
1.8.2.1
[PATCH 6/6] perf tools: Enhancement for perf list for unexpected input.
Example:
 # perf list test

 List of pre-defined events (to be used in -e):
 # echo $?
 0

Verification:
 # perf list test

 Error: No event for test.
 Usage:
	perf list [hw|sw|cache|tracepoint|pmu|event_glob]
 # echo $?
 255

Signed-off-by: Dongsheng Yang
---
 tools/perf/builtin-list.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 82d54b6..ba23f65 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -63,9 +63,11 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 		else {
 			char *sep = strchr(argv[i], ':'), *s;
 			int sep_idx;
+			unsigned int count;
 
 			if (sep == NULL) {
-				print_events(argv[i], false);
+				if(!(count = print_events(argv[i], false)))
+					goto err_out;
 				continue;
 			}
 			sep_idx = sep - argv[i];
@@ -74,9 +76,16 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 				return -1;
 
 			s[sep_idx] = '\0';
-			print_tracepoint_events(s, s + sep_idx + 1, false);
+			if (!(count = print_tracepoint_events(s, s + sep_idx + 1, false)))
+				goto err_out;
 			free(s);
 		}
 	}
+	return 0;
+
+err_out:
+	printf("\nError: No event for %s.\n", argv[i]);
+	printf("Usage:\n\t%s\n", list_usage[0]);
+	return -1;
 }
--
1.8.2.1
[PATCH 2/6] perf tools: Introduce an OPT_BOOLEAN_HIDEN in to parse-options.h.
Signed-off-by: Dongsheng Yang
---
 tools/perf/util/parse-options.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-options.h b/tools/perf/util/parse-options.h
index cbf0149..f395a21a 100644
--- a/tools/perf/util/parse-options.h
+++ b/tools/perf/util/parse-options.h
@@ -84,7 +84,7 @@ typedef int parse_opt_cb(const struct option *, const char *arg, int unset);
  *   CALLBACKS can use it like they want.
  *
  * `set`::
- *   whether an option was set by the user
+ *   whether an option was set by the user.
  */
 struct option {
 	enum parse_opt_type type;
@@ -111,6 +111,8 @@ struct option {
 	{ .type = OPTION_BOOLEAN, .short_name = (s), .long_name = (l), \
 	.value = check_vtype(v, bool *), .help = (h), \
 	.set = check_vtype(os, bool *)}
+#define OPT_BOOLEAN_HIDEN(s, l, v, h) \
+	{ .type = OPTION_BOOLEAN, .short_name = (s), .long_name = (l), .value = check_vtype(v, bool *), .flags = PARSE_OPT_HIDDEN, .help = (h)}
 #define OPT_INCR(s, l, v, h)	{ .type = OPTION_INCR, .short_name = (s), .long_name = (l), .value = check_vtype(v, int *), .help = (h) }
 #define OPT_SET_UINT(s, l, v, h, i)	{ .type = OPTION_SET_UINT, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned int *), .help = (h), .defval = (i) }
 #define OPT_SET_PTR(s, l, v, h, p)	{ .type = OPTION_SET_PTR, .short_name = (s), .long_name = (l), .value = (v), .help = (h), .defval = (p) }
--
1.8.2.1
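To illustrate what a hidden option buys, here is a small stand-alone sketch of the idea: a hidden option is parsed like any other, but the usage listing skips it. This is not perf's parse-options code; `struct demo_option`, `OPT_FLAG_HIDDEN`, `demo_parse_one()` and `demo_usage()` are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag, playing the role PARSE_OPT_HIDDEN plays in the patch. */
#define OPT_FLAG_HIDDEN 0x1u

struct demo_option {
	const char *long_name;
	bool *value;
	const char *help;
	unsigned int flags;
};

/* Set *value for "--<long_name>". Returns 0 on a match, -1 if unknown. */
static int demo_parse_one(const struct demo_option *opts, size_t n,
			  const char *arg)
{
	size_t i;

	assert(arg != NULL);
	if (strncmp(arg, "--", 2) != 0)
		return -1;
	for (i = 0; i < n; i++) {
		if (strcmp(arg + 2, opts[i].long_name) == 0) {
			*opts[i].value = true;
			return 0;
		}
	}
	return -1;
}

/*
 * Write one usage line per visible option into buf and return how many
 * options were listed. Hidden options are skipped here and only here.
 */
static int demo_usage(const struct demo_option *opts, size_t n,
		      char *buf, size_t buflen)
{
	size_t i, off = 0;
	int shown = 0;

	if (buflen)
		buf[0] = '\0';
	for (i = 0; i < n; i++) {
		int w;

		if (opts[i].flags & OPT_FLAG_HIDDEN)
			continue;
		w = snprintf(buf + off, buflen - off, "    --%s  %s\n",
			     opts[i].long_name,
			     opts[i].help ? opts[i].help : "");
		if (w < 0 || (size_t)w >= buflen - off)
			break;	/* output would be truncated, stop listing */
		off += (size_t)w;
		shown++;
	}
	return shown;
}
```

With a table holding a hidden "raw-dump" entry and a visible "verbose" entry, "--raw-dump" is accepted by demo_parse_one() while demo_usage() lists only "verbose", which is the behaviour shell completion needs.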
[PATCH 1/6] perf tools: Add long_name for call-graph option.
I am not sure why the -g option in record and top has no long_name.

Example:
 # perf record --[TAB]
 --all-cpus       --count  --freq        --no-delay    --period       --realtime   --uid
 --branch-any     --cpu    --group       --no-inherit  --per-thread   --stat       --verbose
 --branch-filter  --data   --mmap-pages  --no-samples  --pid          --tid        --weight
 --call-graph     --event  --no-buildid  --(null)      --quiet        --timestamp
 --cgroup         --filter --no-buildid-cache --output --raw-samples  --transaction

There is a --(null) entry here, which is not clear to the user.

This patch adds a "call-graph" long_name for it.

Verification:
 # perf record --[TAB]
 --all-cpus       --count  --freq        --no-delay    --per-thread   --stat       --verbose
 --branch-any     --cpu    --group       --no-inherit  --pid          --tid        --weight
 --branch-filter  --data   --mmap-pages  --no-samples  --quiet        --timestamp
 --call-graph     --event  --no-buildid  --output      --raw-samples  --transaction
 --cgroup         --filter --no-buildid-cache --period --realtime     --uid

Signed-off-by: Dongsheng Yang
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/builtin-top.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index c1c1200..7460c8f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -844,7 +844,7 @@ const struct option record_options[] = {
 		     perf_evlist__parse_mmap_pages),
 	OPT_BOOLEAN(0, "group", &record.opts.group,
 		    "put the counters into a counter group"),
-	OPT_CALLBACK_NOOPT('g', NULL, &record.opts,
+	OPT_CALLBACK_NOOPT('g', "call-graph", &record.opts,
 			   NULL, "enables call-graph recording" ,
 			   &record_callchain_opt),
 	OPT_CALLBACK(0, "call-graph", &record.opts,
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 03d37a7..a734a1b 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1084,7 +1084,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 		    " abort, in_tx, transaction"),
 	OPT_BOOLEAN('n', "show-nr-samples", &symbol_conf.show_nr_samples,
 		    "Show a column with the number of samples"),
-	OPT_CALLBACK_NOOPT('g', NULL, &top.record_opts,
+	OPT_CALLBACK_NOOPT('g', "call-graph", &top.record_opts,
 			   NULL, "enables call-graph recording",
 			   &callchain_opt),
 	OPT_CALLBACK(0, "call-graph", &top.record_opts,
--
1.8.2.1
[PATCH 4/6] perf tools: Fix bug in 'perf list event_glob'.
Example:
 # perf list kvmmmu

 List of pre-defined events (to be used in -e):

Verification:
 # perf list kvmmmu

 List of pre-defined events (to be used in -e):

  kvmmmu:kvm_mmu_pagetable_walk      [Tracepoint event]
  kvmmmu:kvm_mmu_paging_element      [Tracepoint event]
  kvmmmu:kvm_mmu_set_accessed_bit    [Tracepoint event]
  kvmmmu:kvm_mmu_set_dirty_bit       [Tracepoint event]
  kvmmmu:kvm_mmu_walker_error        [Tracepoint event]
  kvmmmu:kvm_mmu_get_page            [Tracepoint event]
  kvmmmu:kvm_mmu_sync_page           [Tracepoint event]
  kvmmmu:kvm_mmu_unsync_page         [Tracepoint event]
  kvmmmu:kvm_mmu_prepare_zap_page    [Tracepoint event]
  kvmmmu:mark_mmio_spte              [Tracepoint event]
  kvmmmu:handle_mmio_page_fault      [Tracepoint event]
  kvmmmu:fast_page_fault             [Tracepoint event]

Signed-off-by: Dongsheng Yang
---
 tools/perf/util/parse-events.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 969cb8f..8acfa71 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1236,6 +1236,8 @@ void print_events(const char *event_glob, bool name_only)
 
 	print_pmu_events(event_glob, name_only);
 
+	print_tracepoint_events(event_glob, NULL, name_only);
+
 	if (event_glob != NULL)
 		return;
 
@@ -1254,8 +1256,6 @@ void print_events(const char *event_glob, bool name_only)
 		       event_type_descriptors[PERF_TYPE_BREAKPOINT]);
 		printf("\n");
 	}
-
-	print_tracepoint_events(NULL, NULL, name_only);
 }
 
 int parse_events__is_hardcoded_term(struct parse_events_term *term)
--
1.8.2.1
Re: [PATCH] MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers
Hi Bjorn,

On 11/12/2013 19:32, Bjorn Helgaas wrote:
> If this looks reasonable, I'll merge it via the PCI tree for v3.13.
>
> Bjorn
>
>
> MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers
>
> Add entries for PCI host controller drivers in drivers/pci/host/.
>
> Signed-off-by: Bjorn Helgaas
> ---
>  MAINTAINERS | 31 +++
>  1 file changed, 31 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8285ed4676b6..826c722d92ba 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6449,16 +6449,47 @@ F:	drivers/pci/
>  F:	include/linux/pci*
>  F:	arch/x86/pci/
>
> +PCI DRIVER FOR DESIGNWARE
> +M:	Jingoo Han
> +L:	linux-...@vger.kernel.org
> +S:	Maintained
> +F:	drivers/pci/host/*designware*
> +
> +PCI DRIVER FOR IMX6
> +M:	Shawn Guo
> +L:	linux-...@vger.kernel.org
> +L:	linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
> +S:	Maintained
> +F:	drivers/pci/host/*imx6*
> +
> +PCI DRIVER FOR MVEBU (Marvell Armada 370 and Armada XP SOC support)
> +M:	Jason Cooper
> +L:	linux-...@vger.kernel.org
> +L:	linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
> +S:	Maintained
> +F:	drivers/pci/host/*mvebu*

I think that Thomas Petazzoni would be more appropriate: he has worked
on the mvebu PCIe driver for 6 months and now knows the subject very
well. Until now, all the mvebu PCIe related questions were handled by
Thomas.

Regards,

Gregory

--
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
[PATCHSET 00/14] tools lib traceevent: Get rid of *die() calls from parse-filter.c (v2)
Hello,

This patchset tries to remove all die() calls in the event filter
parsing code. I changed the two main functions,
pevent_filter_add_filter_str() and pevent_filter_match(), to return a
proper error code (pevent_errno). The actual error message might be
saved in a static buffer in pevent_filter, and it can be accessed by
the new pevent_filter_strerror() function. The old pevent_strerror()
still works for them too.

The only remaining bits are in trace-seq.c, which implements the print
functions, and I want to hear what's the best way to handle the error
case during printing.

I also put these patches on the libtraceevent/die-removal-v2 branch in
my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks
Namhyung


Namhyung Kim (14):
  tools lib traceevent: Get rid of malloc_or_die() in show_error()
  tools lib traceevent: Get rid of die in add_filter_type()
  tools lib traceevent: Get rid of malloc_or_die() in allocate_arg()
  tools lib traceevent: Get rid of malloc_or_die() in read_token()
  tools lib traceevent: Get rid of malloc_or_die() in find_event()
  tools lib traceevent: Get rid of die() in add_right()
  tools lib traceevent: Make add_left() return pevent_errno
  tools lib traceevent: Get rid of die() in reparent_op_arg()
  tools lib traceevent: Refactor create_arg_item()
  tools lib traceevent: Refactor process_filter()
  tools lib traceevent: Make pevent_filter_add_filter_str() return pevent_errno
  tools lib traceevent: Refactor pevent_filter_match() to get rid of die()
  tools lib traceevent: Get rid of die() in some string conversion functions
  tools lib traceevent: Introduce pevent_filter_strerror()

 tools/lib/traceevent/event-parse.c  |  17 +-
 tools/lib/traceevent/event-parse.h  |  48 ++-
 tools/lib/traceevent/parse-filter.c | 615 ++--
 3 files changed, 417 insertions(+), 263 deletions(-)

--
1.7.11.7
[PATCH 03/14] tools lib traceevent: Get rid of malloc_or_die() in allocate_arg()
Also check return value and handle it.

Signed-off-by: Namhyung Kim
---
 tools/lib/traceevent/parse-filter.c | 48 ++---
 1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 767de4f1e8ee..ab9cefe320b4 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -211,12 +211,7 @@ struct event_filter *pevent_filter_alloc(struct pevent *pevent)
 
 static struct filter_arg *allocate_arg(void)
 {
-	struct filter_arg *arg;
-
-	arg = malloc_or_die(sizeof(*arg));
-	memset(arg, 0, sizeof(*arg));
-
-	return arg;
+	return calloc(1, sizeof(struct filter_arg));
 }
 
 static void free_arg(struct filter_arg *arg)
@@ -369,6 +364,10 @@ create_arg_item(struct event_format *event, const char *token,
 	struct filter_arg *arg;
 
 	arg = allocate_arg();
+	if (arg == NULL) {
+		show_error(error_str, "failed to allocate filter arg");
+		return NULL;
+	}
 
 	switch (type) {
 
@@ -422,6 +421,9 @@ create_arg_op(enum filter_op_type btype)
 	struct filter_arg *arg;
 
 	arg = allocate_arg();
+	if (!arg)
+		return NULL;
+
 	arg->type = FILTER_ARG_OP;
 	arg->op.type = btype;
 
@@ -434,6 +436,9 @@ create_arg_exp(enum filter_exp_type etype)
 	struct filter_arg *arg;
 
 	arg = allocate_arg();
+	if (!arg)
+		return NULL;
+
 	arg->type = FILTER_ARG_EXP;
 	arg->op.type = etype;
 
@@ -446,6 +451,9 @@ create_arg_cmp(enum filter_exp_type etype)
 	struct filter_arg *arg;
 
 	arg = allocate_arg();
+	if (!arg)
+		return NULL;
+
 	/* Use NUM and change if necessary */
 	arg->type = FILTER_ARG_NUM;
 	arg->op.type = etype;
@@ -909,8 +917,10 @@ static struct filter_arg *collapse_tree(struct filter_arg *arg)
 	case FILTER_VAL_FALSE:
 		free_arg(arg);
 		arg = allocate_arg();
-		arg->type = FILTER_ARG_BOOLEAN;
-		arg->boolean.value = ret == FILTER_VAL_TRUE;
+		if (arg) {
+			arg->type = FILTER_ARG_BOOLEAN;
+			arg->boolean.value = ret == FILTER_VAL_TRUE;
+		}
 	}
 
 	return arg;
@@ -1057,6 +1067,8 @@ process_filter(struct event_format *event, struct filter_arg **parg,
 			switch (op_type) {
 			case OP_BOOL:
 				arg = create_arg_op(btype);
+				if (arg == NULL)
+					goto fail_alloc;
 				if (current_op)
 					ret = add_left(arg, current_op);
 				else
@@ -1067,6 +1079,8 @@ process_filter(struct event_format *event, struct filter_arg **parg,
 
 			case OP_NOT:
 				arg = create_arg_op(btype);
+				if (arg == NULL)
+					goto fail_alloc;
 				if (current_op)
 					ret = add_right(current_op, arg, error_str);
 				if (ret < 0)
@@ -1086,6 +1100,8 @@ process_filter(struct event_format *event, struct filter_arg **parg,
 					arg = create_arg_exp(etype);
 				else
 					arg = create_arg_cmp(ctype);
+				if (arg == NULL)
+					goto fail_alloc;
 
 				if (current_op)
 					ret = add_right(current_op, arg, error_str);
@@ -1119,11 +1135,16 @@ process_filter(struct event_format *event, struct filter_arg **parg,
 		current_op = current_exp;
 
 	current_op = collapse_tree(current_op);
+	if (current_op == NULL)
+		goto fail_alloc;
 
 	*parg = current_op;
 
 	return 0;
 
+ fail_alloc:
+	show_error(error_str, "failed to allocate filter arg");
+	goto fail;
  fail_print:
 	show_error(error_str, "Syntax error");
  fail:
@@ -1154,6 +1175,10 @@ process_event(struct event_format *event, const char *filter_str,
 	/* If parg is NULL, then make it into FALSE */
 	if (!*parg) {
 		*parg = allocate_arg();
+		if (*parg == NULL) {
+			show_error(error_str, "failed to allocate filter arg");
+			return -1;
+		}
 		(*parg)->type = FILTER_ARG_BOOLEAN;
 		(*parg)->boolean.value = FILTER_FALSE;
 	}
@@ -1177,6 +1202,10 @@ static int filter_event(struct event_filter *filter,
 	} else {
 		/* just add a TRUE arg */
 		arg = allocate_arg();
+
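The pattern this patch applies, an allocator that may return NULL plus callers that propagate the failure instead of dying, can be sketched in isolation as follows. The names `struct demo_arg`, `demo_allocate_arg()`, `demo_make_false()` and `demo_failing_alloc()` are hypothetical and exist only for this example; the injectable allocator is an assumption added so the failure path can be exercised, it is not part of libtraceevent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum demo_arg_type { DEMO_ARG_NONE = 0, DEMO_ARG_BOOLEAN, DEMO_ARG_OP };

/* Hypothetical, much reduced stand-in for struct filter_arg. */
struct demo_arg {
	enum demo_arg_type type;
	int value;
};

/* calloc-compatible allocator type, so a failing one can be injected. */
typedef void *(*demo_alloc_fn)(size_t nmemb, size_t size);

/* Zeroed allocation that reports failure to the caller, like calloc(). */
static struct demo_arg *demo_allocate_arg(demo_alloc_fn alloc)
{
	return alloc(1, sizeof(struct demo_arg));
}

/*
 * Build a FALSE boolean argument. Returns 0 on success and -1 when the
 * allocation fails; in the failure case *parg is left as NULL.
 */
static int demo_make_false(struct demo_arg **parg, demo_alloc_fn alloc)
{
	assert(parg != NULL);

	*parg = demo_allocate_arg(alloc);
	if (*parg == NULL)
		return -1;

	(*parg)->type = DEMO_ARG_BOOLEAN;
	(*parg)->value = 0;
	return 0;
}

/* Allocator that always fails, used to exercise the error path. */
static void *demo_failing_alloc(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}
```

Passing `calloc` exercises the normal path and `demo_failing_alloc` the out-of-memory path; in both cases the caller decides what to do, which is the point of removing malloc_or_die().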
[PATCH 14/14] tools lib traceevent: Introduce pevent_filter_strerror()
From: Namhyung Kim

The pevent_filter_strerror() function returns the actual error message
for a pevent_errno value. To do that, add a static buffer to
event_filter for saving the internal error message. If a failed
function saved additional information in the static buffer,
pevent_filter_strerror() returns that information; otherwise it returns
a generic error message.

Signed-off-by: Namhyung Kim
---
 tools/lib/traceevent/event-parse.c  | 17 +--
 tools/lib/traceevent/event-parse.h  |  7 ++-
 tools/lib/traceevent/parse-filter.c | 98 -
 3 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 22566c271275..2ce565a73dd5 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -5230,22 +5230,7 @@ int pevent_strerror(struct pevent *pevent __maybe_unused,
 
 	idx = errnum - __PEVENT_ERRNO__START - 1;
 	msg = pevent_error_str[idx];
-
-	switch (errnum) {
-	case PEVENT_ERRNO__MEM_ALLOC_FAILED:
-	case PEVENT_ERRNO__PARSE_EVENT_FAILED:
-	case PEVENT_ERRNO__READ_ID_FAILED:
-	case PEVENT_ERRNO__READ_FORMAT_FAILED:
-	case PEVENT_ERRNO__READ_PRINT_FAILED:
-	case PEVENT_ERRNO__OLD_FTRACE_ARG_FAILED:
-	case PEVENT_ERRNO__INVALID_ARG_TYPE:
-		snprintf(buf, buflen, "%s", msg);
-		break;
-
-	default:
-		/* cannot reach here */
-		break;
-	}
+	snprintf(buf, buflen, "%s", msg);
 
 	return 0;
 }
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 3ad784f5f647..cf5db9013f2c 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -851,10 +851,13 @@ struct filter_type {
 	struct filter_arg	*filter;
 };
 
+#define PEVENT_FILTER_ERROR_BUFSZ  1024
+
 struct event_filter {
 	struct pevent		*pevent;
 	int			filters;
 	struct filter_type	*event_filters;
+	char			error_buffer[PEVENT_FILTER_ERROR_BUFSZ];
 };
 
 struct event_filter *pevent_filter_alloc(struct pevent *pevent);
@@ -874,10 +877,12 @@ enum filter_trivial_type {
 enum pevent_errno pevent_filter_add_filter_str(struct event_filter *filter,
 					       const char *filter_str);
 
-
 enum pevent_errno pevent_filter_match(struct event_filter *filter,
 				      struct pevent_record *record);
 
+int pevent_filter_strerror(struct event_filter *filter, enum pevent_errno err,
+			   char *buf, size_t buflen);
+
 int pevent_event_filtered(struct event_filter *filter,
 			  int event_id);
 
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 32ab4396653c..c28b1a912a0c 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -38,55 +38,31 @@ struct event_list {
 	struct event_format	*event;
 };
 
-#define MAX_ERR_STR_SIZE 256
-
-static void show_error(char **error_str, const char *fmt, ...)
+static void show_error(char *error_buf, const char *fmt, ...)
 {
 	unsigned long long index;
 	const char *input;
-	char *error;
 	va_list ap;
 	int len;
 	int i;
 
-	if (!error_str)
-		return;
-
 	input = pevent_get_input_buf();
 	index = pevent_get_input_buf_ptr();
 	len = input ? strlen(input) : 0;
 
-	error = malloc(MAX_ERR_STR_SIZE + (len*2) + 3);
-	if (error == NULL) {
-		/*
-		 * Maybe it's due to len is too long.
-		 * Retry without the input buffer part.
-		 */
-		len = 0;
-
-		error = malloc(MAX_ERR_STR_SIZE);
-		if (error == NULL) {
-			/* no memory */
-			*error_str = NULL;
-			return;
-		}
-	}
-
 	if (len) {
-		strcpy(error, input);
-		error[len] = '\n';
+		strcpy(error_buf, input);
+		error_buf[len] = '\n';
 		for (i = 1; i < len && i < index; i++)
-			error[len+i] = ' ';
-		error[len + i] = '^';
-		error[len + i + 1] = '\n';
+			error_buf[len+i] = ' ';
+		error_buf[len + i] = '^';
+		error_buf[len + i + 1] = '\n';
 		len += i+2;
 	}
 
 	va_start(ap, fmt);
-	vsnprintf(error + len, MAX_ERR_STR_SIZE, fmt, ap);
+	vsnprintf(error_buf + len, PEVENT_FILTER_ERROR_BUFSZ - len, fmt, ap);
 	va_end(ap);
-
-	*error_str = error;
 }
 
 static void free_token(char *token)
@@ -370,7 +346,7 @@ static void free_events(struct event_list *events)
 
 static enum pevent_errno
 create_arg_item(struct event_format *event, const char *token,
-
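As an illustration of formatting an error with a caret marker into a fixed-size buffer, here is a stand-alone sketch. It is not the libtraceevent function: `demo_show_error()` and `DEMO_ERROR_BUFSZ` are hypothetical names, the input string and position are passed in by the caller rather than fetched from the parser state, and, unlike the patch, the input line and caret are dropped when they would not fit, so the buffer is never overrun.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ERROR_BUFSZ 1024

/*
 * Write "<input>\n<spaces>^\n<message>" into buf. pos is the 0-based
 * column the caret should point at (clamped to the last character).
 * Nothing is ever written past buflen bytes.
 */
static void demo_show_error(char *buf, size_t buflen, const char *input,
			    size_t pos, const char *fmt, ...)
{
	va_list ap;
	size_t len = input ? strlen(input) : 0;
	size_t off = 0;

	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';

	if (len) {
		size_t col = pos < len ? pos : len - 1;
		/* input, '\n', col spaces, '^', '\n' */
		size_t need = len + col + 3;

		/* Keep the context only if at least the NUL still fits. */
		if (need < buflen) {
			memcpy(buf, input, len);
			buf[len] = '\n';
			memset(buf + len + 1, ' ', col);
			buf[len + 1 + col] = '^';
			buf[len + 2 + col] = '\n';
			off = need;
		}
	}

	va_start(ap, fmt);
	vsnprintf(buf + off, buflen - off, fmt, ap);
	va_end(ap);
}
```

A caller would typically pass a buffer embedded in its own context structure, in the way the patch embeds error_buffer in struct event_filter, so that no allocation is needed on the error path.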
[PATCH 13/14] tools lib traceevent: Get rid of die() in some string conversion functions
Those functions are for stringify filter arguments. As caller of those functions handles NULL string properly, it seems that it's enough to return NULL rather than calling die(). Signed-off-by: Namhyung Kim --- tools/lib/traceevent/parse-filter.c | 58 +++-- 1 file changed, 36 insertions(+), 22 deletions(-) diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c index 9303c55128db..32ab4396653c 100644 --- a/tools/lib/traceevent/parse-filter.c +++ b/tools/lib/traceevent/parse-filter.c @@ -1361,8 +1361,10 @@ enum pevent_errno pevent_filter_add_filter_str(struct event_filter *filter, if (ret >= 0 && pevent->test_filters) { char *test; test = pevent_filter_make_string(filter, event->event->id); - printf(" '%s: %s'\n", event->event->name, test); - free(test); + if (test) { + printf(" '%s: %s'\n", event->event->name, test); + free(test); + } } } @@ -2097,7 +2099,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg) default: break; } - str = malloc_or_die(6); + str = malloc(6); + if (str == NULL) + break; if (val) strcpy(str, "TRUE"); else @@ -2120,7 +2124,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg) } len = strlen(left) + strlen(right) + strlen(op) + 10; - str = malloc_or_die(len); + str = malloc(len); + if (str == NULL) + break; snprintf(str, len, "(%s) %s (%s)", left, op, right); break; @@ -2138,7 +2144,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg) right_val = 0; if (right_val >= 0) { /* just return the opposite */ - str = malloc_or_die(6); + str = malloc(6); + if (str == NULL) + break; if (right_val) strcpy(str, "FALSE"); else @@ -2146,8 +2154,9 @@ static char *op_to_str(struct event_filter *filter, struct filter_arg *arg) break; } len = strlen(right) + strlen(op) + 3; - str = malloc_or_die(len); - snprintf(str, len, "%s(%s)", op, right); + str = malloc(len); + if (str) + snprintf(str, len, "%s(%s)", op, right); break; default: @@ -2163,9 
+2172,9 @@ static char *val_to_str(struct event_filter *filter, struct filter_arg *arg) { char *str; - str = malloc_or_die(30); - - snprintf(str, 30, "%lld", arg->value.val); + str = malloc(30); + if (str) + snprintf(str, 30, "%lld", arg->value.val); return str; } @@ -2220,12 +2229,14 @@ static char *exp_to_str(struct event_filter *filter, struct filter_arg *arg) op = "^"; break; default: - die("oops in exp"); + op = "[ERROR IN EXPRESSION TYPE]"; + break; } len = strlen(op) + strlen(lstr) + strlen(rstr) + 4; - str = malloc_or_die(len); - snprintf(str, len, "%s %s %s", lstr, op, rstr); + str = malloc(len); + if (str) + snprintf(str, len, "%s %s %s", lstr, op, rstr); out: free(lstr); free(rstr); @@ -2271,9 +2282,9 @@ static char *num_to_str(struct event_filter *filter, struct filter_arg *arg) op = "<="; len = strlen(lstr) + strlen(op) + strlen(rstr) + 4; - str = malloc_or_die(len); - sprintf(str, "%s %s %s", lstr, op, rstr); - + str = malloc(len); + if (str) + sprintf(str, "%s %s %s", lstr, op, rstr); break; default: @@ -2311,10 +2322,11 @@ static char *str_to_str(struct event_filter *filter, struct filter_arg *arg) len = strlen(arg->str.field->name) + strlen(op) + strlen(arg->str.val) + 6; - str = malloc_or_die(len); - snprintf(str, len, "%s %s \"%s\"", -arg->str.field->name, -op, arg->str.val); + str =
[PATCH 02/14] tools lib traceevent: Get rid of die in add_filter_type()
The return value of realloc() should be checked, and the previous
pointer must not be overwritten when the call fails.

Reviewed-by: Steven Rostedt
Signed-off-by: Namhyung Kim
---
 tools/lib/traceevent/parse-filter.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index d4b0bac80dc8..767de4f1e8ee 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -161,11 +161,13 @@ add_filter_type(struct event_filter *filter, int id)
 	if (filter_type)
 		return filter_type;
 
-	filter->event_filters = realloc(filter->event_filters,
-					sizeof(*filter->event_filters) *
-					(filter->filters + 1));
-	if (!filter->event_filters)
-		die("Could not allocate filter");
+	filter_type = realloc(filter->event_filters,
+			      sizeof(*filter->event_filters) *
+			      (filter->filters + 1));
+	if (!filter_type)
+		return NULL;
+
+	filter->event_filters = filter_type;
 
 	for (i = 0; i < filter->filters; i++) {
 		if (filter->event_filters[i].event_id > id)
@@ -1180,6 +1182,12 @@ static int filter_event(struct event_filter *filter,
 	}
 
 	filter_type = add_filter_type(filter, event->id);
+	if (filter_type == NULL) {
+		show_error(error_str, "failed to add a new filter: %s",
+			   filter_str ? filter_str : "true");
+		return -1;
+	}
+
 	if (filter_type->filter)
 		free_arg(filter_type->filter);
 	filter_type->filter = arg;
@@ -1417,6 +1425,9 @@ static int copy_filter_type(struct event_filter *filter,
 		arg->boolean.value = 0;
 
 		filter_type = add_filter_type(filter, event->id);
+		if (filter_type == NULL)
+			return -1;
+
 		filter_type->filter = arg;
 
 		free(str);
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
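For readers who want the realloc() point in isolation, here is a small standalone sketch of the same pattern. It is not code from libtraceevent: struct int_vec and int_vec_push() are made-up names used only for illustration. The result of realloc() is stored in a temporary first, so the old block stays reachable (and freeable) if the call fails.

```c
#include <stdlib.h>

/* Hypothetical growable array, standing in for filter->event_filters. */
struct int_vec {
	int *items;
	size_t count;
};

/*
 * Append one element. On allocation failure the vector is left
 * untouched and -1 is returned, so the caller still owns vec->items.
 */
static int int_vec_push(struct int_vec *vec, int value)
{
	int *grown;

	grown = realloc(vec->items, sizeof(*vec->items) * (vec->count + 1));
	if (!grown)
		return -1;	/* vec->items is still valid here */

	vec->items = grown;
	vec->items[vec->count++] = value;
	return 0;
}
```

Assigning the realloc() result straight back to vec->items would leak the old array on failure, which is the problem the patch above removes from add_filter_type().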
[PATCH 10/14] tools lib traceevent: Refactor process_filter()
From: Namhyung Kim So that it can return a proper pevent_errno value. Signed-off-by: Namhyung Kim --- tools/lib/traceevent/event-parse.h | 6 +++- tools/lib/traceevent/parse-filter.c | 64 + 2 files changed, 42 insertions(+), 28 deletions(-) diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index 57b66aed8122..da942d59cc3a 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -368,7 +368,11 @@ enum pevent_flag { _PE(REPARENT_NOT_OP,"cannot reparent other than OP"), \ _PE(REPARENT_FAILED,"failed to reparent filter OP"), \ _PE(BAD_FILTER_ARG, "bad arg in filter tree"),\ - _PE(UNEXPECTED_TYPE,"unexpected type (not a value)") + _PE(UNEXPECTED_TYPE,"unexpected type (not a value)"), \ + _PE(ILLEGAL_TOKEN, "illegal token"), \ + _PE(INVALID_PAREN, "open parenthesis cannot come here"), \ + _PE(UNBALANCED_PAREN, "unbalanced number of parenthesis"), \ + _PE(UNKNOWN_TOKEN, "unknown token") #undef _PE #define _PE(__code, __str) PEVENT_ERRNO__ ## __code diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c index 8d71208f0131..5aa5012a17ee 100644 --- a/tools/lib/traceevent/parse-filter.c +++ b/tools/lib/traceevent/parse-filter.c @@ -937,9 +937,10 @@ static int test_arg(struct filter_arg *parent, struct filter_arg *arg, } /* Remove any unknown event fields */ -static struct filter_arg *collapse_tree(struct filter_arg *arg, char **error_str) +static int collapse_tree(struct filter_arg *arg, +struct filter_arg **arg_collapsed, char **error_str) { - enum filter_vals ret; + int ret; ret = test_arg(arg, arg, error_str); switch (ret) { @@ -955,6 +956,7 @@ static struct filter_arg *collapse_tree(struct filter_arg *arg, char **error_str arg->boolean.value = ret == FILTER_VAL_TRUE; } else { show_error(error_str, "Failed to allocate filter arg"); + ret = PEVENT_ERRNO__MEM_ALLOC_FAILED; } break; @@ -965,10 +967,11 @@ static struct filter_arg *collapse_tree(struct filter_arg *arg, char 
**error_str break; } - return arg; + *arg_collapsed = arg; + return ret; } -static int +static enum pevent_errno process_filter(struct event_format *event, struct filter_arg **parg, char **error_str, int not) { @@ -982,7 +985,7 @@ process_filter(struct event_format *event, struct filter_arg **parg, enum filter_op_type btype; enum filter_exp_type etype; enum filter_cmp_type ctype; - int ret; + enum pevent_errno ret; *parg = NULL; @@ -1007,20 +1010,20 @@ process_filter(struct event_format *event, struct filter_arg **parg, if (not) { arg = NULL; if (current_op) - goto fail_print; + goto fail_syntax; free(token); *parg = current_exp; return 0; } } else - goto fail_print; + goto fail_syntax; arg = NULL; break; case EVENT_DELIM: if (*token == ',') { - show_error(error_str, - "Illegal token ','"); + show_error(error_str, "Illegal token ','"); + ret = PEVENT_ERRNO__ILLEGAL_TOKEN; goto fail; } @@ -1028,19 +1031,23 @@ process_filter(struct event_format *event, struct filter_arg **parg, if (left_item) { show_error(error_str, "Open paren can not come after item"); + ret = PEVENT_ERRNO__INVALID_PAREN; goto fail; } if (current_exp) { show_error(error_str, "Open paren can not come after expression"); + ret = PEVENT_ERRNO__INVALID_PAREN; goto fail; }
[PATCH 04/14] tools lib traceevent: Get rid of malloc_or_die() in read_token()
Reviewed-by: Steven Rostedt
Signed-off-by: Namhyung Kim
---
 tools/lib/traceevent/parse-filter.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index ab9cefe320b4..246ee81e1f93 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -109,7 +109,11 @@ static enum event_type read_token(char **tok)
 	    (strcmp(token, "=") == 0 || strcmp(token, "!") == 0) &&
 	    pevent_peek_char() == '~') {
 		/* append it */
-		*tok = malloc_or_die(3);
+		*tok = malloc(3);
+		if (*tok == NULL) {
+			free_token(token);
+			return EVENT_ERROR;
+		}
 		sprintf(*tok, "%c%c", *token, '~');
 		free_token(token);
 		/* Now remove the '~' from the buffer */
@@ -1123,6 +1127,8 @@ process_filter(struct event_format *event, struct filter_arg **parg,
 			break;
 		case EVENT_NONE:
 			break;
+		case EVENT_ERROR:
+			goto fail_alloc;
 		default:
 			goto fail_print;
 		}
-- 
1.7.11.7
[PATCH 01/14] tools lib traceevent: Get rid of malloc_or_die() in show_error()
Signed-off-by: Namhyung Kim
---
 tools/lib/traceevent/parse-filter.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index ab402fb2dcf7..d4b0bac80dc8 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -56,7 +56,21 @@ static void show_error(char **error_str, const char *fmt, ...)
 	index = pevent_get_input_buf_ptr();
 	len = input ? strlen(input) : 0;
 
-	error = malloc_or_die(MAX_ERR_STR_SIZE + (len*2) + 3);
+	error = malloc(MAX_ERR_STR_SIZE + (len*2) + 3);
+	if (error == NULL) {
+		/*
+		 * Maybe it's due to len is too long.
+		 * Retry without the input buffer part.
+		 */
+		len = 0;
+
+		error = malloc(MAX_ERR_STR_SIZE);
+		if (error == NULL) {
+			/* no memory */
+			*error_str = NULL;
+			return;
+		}
+	}
 
 	if (len) {
 		strcpy(error, input);
-- 
1.7.11.7
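The retry logic in this patch can be tried out on its own with something like the sketch below. This is a simplified illustration, not the library code: format_error() and ERR_STR_SIZE are hypothetical names, and the caret line that show_error() draws under the input is left out. The idea is the same: ask for a buffer big enough for the input plus the message, and if that fails, fall back to a buffer for the message alone.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ERR_STR_SIZE 256	/* plays the role of MAX_ERR_STR_SIZE */

/*
 * Build "input\nmessage" in a freshly allocated buffer. If the large
 * allocation fails, retry with room for the message only. Returns NULL
 * only when even the small allocation fails.
 */
static char *format_error(const char *input, const char *msg)
{
	size_t len = input ? strlen(input) : 0;
	char *error;

	error = malloc(ERR_STR_SIZE + len + 2);
	if (error == NULL) {
		len = 0;	/* drop the input part and retry */
		error = malloc(ERR_STR_SIZE);
		if (error == NULL)
			return NULL;
	}

	if (len) {
		memcpy(error, input, len);
		error[len++] = '\n';
	}
	/* snprintf() always NUL-terminates within ERR_STR_SIZE bytes */
	snprintf(error + len, ERR_STR_SIZE, "%s", msg);
	return error;
}
```

The caller owns the returned string and has to free() it, and it must cope with a NULL result, just as callers of show_error() must now cope with *error_str being NULL.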
[PATCH 06/14] tools lib traceevent: Get rid of die() in add_right()
Refactor it to return appropriate pevent_errno value. Reviewed-by: Steven Rostedt Signed-off-by: Namhyung Kim --- tools/lib/traceevent/event-parse.h | 8 +++- tools/lib/traceevent/parse-filter.c | 34 +++--- 2 files changed, 26 insertions(+), 16 deletions(-) diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index abdfd3c606ed..89e4dfd40db6 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -358,7 +358,13 @@ enum pevent_flag { _PE(OLD_FTRACE_ARG_FAILED,"failed to allocate field name for ftrace"),\ _PE(INVALID_ARG_TYPE, "invalid argument type"), \ _PE(INVALID_EVENT_NAME, "invalid event name"),\ - _PE(EVENT_NOT_FOUND,"No event found") + _PE(EVENT_NOT_FOUND,"no event found"),\ + _PE(SYNTAX_ERROR, "syntax error"), \ + _PE(ILLEGAL_RVALUE, "illegal rvalue"),\ + _PE(ILLEGAL_LVALUE, "illegal lvalue for string comparison"), \ + _PE(INVALID_REGEX, "regex did not compute"), \ + _PE(ILLEGAL_STRING_CMP, "illegal comparison for string"), \ + _PE(ILLEGAL_INTEGER_CMP,"illegal comparison for integer") #undef _PE #define _PE(__code, __str) PEVENT_ERRNO__ ## __code diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c index a0ab040e8f71..c08ce594cabe 100644 --- a/tools/lib/traceevent/parse-filter.c +++ b/tools/lib/traceevent/parse-filter.c @@ -473,8 +473,8 @@ create_arg_cmp(enum filter_exp_type etype) return arg; } -static int add_right(struct filter_arg *op, struct filter_arg *arg, -char **error_str) +static enum pevent_errno +add_right(struct filter_arg *op, struct filter_arg *arg, char **error_str) { struct filter_arg *left; char *str; @@ -505,9 +505,8 @@ static int add_right(struct filter_arg *op, struct filter_arg *arg, case FILTER_ARG_FIELD: break; default: - show_error(error_str, - "Illegal rvalue"); - return -1; + show_error(error_str, "Illegal rvalue"); + return PEVENT_ERRNO__ILLEGAL_RVALUE; } /* @@ -554,7 +553,7 @@ static int add_right(struct filter_arg *op, struct 
filter_arg *arg, if (left->type != FILTER_ARG_FIELD) { show_error(error_str, "Illegal lvalue for string comparison"); - return -1; + return PEVENT_ERRNO__ILLEGAL_LVALUE; } /* Make sure this is a valid string compare */ @@ -573,25 +572,31 @@ static int add_right(struct filter_arg *op, struct filter_arg *arg, show_error(error_str, "RegEx '%s' did not compute", str); - return -1; + return PEVENT_ERRNO__INVALID_REGEX; } break; default: show_error(error_str, "Illegal comparison for string"); - return -1; + return PEVENT_ERRNO__ILLEGAL_STRING_CMP; } op->type = FILTER_ARG_STR; op->str.type = op_type; op->str.field = left->field.field; op->str.val = strdup(str); - if (!op->str.val) - die("malloc string"); + if (!op->str.val) { + show_error(error_str, "Failed to allocate string filter"); + return PEVENT_ERRNO__MEM_ALLOC_FAILED; + } /* * Need a buffer to copy data for tests */ - op->str.buffer = malloc_or_die(op->str.field->size + 1); + op->str.buffer = malloc(op->str.field->size + 1); + if (!op->str.buffer) { + show_error(error_str, "Failed to allocate string filter"); + return PEVENT_ERRNO__MEM_ALLOC_FAILED; + } /* Null terminate this buffer */ op->str.buffer[op->str.field->size] = 0; @@ -609,7 +614,7 @@ static
[PATCH 08/14] tools lib traceevent: Get rid of die() in reparent_op_arg()
To do that, make the function returns the error code. Also pass error_str so that it can set proper error message when error occurred. Reviewed-by: Steven Rostedt Signed-off-by: Namhyung Kim --- tools/lib/traceevent/event-parse.h | 5 +- tools/lib/traceevent/parse-filter.c | 94 +++-- 2 files changed, 64 insertions(+), 35 deletions(-) diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index 89e4dfd40db6..5e4392d8e2d4 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -364,7 +364,10 @@ enum pevent_flag { _PE(ILLEGAL_LVALUE, "illegal lvalue for string comparison"), \ _PE(INVALID_REGEX, "regex did not compute"), \ _PE(ILLEGAL_STRING_CMP, "illegal comparison for string"), \ - _PE(ILLEGAL_INTEGER_CMP,"illegal comparison for integer") + _PE(ILLEGAL_INTEGER_CMP,"illegal comparison for integer"),\ + _PE(REPARENT_NOT_OP,"cannot reparent other than OP"), \ + _PE(REPARENT_FAILED,"failed to reparent filter OP"), \ + _PE(BAD_FILTER_ARG, "bad arg in filter tree") #undef _PE #define _PE(__code, __str) PEVENT_ERRNO__ ## __code diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c index 774c3e4c1d9f..9b05892566e0 100644 --- a/tools/lib/traceevent/parse-filter.c +++ b/tools/lib/traceevent/parse-filter.c @@ -784,15 +784,18 @@ enum filter_vals { FILTER_VAL_TRUE, }; -void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child, - struct filter_arg *arg) +static enum pevent_errno +reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child, + struct filter_arg *arg, char **error_str) { struct filter_arg *other_child; struct filter_arg **ptr; if (parent->type != FILTER_ARG_OP && - arg->type != FILTER_ARG_OP) - die("can not reparent other than OP"); + arg->type != FILTER_ARG_OP) { + show_error(error_str, "can not reparent other than OP"); + return PEVENT_ERRNO__REPARENT_NOT_OP; + } /* Get the sibling */ if (old_child->op.right == arg) { @@ -801,8 +804,10 @@ 
void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child, } else if (old_child->op.left == arg) { ptr = _child->op.left; other_child = old_child->op.right; - } else - die("Error in reparent op, find other child"); + } else { + show_error(error_str, "Error in reparent op, find other child"); + return PEVENT_ERRNO__REPARENT_FAILED; + } /* Detach arg from old_child */ *ptr = NULL; @@ -813,23 +818,29 @@ void reparent_op_arg(struct filter_arg *parent, struct filter_arg *old_child, *parent = *arg; /* Free arg without recussion */ free(arg); - return; + return 0; } if (parent->op.right == old_child) ptr = >op.right; else if (parent->op.left == old_child) ptr = >op.left; - else - die("Error in reparent op"); + else { + show_error(error_str, "Error in reparent op"); + return PEVENT_ERRNO__REPARENT_FAILED; + } + *ptr = arg; free_arg(old_child); + return 0; } -enum filter_vals test_arg(struct filter_arg *parent, struct filter_arg *arg) +/* Returns either filter_vals (success) or pevent_errno (failfure) */ +static int test_arg(struct filter_arg *parent, struct filter_arg *arg, + char **error_str) { - enum filter_vals lval, rval; + int lval, rval; switch (arg->type) { @@ -844,63 +855,68 @@ enum filter_vals test_arg(struct filter_arg *parent, struct filter_arg *arg) return FILTER_VAL_NORM; case FILTER_ARG_EXP: - lval = test_arg(arg, arg->exp.left); + lval = test_arg(arg, arg->exp.left, error_str); if (lval != FILTER_VAL_NORM) return lval; - rval = test_arg(arg, arg->exp.right); + rval = test_arg(arg, arg->exp.right, error_str); if (rval != FILTER_VAL_NORM) return rval; return FILTER_VAL_NORM; case FILTER_ARG_NUM: - lval = test_arg(arg, arg->num.left); + lval = test_arg(arg, arg->num.left, error_str); if (lval != FILTER_VAL_NORM) return lval; - rval = test_arg(arg, arg->num.right); + rval = test_arg(arg, arg->num.right, error_str); if (rval != FILTER_VAL_NORM) return rval; return
[PATCH 12/14] tools lib traceevent: Refactor pevent_filter_match() to get rid of die()
The test_filter() function is for testing given filter is matched to a given record. However it doesn't handle error cases properly so add a new argument err to save error info during the test and also pass it to internal test functions. The return value of pevent_filter_match() also converted to pevent_errno to indicate an exact error case. Signed-off-by: Namhyung Kim --- tools/lib/traceevent/event-parse.h | 21 -- tools/lib/traceevent/parse-filter.c | 135 +++- 2 files changed, 99 insertions(+), 57 deletions(-) diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index 089964e56ed4..3ad784f5f647 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -357,6 +357,8 @@ enum pevent_flag { _PE(READ_PRINT_FAILED, "failed to read event print fmt"),\ _PE(OLD_FTRACE_ARG_FAILED,"failed to allocate field name for ftrace"),\ _PE(INVALID_ARG_TYPE, "invalid argument type"), \ + _PE(INVALID_EXP_TYPE, "invalid expression type"), \ + _PE(INVALID_OP_TYPE,"invalid operator type"), \ _PE(INVALID_EVENT_NAME, "invalid event name"),\ _PE(EVENT_NOT_FOUND,"no event found"),\ _PE(SYNTAX_ERROR, "syntax error"), \ @@ -373,12 +375,16 @@ enum pevent_flag { _PE(INVALID_PAREN, "open parenthesis cannot come here"), \ _PE(UNBALANCED_PAREN, "unbalanced number of parenthesis"), \ _PE(UNKNOWN_TOKEN, "unknown token"), \ - _PE(FILTER_NOT_FOUND, "no filter found") + _PE(FILTER_NOT_FOUND, "no filter found"), \ + _PE(NOT_A_NUMBER, "must have number field"),\ + _PE(NO_FILTER, "no filters exists"), \ + _PE(FILTER_MISS,"record does not match to filter") #undef _PE #define _PE(__code, __str) PEVENT_ERRNO__ ## __code enum pevent_errno { PEVENT_ERRNO__SUCCESS = 0, + PEVENT_ERRNO__FILTER_MATCH = PEVENT_ERRNO__SUCCESS, /* * Choose an arbitrary negative big number not to clash with standard @@ -853,10 +859,11 @@ struct event_filter { struct event_filter *pevent_filter_alloc(struct pevent *pevent); -#define FILTER_NONE-2 -#define FILTER_NOEXIST -1 
-#define FILTER_MISS0 -#define FILTER_MATCH 1 +/* for backward compatibility */ +#define FILTER_NONEPEVENT_ERRNO__FILTER_NOT_FOUND +#define FILTER_NOEXIST PEVENT_ERRNO__NO_FILTER +#define FILTER_MISSPEVENT_ERRNO__FILTER_MISS +#define FILTER_MATCH PEVENT_ERRNO__FILTER_MATCH enum filter_trivial_type { FILTER_TRIVIAL_FALSE, @@ -868,8 +875,8 @@ enum pevent_errno pevent_filter_add_filter_str(struct event_filter *filter, const char *filter_str); -int pevent_filter_match(struct event_filter *filter, - struct pevent_record *record); +enum pevent_errno pevent_filter_match(struct event_filter *filter, + struct pevent_record *record); int pevent_event_filtered(struct event_filter *filter, int event_id); diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c index 78440d73e0ad..9303c55128db 100644 --- a/tools/lib/traceevent/parse-filter.c +++ b/tools/lib/traceevent/parse-filter.c @@ -1678,8 +1678,8 @@ int pevent_filter_event_has_trivial(struct event_filter *filter, } } -static int test_filter(struct event_format *event, - struct filter_arg *arg, struct pevent_record *record); +static int test_filter(struct event_format *event, struct filter_arg *arg, + struct pevent_record *record, enum pevent_errno *err); static const char * get_comm(struct event_format *event, struct pevent_record *record) @@ -1725,15 +1725,24 @@ get_value(struct event_format *event, } static unsigned long long -get_arg_value(struct event_format *event, struct filter_arg *arg, struct pevent_record *record); +get_arg_value(struct event_format *event, struct filter_arg *arg, + struct pevent_record *record, enum pevent_errno *err); static unsigned long long -get_exp_value(struct event_format *event, struct filter_arg *arg, struct pevent_record *record) +get_exp_value(struct event_format *event, struct filter_arg *arg, + struct pevent_record *record, enum pevent_errno *err) { unsigned long long lval, rval; - lval = get_arg_value(event, arg->exp.left, record); - rval = 
get_arg_value(event, arg->exp.right, record); + lval = get_arg_value(event, arg->exp.left, record,
[PATCH 11/14] tools lib traceevent: Make pevent_filter_add_filter_str() return pevent_errno
From: Namhyung Kim Refactor the pevent_filter_add_filter_str() to return a proper error code and get rid of the third error_str argument. Signed-off-by: Namhyung Kim --- tools/lib/traceevent/event-parse.h | 8 ++-- tools/lib/traceevent/parse-filter.c | 78 +++-- 2 files changed, 27 insertions(+), 59 deletions(-) diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index da942d59cc3a..089964e56ed4 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -372,7 +372,8 @@ enum pevent_flag { _PE(ILLEGAL_TOKEN, "illegal token"), \ _PE(INVALID_PAREN, "open parenthesis cannot come here"), \ _PE(UNBALANCED_PAREN, "unbalanced number of parenthesis"), \ - _PE(UNKNOWN_TOKEN, "unknown token") + _PE(UNKNOWN_TOKEN, "unknown token"), \ + _PE(FILTER_NOT_FOUND, "no filter found") #undef _PE #define _PE(__code, __str) PEVENT_ERRNO__ ## __code @@ -863,9 +864,8 @@ enum filter_trivial_type { FILTER_TRIVIAL_BOTH, }; -int pevent_filter_add_filter_str(struct event_filter *filter, -const char *filter_str, -char **error_str); +enum pevent_errno pevent_filter_add_filter_str(struct event_filter *filter, + const char *filter_str); int pevent_filter_match(struct event_filter *filter, diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c index 5aa5012a17ee..78440d73e0ad 100644 --- a/tools/lib/traceevent/parse-filter.c +++ b/tools/lib/traceevent/parse-filter.c @@ -1209,7 +1209,7 @@ process_filter(struct event_format *event, struct filter_arg **parg, return ret; } -static int +static enum pevent_errno process_event(struct event_format *event, const char *filter_str, struct filter_arg **parg, char **error_str) { @@ -1218,21 +1218,15 @@ process_event(struct event_format *event, const char *filter_str, pevent_buffer_init(filter_str, strlen(filter_str)); ret = process_filter(event, parg, error_str, 0); - if (ret == 1) { - show_error(error_str, - "Unbalanced number of ')'"); - return -1; - } if (ret < 0) 
return ret; /* If parg is NULL, then make it into FALSE */ if (!*parg) { *parg = allocate_arg(); - if (*parg == NULL) { - show_error(error_str, "failed to allocate filter arg"); - return -1; - } + if (*parg == NULL) + return PEVENT_ERRNO__MEM_ALLOC_FAILED; + (*parg)->type = FILTER_ARG_BOOLEAN; (*parg)->boolean.value = FILTER_FALSE; } @@ -1240,13 +1234,13 @@ process_event(struct event_format *event, const char *filter_str, return 0; } -static int filter_event(struct event_filter *filter, - struct event_format *event, - const char *filter_str, char **error_str) +static enum pevent_errno +filter_event(struct event_filter *filter, struct event_format *event, +const char *filter_str, char **error_str) { struct filter_type *filter_type; struct filter_arg *arg; - int ret; + enum pevent_errno ret; if (filter_str) { ret = process_event(event, filter_str, , error_str); @@ -1256,20 +1250,16 @@ static int filter_event(struct event_filter *filter, } else { /* just add a TRUE arg */ arg = allocate_arg(); - if (arg == NULL) { - show_error(error_str, "failed to allocate filter arg"); - return -1; - } + if (arg == NULL) + return PEVENT_ERRNO__MEM_ALLOC_FAILED; + arg->type = FILTER_ARG_BOOLEAN; arg->boolean.value = FILTER_TRUE; } filter_type = add_filter_type(filter, event->id); - if (filter_type == NULL) { - show_error(error_str, "failed to add a new filter: %s", - filter_str ? filter_str : "true"); - return -1; - } + if (filter_type == NULL) + return PEVENT_ERRNO__MEM_ALLOC_FAILED; if (filter_type->filter) free_arg(filter_type->filter); @@ -1282,18 +1272,12 @@ static int filter_event(struct event_filter *filter, * pevent_filter_add_filter_str - add a new filter * @filter: the event filter to add to * @filter_str: the filter string that contains the filter - * @error_str: string containing reason for failed filter - * - * Returns 0 if the filter was successfully added - * -1 if there was an error. * - * On
[PATCH 09/14] tools lib traceevent: Refactor create_arg_item()
From: Namhyung Kim

So that it can return a proper pevent_errno value.

Signed-off-by: Namhyung Kim
---
 tools/lib/traceevent/event-parse.h  |  3 ++-
 tools/lib/traceevent/parse-filter.c | 20 ++--
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 5e4392d8e2d4..57b66aed8122 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -367,7 +367,8 @@ enum pevent_flag {
 	_PE(ILLEGAL_INTEGER_CMP,"illegal comparison for integer"),	\
 	_PE(REPARENT_NOT_OP,	"cannot reparent other than OP"),	\
 	_PE(REPARENT_FAILED,	"failed to reparent filter OP"),	\
-	_PE(BAD_FILTER_ARG,	"bad arg in filter tree")
+	_PE(BAD_FILTER_ARG,	"bad arg in filter tree"),		\
+	_PE(UNEXPECTED_TYPE,	"unexpected type (not a value)")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 9b05892566e0..8d71208f0131 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -368,9 +368,9 @@ static void free_events(struct event_list *events)
 	}
 }
 
-static struct filter_arg *
+static enum pevent_errno
 create_arg_item(struct event_format *event, const char *token,
-		enum event_type type, char **error_str)
+		enum event_type type, struct filter_arg **parg, char **error_str)
 {
 	struct format_field *field;
 	struct filter_arg *arg;
@@ -378,7 +378,7 @@ create_arg_item(struct event_format *event, const char *token,
 	arg = allocate_arg();
 	if (arg == NULL) {
 		show_error(error_str, "failed to allocate filter arg");
-		return NULL;
+		return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 	}
 
 	switch (type) {
@@ -392,7 +392,7 @@ create_arg_item(struct event_format *event, const char *token,
 		if (!arg->value.str) {
 			free_arg(arg);
 			show_error(error_str, "failed to allocate string filter arg");
-			return NULL;
+			return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 		}
 		break;
 	case EVENT_ITEM:
@@ -420,11 +420,11 @@ create_arg_item(struct event_format *event, const char *token,
 		break;
 	default:
 		free_arg(arg);
-		show_error(error_str, "expected a value but found %s",
-			   token);
-		return NULL;
+		show_error(error_str, "expected a value but found %s", token);
+		return PEVENT_ERRNO__UNEXPECTED_TYPE;
 	}
-	return arg;
+	*parg = arg;
+	return 0;
 }
 
 static struct filter_arg *
@@ -993,8 +993,8 @@ process_filter(struct event_format *event, struct filter_arg **parg,
 		case EVENT_SQUOTE:
 		case EVENT_DQUOTE:
 		case EVENT_ITEM:
-			arg = create_arg_item(event, token, type, error_str);
-			if (!arg)
+			ret = create_arg_item(event, token, type, &arg, error_str);
+			if (ret < 0)
 				goto fail;
 			if (!left_item)
 				left_item = arg;
-- 
1.7.11.7
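The shape of this refactoring (a function that used to return a pointer now returns an error code and hands the object back through an out-parameter) might look like the following in a self-contained form. All names here (item_errno, struct item, create_item()) are invented for the example and do not exist in libtraceevent. The sketch only shows why the change lets the caller tell an allocation failure apart from an unexpected token type.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes; the library uses PEVENT_ERRNO__* instead. */
enum item_errno {
	ITEM_OK = 0,
	ITEM_ALLOC_FAILED = -1,
	ITEM_UNEXPECTED_TYPE = -2,
};

enum item_type { ITEM_STRING, ITEM_OTHER };

struct item {
	char *str;
};

/*
 * Create an item for @token. On success *pitem is set and ITEM_OK is
 * returned. On failure *pitem is left untouched and the return value
 * says what went wrong.
 */
static enum item_errno
create_item(enum item_type type, const char *token, struct item **pitem)
{
	struct item *it;
	size_t size;

	it = calloc(1, sizeof(*it));
	if (it == NULL)
		return ITEM_ALLOC_FAILED;

	switch (type) {
	case ITEM_STRING:
		size = strlen(token) + 1;
		it->str = malloc(size);
		if (it->str == NULL) {
			free(it);
			return ITEM_ALLOC_FAILED;
		}
		memcpy(it->str, token, size);
		break;
	default:
		free(it);
		return ITEM_UNEXPECTED_TYPE;
	}

	*pitem = it;
	return ITEM_OK;
}
```

With the old pointer-returning form, both failure paths collapsed into a single NULL, so the caller could not pick a matching error code to pass further up.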
[PATCH 05/14] tools lib traceevent: Get rid of malloc_or_die() in find_event()
Make it return pevent_errno so that a memory allocation failure can be
distinguished from other errors. Since the value will be returned to the
user later, add more error codes.

Reviewed-by: Steven Rostedt
Signed-off-by: Namhyung Kim
---
 tools/lib/traceevent/event-parse.h  |  4 +++-
 tools/lib/traceevent/parse-filter.c | 27 +++
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 6e23f197175f..abdfd3c606ed 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -356,7 +356,9 @@ enum pevent_flag {
 	_PE(READ_FORMAT_FAILED,	"failed to read event format"),		\
 	_PE(READ_PRINT_FAILED,	"failed to read event print fmt"),	\
 	_PE(OLD_FTRACE_ARG_FAILED,"failed to allocate field name for ftrace"),\
-	_PE(INVALID_ARG_TYPE,	"invalid argument type")
+	_PE(INVALID_ARG_TYPE,	"invalid argument type"),		\
+	_PE(INVALID_EVENT_NAME,	"invalid event name"),			\
+	_PE(EVENT_NOT_FOUND,	"No event found")
 
 #undef _PE
 #define _PE(__code, __str) PEVENT_ERRNO__ ## __code
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 246ee81e1f93..a0ab040e8f71 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -287,7 +287,7 @@ static int event_match(struct event_format *event,
 		!regexec(ereg, event->name, 0, NULL, 0);
 }
 
-static int
+static enum pevent_errno
 find_event(struct pevent *pevent, struct event_list **events,
 	   char *sys_name, char *event_name)
 {
@@ -306,23 +306,31 @@ find_event(struct pevent *pevent, struct event_list **events,
 		sys_name = NULL;
 	}
 
-	reg = malloc_or_die(strlen(event_name) + 3);
+	reg = malloc(strlen(event_name) + 3);
+	if (reg == NULL)
+		return PEVENT_ERRNO__MEM_ALLOC_FAILED;
+
 	sprintf(reg, "^%s$", event_name);
 
 	ret = regcomp(&ereg, reg, REG_ICASE|REG_NOSUB);
 	free(reg);
 
 	if (ret)
-		return -1;
+		return PEVENT_ERRNO__INVALID_EVENT_NAME;
 
 	if (sys_name) {
-		reg = malloc_or_die(strlen(sys_name) + 3);
+		reg = malloc(strlen(sys_name) + 3);
+		if (reg == NULL) {
+			regfree(&ereg);
+			return PEVENT_ERRNO__MEM_ALLOC_FAILED;
+		}
+
 		sprintf(reg, "^%s$", sys_name);
 		ret = regcomp(&sreg, reg, REG_ICASE|REG_NOSUB);
 		free(reg);
 		if (ret) {
 			regfree(&ereg);
-			return -1;
+			return PEVENT_ERRNO__INVALID_EVENT_NAME;
 		}
 	}
 
@@ -342,9 +350,9 @@ find_event(struct pevent *pevent, struct event_list **events,
 		regfree(&sreg);
 
 	if (!match)
-		return -1;
+		return PEVENT_ERRNO__EVENT_NOT_FOUND;
 	if (fail)
-		return -2;
+		return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 
 	return 0;
 }
@@ -1312,7 +1320,10 @@ int pevent_filter_add_filter_str(struct event_filter *filter,
 		/* Find this event */
 		ret = find_event(pevent, &events, strim(sys_name), strim(event_name));
 		if (ret < 0) {
-			if (event_name)
+			if (ret == PEVENT_ERRNO__MEM_ALLOC_FAILED)
+				show_error(error_str,
+					   "Memory allocation failure");
+			else if (event_name)
 				show_error(error_str,
 					   "No event found under '%s.%s'",
 					   sys_name, event_name);
-- 
1.7.11.7
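To see the three failure modes of find_event() side by side, here is a reduced, standalone sketch using the same POSIX regex calls. It matches a single name instead of walking the event list, and the names match_errno and match_name() are hypothetical, chosen for this example only. Each failure gets its own code: out of memory, a pattern that does not compile, and a pattern that compiles but matches nothing.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical codes standing in for the PEVENT_ERRNO__* values. */
enum match_errno {
	MATCH_OK = 0,
	MATCH_ALLOC_FAILED = -1,
	MATCH_INVALID_NAME = -2,
	MATCH_NOT_FOUND = -3,
};

/* Match @name against @pattern anchored as "^pattern$", ignoring case. */
static enum match_errno match_name(const char *pattern, const char *name)
{
	size_t size = strlen(pattern) + 3;	/* '^' + pattern + '$' + NUL */
	char *anchored;
	regex_t re;
	int ret;

	anchored = malloc(size);
	if (anchored == NULL)
		return MATCH_ALLOC_FAILED;
	snprintf(anchored, size, "^%s$", pattern);

	ret = regcomp(&re, anchored, REG_ICASE | REG_NOSUB);
	free(anchored);
	if (ret)
		return MATCH_INVALID_NAME;	/* nothing to regfree() here */

	ret = regexec(&re, name, 0, NULL, 0);
	regfree(&re);

	return ret == 0 ? MATCH_OK : MATCH_NOT_FOUND;
}
```

Note that regfree() is only called on a regex_t that regcomp() compiled successfully, which is why the patch frees the first regex before returning from the failure paths of the second allocation and compile.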
[PATCH 07/14] tools lib traceevent: Make add_left() return pevent_errno
From: Namhyung Kim

So that it can propagate errors properly.

Signed-off-by: Namhyung Kim
---
 tools/lib/traceevent/parse-filter.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index c08ce594cabe..774c3e4c1d9f 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -648,7 +648,7 @@ rotate_op_right(struct filter_arg *a, struct filter_arg *b)
 	return arg;
 }
 
-static int add_left(struct filter_arg *op, struct filter_arg *arg)
+static enum pevent_errno add_left(struct filter_arg *op, struct filter_arg *arg)
 {
 	switch (op->type) {
 	case FILTER_ARG_EXP:
@@ -667,11 +667,11 @@ static int add_left(struct filter_arg *op, struct filter_arg *arg)
 		/* left arg of compares must be a field */
 		if (arg->type != FILTER_ARG_FIELD &&
 		    arg->type != FILTER_ARG_BOOLEAN)
-			return -1;
+			return PEVENT_ERRNO__INVALID_ARG_TYPE;
 		op->num.left = arg;
 		break;
 	default:
-		return -1;
+		return PEVENT_ERRNO__INVALID_ARG_TYPE;
 	}
 	return 0;
 }
-- 
1.7.11.7
Re: [PATCH] perf list: fix --raw-dump
David Ahern wrote:
> Why not make raw_dump a proper argument?

Sure, that'd work too. I was thinking of a minimal way to fix the
problem myself.

> diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
> index 011195e38f21..b553d0c4ca82 100644
> --- a/tools/perf/builtin-list.c
> +++ b/tools/perf/builtin-list.c
> @@ -36,6 +38,10 @@ int cmd_list(int argc, const char
>  		print_events(NULL, false);
>  		return 0;
>  	}
> +	if (raw_dump) {
> +		print_events(NULL, true);
> +		return 0;
> +	}

This won't work because you've put it right below the `if (argc == 0)`,
which executes print_events(). You could move it up and get it to work.

From 7198a494cfef43395e8683ac3a0576277b8d1d80 Mon Sep 17 00:00:00 2001
From: David Ahern
Date: Wed, 11 Dec 2013 14:00:20 -0700
Subject: [PATCH] perf list: Fix raw-dump arg

Ramkumar reported that perf list --raw-dump was broken by 44d742e.
Fix by making raw-dump a proper argument.

Signed-off-by: David Ahern
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Ramkumar Ramachandra
Signed-off-by: Ramkumar Ramachandra
---
 tools/perf/builtin-list.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 011195e..2629c24 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -19,7 +19,9 @@
 int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 {
 	int i;
+	bool raw_dump = false;
 	const struct option list_options[] = {
+		OPT_BOOLEAN(0, "raw-dump", &raw_dump, "raw dump for completion"),
 		OPT_END()
 	};
 	const char * const list_usage[] = {
@@ -32,6 +34,10 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	setup_pager();
 
+	if (raw_dump) {
+		print_events(NULL, true);
+		return 0;
+	}
 	if (argc == 0) {
 		print_events(NULL, false);
 		return 0;
@@ -53,8 +59,6 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 			print_hwcache_events(NULL, false);
 		else if (strcmp(argv[i], "pmu") == 0)
 			print_pmu_events(NULL, false);
-		else if (strcmp(argv[i], "--raw-dump") == 0)
-			print_events(NULL, true);
 		else {
 			char *sep = strchr(argv[i], ':'), *s;
 			int sep_idx;
-- 
1.8.5.1.113.g8cb5bef.dirty
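The ordering point raised in this reply can be shown with a small standalone sketch. It assumes that option parsing has already consumed `--raw-dump`, so that `perf list --raw-dump` reaches the checks with argc == 0; the enum and both function names are hypothetical stand-ins, not perf code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the print_events() variants cmd_list() can pick. */
enum list_action { LIST_ALL, LIST_RAW, LIST_BY_ARG };

/*
 * Mirrors the control flow of the final patch: the raw_dump test comes
 * first, so it is reached even when no positional arguments remain.
 */
enum list_action choose_action(int argc, bool raw_dump)
{
	if (raw_dump)
		return LIST_RAW;
	if (argc == 0)
		return LIST_ALL;
	return LIST_BY_ARG;
}

/* The rejected ordering: the argc == 0 branch returns before raw_dump is seen. */
enum list_action choose_action_broken(int argc, bool raw_dump)
{
	if (argc == 0)
		return LIST_ALL;
	if (raw_dump)
		return LIST_RAW;
	return LIST_BY_ARG;
}
```

With argc == 0 and raw_dump set, the first function selects the raw listing while the second falls into the plain listing, which is the failure described above.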
Re: [PATCH 04/10] net: stmmac: sunxi platfrom extensions for GMAC in Allwinner A20 SoC's
Hi, On Wed, Dec 11, 2013 at 10:45 PM, srinivas kandagatla wrote: > Hi Chen, > > On 11/12/13 12:17, Chen-Yu Tsai wrote: > >>> >>> I would be good to get actual picture of this hw setup, On ST the >>> additional glue logic which sits on top of the GMAC is to resposible for >>> selecting the correct retime clock. >> >> I would have liked to look at the internal design, how the dwmac core >> is connected to the clock control, but that is out of the question. >> Still, based on the documents, I think our clock controller is partially >> intertwined with the GMAC. It takes GMAC's internally generated clock >> as one of several inputs, then sends it back to the GMAC to time tx data. >> > This is very much similar to ST glue, one of the selected clk is used > for retime the tx data lines. This selection is more of board dependent. > It totally depends on how the GMAC is wired up with PHY. > >> Judging by the register definitions listed in the A20 manual, >> the SoC glue layer clocks is something like this: >> >>_ >> MII TX clock from PHY >-|____|> to GMAC core >> GMAC Int. RGMII TX clk >|___\__/__gate---|> to PHY >> Ext. 125MHz RGMII TX clk >--|__divider__/| >> || >> >> >> For MII mode, the glue layer should select the TX clock from the PHY. >> The gate to the PHY should be disabled. >> >> For RGMII mode, either the internal clock generated by the GMAC core, >> or the external 125MHz reference generated by the PHY can be selected. >> And the clock gate to the PHY should be enabled. >> If the 125MHz reference is used, the glue layer should select the proper >> divider (/1, /5, /50) based on the link speed. >> >> For GMII mode, under 10/100 speeds, the operation matches MII mode. >> For gigabit speeds, should use a 125MHz clock (internal or external) >> and enable the output gate. >> >> The glue layer may indeed sit on top or around the GMAC core. >> Nevertheless, its operational state does depend on the GMAC. 
>> The current callbacks present in the stmmac driver are a good model
>> for this.
>
> Callbacks are OK with me, as they give a good level of abstraction as you
> said.
>
> But I don't like the idea of glue drivers passing the full platform data
> to stmmac or glue driver parsing the platform data, which is going to
> look like very ugly fixups.
>
> Also, currently callbacks just take pdev, which seems to be forcing glue
> drivers to use platform data as the only data structure to pass information.
>
> My recommendation would be to add a new parameter to these callbacks,
> which can be used to store the glue's private data structure; we could
> actually use the bsp_priv variable from platform data.

I agree. The original design provided .custom_data, .custom_cfg,
.bsp_priv fields in the platform data for the callbacks. I am not aware
of any users of these fields in the current kernel. Maybe the intended
users, ST platforms, have migrated to DT. Merging the three fields would
be nice, but may break some unsuspecting user.

> So the of_data structure would have something like:
>
> struct stmmac_of_data {
>	void * (*setup)(struct platform_device *pdev);
>	void (*bus_setup)(struct platform_device *pdev, void *priv, void
> __iomem *ioaddr);
>	int (*init)(struct platform_device *pdev, void *priv);
>	void (*exit)(struct platform_device *pdev, void *priv);
>	void (*fix_mac_speed)(struct platform_device *pdev, void *priv,
> unsigned int speed);
>
> };
>
> setup() would return a private data struct of glue driver which can be
> stored in plat->bsp_priv.

Should be done at DT parsing level. So this would be called at the end
of stmmac_probe_config_dt. And for non-DT platforms, they should provide
.bsp_priv themselves.

> Regarding the bindings, If Peppe is happy to allow optional SOC specific
> binding in it, it is Ok with me too.
>
> But all SOC specific resources names and properties have to be properly
> prefixed so that it's not confused with dwmac properties.

I agree. SOC specific bindings should have different prefixes and be
documented separately, along with the compatible strings.

> Regarding reset, I think we can add the support in stmmac driver itself.

Will do.

> Regarding clocks, on STi glue we can not represent the configuration in
> proper clock infrastructure.

I see. Could you give me a description of the 4 tx clock inputs?
I would like to learn a bit more about STi glue.

> Am happy to change sti glue driver to this interface style, if you are
> Ok with this approach Or if you have any other better ideas, lets discuss.
>
> Feel free to change the above proposed new APIs..

I think the original .fix_mac_speed (without *pdev) is ok. It likely
requires link speed, interface type, and any SoC data. Interface type
is buried in platform data, so .setup should take care to copy it into
.bsp_priv.

About
[PATCH v8 3/4] sched/numa: use wrapper function task_faults_idx to calculate index in group_faults
Use wrapper function task_faults_idx to calculate index in group_faults.

Reviewed-by: Naoya Horiguchi
Acked-by: Mel Gorman
Acked-by: David Rientjes
Signed-off-by: Wanpeng Li
---
 kernel/sched/fair.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c3f6ff9..8a00879 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -935,7 +935,8 @@ static inline unsigned long group_faults(struct task_struct *p, int nid)
 	if (!p->numa_group)
 		return 0;
 
-	return p->numa_group->faults[2*nid] + p->numa_group->faults[2*nid+1];
+	return p->numa_group->faults[task_faults_idx(nid, 0)] +
+		p->numa_group->faults[task_faults_idx(nid, 1)];
 }
 
 /*
-- 
1.7.7.6
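To make the indexing concrete, here is a minimal userspace sketch. It assumes the helper computes 2 * nid + priv, which is what the removed open-coded line implies; faults_idx() and node_faults() are hypothetical names for illustration, not the kernel functions themselves.

```c
#include <assert.h>

/*
 * Assumed layout: each node owns two adjacent slots in the faults
 * array, [2*nid] and [2*nid + 1], selected by priv (0 or 1).
 */
int faults_idx(int nid, int priv)
{
	assert(priv == 0 || priv == 1);
	return 2 * nid + priv;
}

/* Sum both slots of one node, as group_faults() does after the patch. */
unsigned long node_faults(const unsigned long *faults, int nid)
{
	return faults[faults_idx(nid, 0)] + faults[faults_idx(nid, 1)];
}
```

Keeping the arithmetic in one helper means a later change to the array layout would only touch that helper, which appears to be the motivation for the wrapper.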
[PATCH v8 1/4] sched/numa: drop sysctl_numa_balancing_settle_count sysctl
Changelog:
 v7 -> v8:
  * remove references to it in Documentation/sysctl/kernel.txt

Commit 887c290e (sched/numa: Decide whether to favour task or group weights
based on swap candidate relationships) dropped the check against
sysctl_numa_balancing_settle_count; this patch removes the sysctl.

Acked-by: Mel Gorman
Reviewed-by: Rik van Riel
Acked-by: David Rientjes
Signed-off-by: Wanpeng Li
---
 Documentation/sysctl/kernel.txt |    5 -
 include/linux/sched/sysctl.h    |    1 -
 kernel/sched/fair.c             |    9 -
 kernel/sysctl.c                 |    7 ---
 4 files changed, 0 insertions(+), 22 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 26b7ee4..6d48640 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -428,11 +428,6 @@ rate for each task.
 numa_balancing_scan_size_mb is how many megabytes worth of pages are
 scanned for a given scan.
 
-numa_balancing_settle_count is how many scan periods must complete before
-the schedule balancer stops pushing the task towards a preferred node. This
-gives the scheduler a chance to place the task on an alternative node if the
-preferred node is overloaded.
-
 numa_balancing_migrate_deferred is how many page migrations get skipped
 unconditionally, after a page migration is skipped because a page is shared
 with other tasks. This reduces page migration overhead, and determines
diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 41467f8..31e0193 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -48,7 +48,6 @@ extern unsigned int sysctl_numa_balancing_scan_delay;
 extern unsigned int sysctl_numa_balancing_scan_period_min;
 extern unsigned int sysctl_numa_balancing_scan_period_max;
 extern unsigned int sysctl_numa_balancing_scan_size;
-extern unsigned int sysctl_numa_balancing_settle_count;
 
 #ifdef CONFIG_SCHED_DEBUG
 extern unsigned int sysctl_sched_migration_cost;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fd773ad..acdef27 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -886,15 +886,6 @@ static unsigned int task_scan_max(struct task_struct *p)
 	return max(smin, smax);
 }
 
-/*
- * Once a preferred node is selected the scheduler balancer will prefer moving
- * a task to that node for sysctl_numa_balancing_settle_count number of PTE
- * scans. This will give the process the chance to accumulate more faults on
- * the preferred node but still allow the scheduler to move the task again if
- * the nodes CPUs are overloaded.
- */
-unsigned int sysctl_numa_balancing_settle_count __read_mostly = 4;
-
 static void account_numa_enqueue(struct rq *rq, struct task_struct *p)
 {
 	rq->nr_numa_running += (p->numa_preferred_nid != -1);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 34a6047..c8da99f 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -385,13 +385,6 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 	{
-		.procname	= "numa_balancing_settle_count",
-		.data		= &sysctl_numa_balancing_settle_count,
-		.maxlen		= sizeof(unsigned int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
-	},
-	{
 		.procname	= "numa_balancing_migrate_deferred",
 		.data		= &sysctl_numa_balancing_migrate_deferred,
 		.maxlen		= sizeof(unsigned int),
-- 
1.7.7.6
[PATCH v8 4/4] sched/numa: fix period_slot recalculation
Changelog:
 v3 -> v4:
  * remove period_slot recalculation

The original code is as intended and was meant to scale the difference
between the NUMA_PERIOD_THRESHOLD and local/remote ratio when adjusting
the scan period. The period_slot recalculation can be dropped.

Reviewed-by: Naoya Horiguchi
Acked-by: Mel Gorman
Acked-by: David Rientjes
Signed-off-by: Wanpeng Li
---
 kernel/sched/fair.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8a00879..e7ca79a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1356,7 +1356,6 @@ static void update_task_scan_period(struct task_struct *p,
 		 * scanning faster if shared accesses dominate as it may
 		 * simply bounce migrations uselessly
 		 */
-		period_slot = DIV_ROUND_UP(diff, NUMA_PERIOD_SLOTS);
 		ratio = DIV_ROUND_UP(private * NUMA_PERIOD_SLOTS, (private + shared));
 		diff = (diff * ratio) / NUMA_PERIOD_SLOTS;
 	}
-- 
1.7.7.6
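A small worked sketch of the two lines that remain after this patch may help show why the removed assignment was dead: the scaling only needs diff and the private/shared fault counts. The value 10 for the slot count and the names PERIOD_SLOTS and scale_diff() are assumptions made for illustration; only the two arithmetic expressions come from the hunk above.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define PERIOD_SLOTS 10	/* assumed value, for illustration only */

/*
 * Scale a scan-period adjustment by the share of private faults:
 * ratio is private/(private + shared) expressed in slots, rounded up,
 * and diff is then reduced proportionally.
 */
int scale_diff(int diff, unsigned long private_faults,
	       unsigned long shared_faults)
{
	int ratio;

	/* The sketch requires at least one recorded fault. */
	assert(private_faults + shared_faults > 0);

	ratio = (int)DIV_ROUND_UP(private_faults * PERIOD_SLOTS,
				  private_faults + shared_faults);
	return (diff * ratio) / PERIOD_SLOTS;
}
```

For example, with diff = -300, one private fault and three shared faults, the ratio is 3 slots out of 10 and the adjustment shrinks to -90; with only private faults the adjustment is applied in full.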
[PATCH v8 2/4] sched/numa: use wrapper function task_node to get node which task is on
Changelog:
 v2 -> v3:
  * translate cpu_to_node(task_cpu(p)) to task_node(p) in sched/debug.c

Use wrapper function task_node to get node which task is on.

Acked-by: Mel Gorman
Reviewed-by: Naoya Horiguchi
Reviewed-by: Rik van Riel
Acked-by: David Rientjes
Signed-off-by: Wanpeng Li
---
 kernel/sched/debug.c |    2 +-
 kernel/sched/fair.c  |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 5c34d18..374fe04 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -139,7 +139,7 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
 		0LL, 0LL, 0LL, 0L, 0LL, 0L, 0LL, 0L);
 #endif
 #ifdef CONFIG_NUMA_BALANCING
-	SEQ_printf(m, " %d", cpu_to_node(task_cpu(p)));
+	SEQ_printf(m, " %d", task_node(p));
 #endif
 #ifdef CONFIG_CGROUP_SCHED
 	SEQ_printf(m, " %s", task_group_path(task_group(p)));
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index acdef27..c3f6ff9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1216,7 +1216,7 @@ static int task_numa_migrate(struct task_struct *p)
 	 * elsewhere, so there is no point in (re)trying.
 	 */
 	if (unlikely(!sd)) {
-		p->numa_preferred_nid = cpu_to_node(task_cpu(p));
+		p->numa_preferred_nid = task_node(p);
 		return -EINVAL;
 	}
 
@@ -1283,7 +1283,7 @@ static void numa_migrate_preferred(struct task_struct *p)
 	p->numa_migrate_retry = jiffies + HZ;
 
 	/* Success if task is already running on preferred CPU */
-	if (cpu_to_node(task_cpu(p)) == p->numa_preferred_nid)
+	if (task_node(p) == p->numa_preferred_nid)
 		return;
 
 	/* Otherwise, try migrate to a CPU on the preferred node */
-- 
1.7.7.6
Re: [Patch] Read CONFIG_RD_ variables for initramfs compression
   Hello Simon, Andrew

+-- On Wed, 11 Dec 2013, Simon Guinot wrote --+
| IIUC this patch, the INITRAMFS_COMPRESSION_* options are now
| ignored/useless. Don't you think we should remove them from the
| usr/Kconfig file ?

  -> https://lkml.org/lkml/2013/11/25/21

I've pushed a patch from Mr Hristo to the same effect. I guess it's still in
the queue. I haven't received any review for it yet. (...Andrew?)

| Actually, I think this patch makes the initramfs compression
| configuration quite confusing. Consider the following configuration
| for a 3.13-rc3 kernel:
|
| CONFIG_RD_GZIP=y
| CONFIG_RD_LZMA=y
| CONFIG_INITRAMFS_COMPRESSION_LZMA=y
|
| This now produces a gzipped initramfs_data.cpio against a lzma one
| previously.

That is because, when multiple options are set, CONFIG_RD_GZIP is checked
last in the usr/Makefile.

   ...
   # Gzip
   suffix_$(CONFIG_RD_GZIP)   = .gz

Hope it helps.
--
Prasad J Pandit / Red Hat Security Response Team
Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data
> > >
> > > +void __init parse_efi_setup(u64 phys_addr)
> > > +{
> > > +	struct setup_data *sd;
> > > +
> > > +	if (!efi_enabled(EFI_64BIT)) {
> > > +		pr_warn("SETUP_EFI not supported on 32-bit\n");
> > > +		return;
> > > +	}
> >
> > Shouldn't this function be in two versions in efi_64.c and efi_32.c?
> > This way you don't need this check with cryptic printk message.
>
> Ok, will update.

Rethinking this issue: to move them to efi_$(BITS).c I would need to change
efi_setup from a static variable to an extern. That does not look worth it.

Thanks
Dave
Re: [PATCH v5 08/14] efi: export efi runtime memory mapping to sysfs
> > >
> > and the EFI_BOOT* tests can be done in save_runtime_map and also the
> > error handling can happen there. This way efi_map_regions() won't
> > need to know about anything. This way, you can later move the whole
> > save_runtime_map() function to efi-kexec.c just by taking it without any
> > need for untangling.
> >
> > > +out_save_runtime:
> > > +	kfree(efi_runtime_map);
> > > +	nr_efi_runtime_map = 0;
> > > +	efi_runtime_map = NULL;
> >
> > This can go there too.
>
> This section can go to save_runtime_map but it looks clearer to put them
> here.

BTW, I will restructure the whole code when I move them to efi_kexec.c, so
is it OK not to worry about this for now? If you have a strong opinion I
can move them though.

Thanks
Dave
Re: [PATCH] USB: core: Add warm reset while reset-resuming SuperSpeed HUBs
>> ...although, the spec says that it does not wait for the port resets
>> to complete. As far as I can see re-issuing a warm reset and waiting
>> is the only way to guarantee the core times the recovery. Presumably
>> the portstatus debounce in hub_activate() mitigates this, but that
>> 100ms is less than a full reset timeout.

It's definitely not just a timing issue for us. I can't reproduce all
the same cases as Vikas, but when I attach a USB analyzer to the ones
I do see the host controller doesn't even start sending a reset.

>>> The xHCI spec requires that when the xHCI host is reset, a USB reset is
>>> driven down the USB 3.0 ports. If hot reset fails, the port may migrate
>>> to warm reset. See table 32 in the xHCI spec, in the definition of
>>> HCRST. It sounds like this host doesn't drive a USB reset down USB 3.0
>>> ports at all on host controller reset?

Oh, interesting, I hadn't seen that yet. So I guess the spec itself is
fine if it were followed to the letter.

I did some more tests about this on my Exynos machine: when I put a
device to autosuspend (U3) and manually poke the xHC reset bit, I do
see an automatic warm reset on the analyzer and the ports manage to
retrain to U0. But after a system suspend/resume which calls
xhci_reset() in the process, there is no reset on the wire. I also
noticed that it doesn't drive a reset (even after manual poking) when
there is no device connected on the other end of the analyzer.

So this might be our problem: maybe these host controllers (Synopsys
DesignWare) issue the spec-mandated warm reset only on ports where
they think there is a device attached. But after a system
suspend/resume (where the whole IP block on the SoC was powered down),
the host controller cannot know that there is still a device with an
active power session attached, and therefore doesn't drive the reset
on its own.

Even though this is a host controller bug, we still have to deal with
it somehow. I guess we could move the code into xhci_plat_resume() and
hide it behind a quirk to lessen the impact. But since reset_resume is
not a common case for most host controllers, it's hard to say if this
is DesignWare specific or a more widespread implementation mistake.
Re: [PATCH] MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers
Hi Bjorn,

On Wed, Dec 11, 2013 at 11:32:37AM -0700, Bjorn Helgaas wrote:
> +PCI DRIVER FOR IMX6
> +M:	Shawn Guo

Thanks for the nomination. But I think a better person for this
position would be Richard Zhu (copied). He knows the driver and
controller much better than myself, and most importantly he is the
driver owner for Freescale kernel and he has the contact to Freescale
PCIe hardware people.

Shawn

> +L:	linux-...@vger.kernel.org
> +L:	linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
> +S:	Maintained
> +F:	drivers/pci/host/*imx6*
Re: [PATCH v2 0/5] net: macb updates
Hi Soren,

On Tue, Dec 10, 2013 at 4:07 PM, Soren Brinkmann wrote:

> Soren Brinkmann (5):
>   net: macb: Adjust tx_clk when link speed changes

This patch causes build issues on some at91 platforms, namely
at91sam9263 that lacks programmable clocks. So it doesn't implement
clk_set_rate() and clk_round_rate().

I don't know if there's any reasonable config option to check for
(that wouldn't add at91-specific stuff to the driver which we don't
want). So I suspect the best way would be to implement dummy versions
for at91 when CONFIG_AT91_PROGRAMMABLE_CLOCKS isn't set.

Nicolas, you OK with that? It'd be something like the below
(copy-paste, whitespace damage, just RFC):

diff --git a/arch/arm/mach-at91/clock.c b/arch/arm/mach-at91/clock.c
index 6b2630a..17c52a7 100644
--- a/arch/arm/mach-at91/clock.c
+++ b/arch/arm/mach-at91/clock.c
@@ -459,6 +459,22 @@ static void __init init_programmable_clock(struct clk *clk)
 	clk->rate_hz = parent->rate_hz / pmc_prescaler_divider(pckr);
 }
 
+#else /* CONFIG_AT91_PROGRAMMABLE_CLOCKS */
+
+int clk_set_rate(struct clk *clk, unsigned long rate)
+{
+	if (rate == clk_get_rate(clk))
+		return 0;
+
+	return -EINVAL;
+}
+
+long clk_round_rate(struct clk *clk, unsigned long rate)
+{
+	/* There's really nothing sane to return here. */
+	return clk_get_rate(clk);
+}
+
 #endif /* CONFIG_AT91_PROGRAMMABLE_CLOCKS */
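The behaviour of the proposed dummy functions can be tried outside the kernel with a small sketch. The struct fixed_clk type and the fixed_clk_* names are hypothetical stand-ins for the platform's struct clk and clock API; only the return-value logic follows the RFC diff above.

```c
#include <errno.h>

/* Hypothetical minimal clock object; the real struct clk is platform-specific. */
struct fixed_clk {
	unsigned long rate_hz;
};

unsigned long fixed_clk_get_rate(const struct fixed_clk *clk)
{
	return clk->rate_hz;
}

/* A fixed clock can only "be set" to the rate it already runs at. */
int fixed_clk_set_rate(struct fixed_clk *clk, unsigned long rate)
{
	if (rate == fixed_clk_get_rate(clk))
		return 0;

	return -EINVAL;
}

/* Rounding any request yields the one rate the clock supports. */
long fixed_clk_round_rate(const struct fixed_clk *clk, unsigned long rate)
{
	(void)rate;	/* the requested rate cannot influence the result */
	return (long)fixed_clk_get_rate(clk);
}
```

A driver that rounds first and then sets the rounded rate would get a successful no-op from such stubs, while a request for any other rate is rejected with -EINVAL.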
Re: [PATCH v7 4/4] sched/numa: fix period_slot recalculation
On Thu, 12 Dec 2013, Wanpeng Li wrote:

> Changelog:
>  v3 -> v4:
>   * remove period_slot recalculation
>
> The original code is as intended and was meant to scale the difference
> between the NUMA_PERIOD_THRESHOLD and local/remote ratio when adjusting
> the scan period. The period_slot recalculation can be dropped.
>
> Reviewed-by: Naoya Horiguchi
> Acked-by: Mel Gorman
> Signed-off-by: Wanpeng Li

Acked-by: David Rientjes
Linux 3.12.5
I'm announcing the release of the 3.12.5 kernel. All users of the 3.12 kernel series must upgrade. The updated 3.12.y git tree can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.12.y and can be browsed at the normal kernel.org git web browser: http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary thanks, greg k-h Makefile|2 arch/arm/boot/dts/armada-370-db.dts | 28 +- arch/arm/boot/dts/armada-370-xp.dtsi|2 arch/arm/boot/dts/armada-xp-mv78230.dtsi| 24 +- arch/arm/boot/dts/armada-xp-mv78260.dtsi| 109 -- arch/arm/boot/dts/omap4-panda-common.dtsi | 20 - arch/arm/configs/multi_v7_defconfig |2 arch/arm/include/asm/pgtable.h |2 arch/arm/mach-at91/sama5d3.c|6 arch/arm/mach-footbridge/common.c |3 arch/arm/mach-footbridge/dc21285.c |2 arch/arm/mach-footbridge/ebsa285.c | 22 +- arch/arm/mm/mmap.c |2 arch/arm/mm/pgd.c |3 arch/parisc/kernel/sys_parisc.c | 25 +- arch/s390/crypto/aes_s390.c | 31 +-- arch/x86/Makefile |8 block/blk-cgroup.h |8 crypto/algif_hash.c |3 crypto/algif_skcipher.c |3 crypto/authenc.c|7 crypto/ccm.c|3 drivers/ata/libata-scsi.c |1 drivers/char/i8k.c |7 drivers/cpuidle/cpuidle.c |2 drivers/firewire/sbp2.c |1 drivers/firmware/efi/efi-pstore.c | 163 ++-- drivers/firmware/efi/efivars.c | 12 - drivers/firmware/efi/vars.c | 12 - drivers/gpio/gpio-mpc8xxx.c |8 drivers/input/Kconfig |2 drivers/input/keyboard/Kconfig |4 drivers/input/serio/Kconfig |6 drivers/misc/enclosure.c|7 drivers/misc/mei/hw-me-regs.h |6 drivers/misc/mei/pci-me.c |5 drivers/net/can/c_can/c_can.c | 21 +- drivers/net/can/flexcan.c |2 drivers/net/can/sja1000/sja1000.c | 17 - drivers/net/ethernet/broadcom/tg3.c | 12 - drivers/net/wireless/iwlwifi/dvm/tx.c | 14 - drivers/pnp/driver.c| 12 + drivers/scsi/3w-9xxx.c |3 drivers/scsi/3w-sas.c |3 drivers/scsi/3w-.c |3 drivers/scsi/aacraid/linit.c|1 drivers/scsi/arcmsr/arcmsr_hba.c|1 drivers/scsi/bfa/bfa_fcs.h |1 drivers/scsi/bfa/bfa_fcs_lport.c| 14 + drivers/scsi/bfa/bfad_attr.c|7 
drivers/scsi/gdth.c |1 drivers/scsi/hosts.c|1 drivers/scsi/hpsa.c |5 drivers/scsi/ipr.c |3 drivers/scsi/ips.c |1 drivers/scsi/libsas/sas_ata.c |2 drivers/scsi/megaraid.c |1 drivers/scsi/megaraid/megaraid_mbox.c |1 drivers/scsi/megaraid/megaraid_sas_base.c |1 drivers/scsi/pmcraid.c |1 drivers/scsi/sd.c |6 drivers/scsi/storvsc_drv.c |1 drivers/spi/spi-pxa2xx.c|2 drivers/tty/n_tty.c |6 drivers/usb/class/cdc-acm.c |2 drivers/usb/serial/ftdi_sio.c | 37 ++- drivers/usb/serial/mos7840.c| 32 +-- drivers/usb/serial/pl2303.c | 30 +- drivers/usb/serial/spcp8x5.c| 30 +-
Re: Linux 3.4.74
diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801 index 99d4e442b77d..8bb57d7c12ea 100644 --- a/Documentation/i2c/busses/i2c-i801 +++ b/Documentation/i2c/busses/i2c-i801 @@ -22,6 +22,7 @@ Supported adapters: * Intel Panther Point (PCH) * Intel Lynx Point (PCH) * Intel Lynx Point-LP (PCH) + * Intel Avoton (SOC) Datasheets: Publicly available at the Intel website On Intel Patsburg and later chipsets, both the normal host SMBus controller diff --git a/Makefile b/Makefile index 2ea579016292..ce277ff0fd72 100644 --- a/Makefile +++ b/Makefile @@ -1,6 +1,6 @@ VERSION = 3 PATCHLEVEL = 4 -SUBLEVEL = 73 +SUBLEVEL = 74 EXTRAVERSION = NAME = Saber-toothed Squirrel diff --git a/arch/um/os-Linux/start_up.c b/arch/um/os-Linux/start_up.c index 425162e22af5..2f53b892fd80 100644 --- a/arch/um/os-Linux/start_up.c +++ b/arch/um/os-Linux/start_up.c @@ -15,6 +15,8 @@ #include #include #include +#include +#include #include #include "init.h" #include "os.h" diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c index 850246206b12..585c3b279feb 100644 --- a/crypto/algif_hash.c +++ b/crypto/algif_hash.c @@ -117,6 +117,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page, if (flags & MSG_SENDPAGE_NOTLAST) flags |= MSG_MORE; + if (flags & MSG_SENDPAGE_NOTLAST) + flags |= MSG_MORE; + lock_sock(sk); sg_init_table(ctx->sgl.sg, 1); sg_set_page(ctx->sgl.sg, page, size, offset); diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index a19c027b29bd..918a3b4148b8 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -381,6 +381,9 @@ static ssize_t skcipher_sendpage(struct socket *sock, struct page *page, if (flags & MSG_SENDPAGE_NOTLAST) flags |= MSG_MORE; + if (flags & MSG_SENDPAGE_NOTLAST) + flags |= MSG_MORE; + lock_sock(sk); if (!ctx->more && ctx->used) goto unlock; diff --git a/crypto/authenc.c b/crypto/authenc.c index 5ef7ba6b6a76..d21da2f0f508 100644 --- a/crypto/authenc.c +++ b/crypto/authenc.c @@ -368,9 +368,10 
@@ static void crypto_authenc_encrypt_done(struct crypto_async_request *req, if (!err) { struct crypto_aead *authenc = crypto_aead_reqtfm(areq); struct crypto_authenc_ctx *ctx = crypto_aead_ctx(authenc); - struct ablkcipher_request *abreq = aead_request_ctx(areq); - u8 *iv = (u8 *)(abreq + 1) + -crypto_ablkcipher_reqsize(ctx->enc); + struct authenc_request_ctx *areq_ctx = aead_request_ctx(areq); + struct ablkcipher_request *abreq = (void *)(areq_ctx->tail + + ctx->reqoff); + u8 *iv = (u8 *)abreq - crypto_ablkcipher_ivsize(ctx->enc); err = crypto_authenc_genicv(areq, iv, 0); } diff --git a/crypto/ccm.c b/crypto/ccm.c index 32fe1bb5decb..18d64ad0433c 100644 --- a/crypto/ccm.c +++ b/crypto/ccm.c @@ -271,7 +271,8 @@ static int crypto_ccm_auth(struct aead_request *req, struct scatterlist *plain, } /* compute plaintext into mac */ - get_data_to_compute(cipher, pctx, plain, cryptlen); + if (cryptlen) + get_data_to_compute(cipher, pctx, plain, cryptlen); out: return err; diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 60662545cd14..c20f1578d393 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -268,6 +268,30 @@ static const struct pci_device_id ahci_pci_tbl[] = { { PCI_VDEVICE(INTEL, 0x8c07), board_ahci }, /* Lynx Point RAID */ { PCI_VDEVICE(INTEL, 0x8c0e), board_ahci }, /* Lynx Point RAID */ { PCI_VDEVICE(INTEL, 0x8c0f), board_ahci }, /* Lynx Point RAID */ + { PCI_VDEVICE(INTEL, 0x9c02), board_ahci }, /* Lynx Point-LP AHCI */ + { PCI_VDEVICE(INTEL, 0x9c03), board_ahci }, /* Lynx Point-LP AHCI */ + { PCI_VDEVICE(INTEL, 0x9c04), board_ahci }, /* Lynx Point-LP RAID */ + { PCI_VDEVICE(INTEL, 0x9c05), board_ahci }, /* Lynx Point-LP RAID */ + { PCI_VDEVICE(INTEL, 0x9c06), board_ahci }, /* Lynx Point-LP RAID */ + { PCI_VDEVICE(INTEL, 0x9c07), board_ahci }, /* Lynx Point-LP RAID */ + { PCI_VDEVICE(INTEL, 0x9c0e), board_ahci }, /* Lynx Point-LP RAID */ + { PCI_VDEVICE(INTEL, 0x9c0f), board_ahci }, /* Lynx Point-LP RAID */ + { PCI_VDEVICE(INTEL, 0x1f22), 
board_ahci }, /* Avoton AHCI */ + { PCI_VDEVICE(INTEL, 0x1f23), board_ahci }, /* Avoton AHCI */ + { PCI_VDEVICE(INTEL, 0x1f24), board_ahci }, /* Avoton RAID */ + { PCI_VDEVICE(INTEL, 0x1f25), board_ahci }, /* Avoton RAID */ + { PCI_VDEVICE(INTEL, 0x1f26), board_ahci }, /* Avoton RAID */ + { PCI_VDEVICE(INTEL, 0x1f27), board_ahci }, /* Avoton RAID */ + { PCI_VDEVICE(INTEL, 0x1f2e), board_ahci }, /* Avoton RAID */ + {
Linux 3.10.24
I'm announcing the release of the 3.10.24 kernel. All users of the 3.10 kernel series must upgrade. The updated 3.10.y git tree can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.10.y and can be browsed at the normal kernel.org git web browser: http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary thanks, greg k-h Makefile |2 arch/arm/boot/dts/armada-370-xp.dtsi |2 arch/arm/boot/dts/armada-xp-mv78230.dtsi | 16 +++--- arch/arm/boot/dts/armada-xp-mv78260.dtsi | 78 -- arch/arm/include/asm/pgtable.h|2 arch/arm/mach-at91/sama5d3.c |6 +- arch/arm/mach-footbridge/common.c |3 + arch/arm/mach-footbridge/dc21285.c|2 arch/arm/mach-footbridge/ebsa285.c| 22 +--- arch/arm/mm/mmap.c|2 arch/arm/mm/pgd.c |3 - arch/parisc/kernel/sys_parisc.c | 25 + arch/s390/crypto/aes_s390.c | 31 ++- arch/x86/Makefile |8 ++- block/blk-cgroup.h|8 +-- crypto/algif_hash.c |3 + crypto/algif_skcipher.c |3 + crypto/authenc.c |7 +- crypto/ccm.c |3 - drivers/ata/libata-scsi.c |1 drivers/char/i8k.c|7 ++ drivers/firewire/sbp2.c |1 drivers/gpio/gpio-mpc8xxx.c |8 ++- drivers/hid/hid-ids.h |5 + drivers/hid/usbhid/hid-quirks.c |3 + drivers/input/Kconfig |2 drivers/input/keyboard/Kconfig|4 - drivers/input/serio/Kconfig |6 +- drivers/misc/enclosure.c |7 ++ drivers/misc/mei/hw-me-regs.h |6 +- drivers/misc/mei/pci-me.c |5 + drivers/net/can/c_can/c_can.c | 21 +--- drivers/net/can/sja1000/sja1000.c | 17 +++--- drivers/net/ethernet/broadcom/tg3.c | 12 ++-- drivers/net/ethernet/smsc/smc91x.h| 22 +--- drivers/net/wireless/iwlwifi/dvm/tx.c | 14 + drivers/scsi/3w-9xxx.c|3 - drivers/scsi/3w-sas.c |3 - drivers/scsi/3w-.c|3 - drivers/scsi/aacraid/linit.c |1 drivers/scsi/arcmsr/arcmsr_hba.c |1 drivers/scsi/bfa/bfa_fcs.h|1 drivers/scsi/bfa/bfa_fcs_lport.c | 14 - drivers/scsi/bfa/bfad_attr.c |7 -- drivers/scsi/gdth.c |1 drivers/scsi/hosts.c |1 drivers/scsi/hpsa.c |5 + drivers/scsi/ipr.c|3 - drivers/scsi/ips.c|1 drivers/scsi/libsas/sas_ata.c |2 
drivers/scsi/megaraid.c |1 drivers/scsi/megaraid/megaraid_mbox.c |1 drivers/scsi/megaraid/megaraid_sas_base.c |1 drivers/scsi/pmcraid.c|1 drivers/scsi/sd.c |6 ++ drivers/scsi/storvsc_drv.c|1 drivers/usb/class/cdc-acm.c |2 drivers/usb/serial/ftdi_sio.c | 37 +- drivers/usb/serial/mos7840.c | 32 ++-- drivers/usb/serial/pl2303.c | 32 +--- drivers/usb/serial/spcp8x5.c | 30 +-- drivers/xen/grant-table.c |6 +- fs/nfs/nfs4proc.c | 10 +++ fs/pipe.c | 39 +++ include/crypto/scatterwalk.h |3 - include/linux/genalloc.h |4 - include/scsi/scsi_host.h |6 ++ kernel/irq/pm.c |2 kernel/time/timekeeping.c |2 lib/genalloc.c| 19 --- net/ipv4/udp.c|3 + sound/pci/hda/patch_realtek.c | 55 ++--- sound/soc/codecs/wm8731.c |4 - sound/soc/codecs/wm8990.c |2 74 files changed, 456 insertions(+), 256 deletions(-) AceLan Kao (2): HID: usbhid: quirk for Synaptics Large Touchccreen HID: usbhid: quirk for SiS Touchscreen Alan Cox (1): drivers/char/i8k.c: add Dell XPLS L421X Arnaud Ebalard (2): ARM: mvebu: fix second and third PCIe unit of Armada XP mv78260 ARM: mvebu: second PCIe unit of Armada XP mv78230 is only x1 capable Bo Shen (1): ASoC: wm8731: fix dsp mode configuration Colin Leitner (4): USB: pl2303: fixed handling of CS5 setting USB: ftdi_sio: fixed
Linux 3.4.74
I'm announcing the release of the 3.4.74 kernel.

All users of the 3.4 kernel series must upgrade.

The updated 3.4.y git tree can be found at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.4.y
and can be browsed at the normal kernel.org git web browser:
	http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h

 Documentation/i2c/busses/i2c-i801  |  1 +
 Makefile                           |  2 +-
 arch/um/os-Linux/start_up.c        |  2 ++
 crypto/algif_hash.c                |  3 +++
 crypto/algif_skcipher.c            |  3 +++
 crypto/authenc.c                   |  7 ---
 crypto/ccm.c                       |  3 ++-
 drivers/ata/ahci.c                 | 24
 drivers/char/i8k.c                 |  7 +++
 drivers/gpio/gpio-mpc8xxx.c        |  8 ++--
 drivers/i2c/busses/Kconfig         |  1 +
 drivers/i2c/busses/i2c-i801.c      |  3 +++
 drivers/input/Kconfig              |  2 +-
 drivers/input/keyboard/Kconfig     |  4 ++--
 drivers/input/serio/Kconfig        |  6 +++---
 drivers/misc/enclosure.c           |  7 +++
 drivers/net/ethernet/smsc/smc91x.h | 22 --
 drivers/scsi/hpsa.c                |  4 ++--
 drivers/scsi/libsas/sas_ata.c      |  2 +-
 drivers/usb/class/cdc-acm.c        |  2 ++
 drivers/usb/serial/mos7840.c       | 32
 drivers/usb/serial/pl2303.c        | 32 +++-
 drivers/usb/serial/spcp8x5.c       | 30 ++
 fs/nfs/nfs4proc.c                  | 10 --
 include/crypto/scatterwalk.h       |  3 ++-
 kernel/irq/pm.c                    |  2 +-
 net/ipv4/udp.c                     |  3 +++
 sound/soc/codecs/wm8731.c          |  4 ++--
 sound/soc/codecs/wm8990.c          |  2 ++
 29 files changed, 142 insertions(+), 89 deletions(-)

Alan Cox (1):
      drivers/char/i8k.c: add Dell XPLS L421X

Bo Shen (1):
      ASoC: wm8731: fix dsp mode configuration

Colin Leitner (3):
      USB: pl2303: fixed handling of CS5 setting
      USB: mos7840: correct handling of CS5 setting
      USB: spcp8x5: correct handling of CS5 setting

Dan Williams (1):
      SCSI: libsas: fix usage of ata_tf_to_fis

David Cluytens (1):
      USB: cdc-acm: Added support for the Lenovo RD02-D400 USB Modem

Greg Kroah-Hartman (1):
      Linux 3.4.74

Horia Geanta (1):
      crypto: ccm - Fix handling of zero plaintext when computing mac

James Bottomley (1):
      SCSI: enclosure: fix WARN_ON in dual path device removing

James Ralston (1):
      ahci: Add Device IDs for Intel Lynx Point-LP PCH

Laxman Dewangan (1):
      irq: Enable all irqs unconditionally in irq_resume

Linus Walleij (1):
      net: smc91: fix crash regression on the versatile

Liu Gang (1):
      powerpc/gpio: Fix the wrong GPIO input data on MPC8572/MPC8536

Mark Brown (1):
      ASoC: wm8990: Mark the register map as dirty when powering down

Sergei Trofimovich (1):
      um: add missing declaration of 'getrlimit()' and friends

Seth Heasley (2):
      ahci: AHCI-mode SATA patch for Intel Avoton DeviceIDs
      i2c: i801: SMBus patch for Intel Avoton DeviceIDs

Shawn Landden (1):
      net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST

Stephen M. Cameron (2):
      SCSI: hpsa: do not discard scsi status on aborted commands
      SCSI: hpsa: return 0 from driver probe function on success, not 1

Tom Gundersen (2):
      Input: allow deselecting serio drivers even without CONFIG_EXPERT
      Input: mousedev - allow disabling even without CONFIG_EXPERT

Tom Lendacky (3):
      crypto: scatterwalk - Set the chain pointer indication bit
      crypto: authenc - Find proper IV address in ablkcipher callback
      crypto: scatterwalk - Use sg_chain_ptr on chain entries

Trond Myklebust (1):
      NFSv4: Update list of irrecoverable errors on DELEGRETURN
Re: [PATCH v7 3/4] sched/numa: use wrapper function task_faults_idx to calculate index in group_faults
On Thu, 12 Dec 2013, Wanpeng Li wrote:

> Use wrapper function task_faults_idx to calculate index in group_faults.
>
> Reviewed-by: Naoya Horiguchi
> Acked-by: Mel Gorman
> Signed-off-by: Wanpeng Li

Acked-by: David Rientjes

The naming of task_faults_idx() is a little unfortunate since it is now used
to index into both task_faults() and group_faults(), though.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
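For readers without the patch at hand, here is a rough user-space sketch of the kind of index helper being discussed. It assumes (this is not taken from the patch) that two fault counters, shared and private, are kept per node in one flat array; faults_idx(), faults_sum() and the enum names are hypothetical and deliberately neutral, since the same layout can back both per-task and per-group counters.

```c
#include <stddef.h>

/* Assumption: two fault types are tracked per node (shared and private). */
enum { FAULT_SHARED = 0, FAULT_PRIVATE = 1, FAULT_TYPES = 2 };

/* Flatten (node, type) into one index into a per-node, per-type array. */
static inline size_t faults_idx(size_t nid, int priv)
{
	return FAULT_TYPES * nid + (size_t)priv;
}

/* The same helper works for any counter array that shares this layout. */
static unsigned long faults_sum(const unsigned long *faults, size_t nid)
{
	return faults[faults_idx(nid, FAULT_SHARED)] +
	       faults[faults_idx(nid, FAULT_PRIVATE)];
}
```

With a name that does not mention "task", using the helper for the group array reads naturally, which is the point of the naming remark above.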
Re: [PATCH v7 2/4] sched/numa: use wrapper function task_node to get node which task is on
On Thu, 12 Dec 2013, Wanpeng Li wrote:

> Changelog:
>  v2 -> v3:
>   * translate cpu_to_node(task_cpu(p)) to task_node(p) in sched/debug.c
>
> Use wrapper function task_node to get node which task is on.
>
> Acked-by: Mel Gorman
> Reviewed-by: Naoya Horiguchi
> Reviewed-by: Rik van Riel
> Signed-off-by: Wanpeng Li

Acked-by: David Rientjes
Re: [PATCH] hfsplus: Remove hfsplus_file_lookup
On Wed, 2013-12-11 at 21:08 +, Anton Altaparmakov wrote:
> Hi,
>
> On 11 Dec 2013, at 19:11, Al Viro wrote:
> > On Wed, Dec 11, 2013 at 10:49:29PM +0300, Vyacheslav Dubeyko wrote:
> >> This feature worked earlier under Linux. So, I suppose that some changes
> >> in HFS+ driver or in VFS broke it. And it needs to investigate and fix
> >> the reported issue. Thank you for the report.
> >
> > This "feature" is severely broken and yes, outright removal is what I'd
> > suggest for a fix. HFS+ allows hardlinks to files, which means that
> > you allow multiple dentries for the same inode with ->lookup() in it,
> > which is asking for deadlocks.
> >
> > This is fundamentally not supported. Considering that forks are lousy
> > idea in the first place, I'd seriously suggest to remove that idiocy for
> > good.
>
> Completely agree with Al. If anyone really wants access to forks they can
> implement them via the xattr interface (ok it has the 64k limitation but most
> forks are quite small so not much of an issue). That's how I implemented
> access to named streams in Tuxera NTFS and it works a treat (and allows Linux
> apps and various security modules that require xattr support to work properly
> which is also great).
>

Yes, after sleeping on it I have come to the same view about using xattrs
for the resource fork.

Usually, a file under HFS+ has either a valid data fork or a valid resource
fork. For example, an HFS+ compressed file has only a valid resource fork,
and so does an alias under Mac OS X. Of course, a regular file can have both
a valid data fork and a valid resource fork, but fortunately that case is
rare now. So we can use xattrs to access the resource fork of such files.
For example, the xattr could be named "osx.ResourceFork". I also think that
64 KB is a reasonable limit. We already expose the FinderInfo fields of a
file's Catalog File record under HFS+ through the "com.apple.FinderInfo"
xattr.

I think I can implement resource fork support through xattrs. I am also
currently implementing support for HFS+ compressed files, so I can clean up
the old-fashioned resource fork support in the HFS+ driver, since it has to
be reworked anyway.

From my viewpoint, the suggested patch doesn't do all of the necessary
cleanup.

Any comments?

Thanks,
Vyacheslav Dubeyko.
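To make the xattr idea concrete, here is a minimal user-space sketch of how an application might read a fork exposed as an extended attribute. It only relies on getxattr(2) from <sys/xattr.h>; the helper names and the 64 KiB constant are illustrative assumptions, and the attribute name is passed in by the caller because the exact name and namespace prefix depend on what the driver ends up exposing ("osx.ResourceFork" above is only a proposal).

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Assumption: the 64k limitation mentioned in the thread, in bytes. */
#define FORK_XATTR_MAX ((size_t)64 * 1024)

/* Would a fork of this size fit into a single xattr value? */
static bool fork_fits_xattr(size_t fork_size)
{
	return fork_size <= FORK_XATTR_MAX;
}

/*
 * Read the attribute `name` of `path` into buf (at most len bytes).
 * Returns the number of bytes read, or -1 on error with errno set
 * by getxattr().
 */
static ssize_t read_fork_xattr(const char *path, const char *name,
			       void *buf, size_t len)
{
	return getxattr(path, name, buf, len);
}
```

A caller would allocate a buffer of up to FORK_XATTR_MAX bytes and pass whatever attribute name the filesystem documents; forks larger than the limit would simply not be reachable this way.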
Re: [PATCH v7 1/4] sched/numa: drop sysctl_numa_balancing_settle_count sysctl
On Thu, 12 Dec 2013, Wanpeng Li wrote:

> commit 887c290e (sched/numa: Decide whether to favour task or group weights
> based on swap candidate relationships) drop the check against
> sysctl_numa_balancing_settle_count, this patch remove the sysctl.
>

What about the references to it in Documentation/sysctl/kernel.txt?
Re: [PATCH 2/3] ARM: sunxi: Add an ahci-platform compatible AHCI driver for the Allwinner SUNXi series of SoCs
On Wed, Dec 11, 2013 at 03:51:51PM +0100, Olliver Schinagl wrote:
> Working on this and studying the existing
> ahci_platform/shci_platform drivers the last few days and was
> figuring out why ahci_platform only supports 1 clock. IMX handles
> this by having 3 clocks defined in the DT, the first one gets
> enabled by default via ahci_platform, the other 2 get enabled in
> IMX's probe function.
>
> Is it an idea to extend this to support all clocks that would be
> required (via a callback)?

Not really. We did this for the ahci_imx driver only because we did not want
to clutter the generic ahci_platform driver with imx-specific setup code.
Note that, besides the two additional clocks, we also have some PHY
parameters to set up in the IMX IOMUXC general purpose registers, and the
vendor-specific register HOST_TIMER1MS has to be set up as well.

> Or do we prefer having the clocks
> separated for other technical reasons? Or do we want to handle the
> clocks via the ahci_platform framework and extend hpriv->clk to an
> array of clocks?

The direction for the generic ahci platform driver is to turn it into a
library providing helper functions, as discussed here:

https://lkml.org/lkml/2013/12/6/153

We can ask the helper function to handle the common clocks and leave the
platform-specific ones to the platform driver.

Shawn
Re: [PATCH V7 2/2] arm64: perf: add support for percpu pmu interrupt
Hi Will, On Tue, Dec 10, 2013 at 1:00 PM, Vinayak Kale wrote: > Hi Will, > > > On Mon, Dec 9, 2013 at 10:20 PM, Will Deacon wrote: >> Hi Vinayak, >> >> On Wed, Dec 04, 2013 at 10:09:51AM +, Vinayak Kale wrote: >>> Add support for irq registration when pmu interrupt is percpu. >> >> Getting closer... >> >>> Signed-off-by: Vinayak Kale >>> Signed-off-by: Tuan Phan >>> --- >>> arch/arm64/kernel/perf_event.c | 108 >>> +--- >>> 1 file changed, 78 insertions(+), 30 deletions(-) >>> >>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c >>> index cea1594..d8e6667 100644 >>> --- a/arch/arm64/kernel/perf_event.c >>> +++ b/arch/arm64/kernel/perf_event.c >>> @@ -22,6 +22,7 @@ >>> >>> #include >>> #include >>> +#include >>> #include >>> #include >>> #include >>> @@ -363,26 +364,52 @@ validate_group(struct perf_event *event) >>> } >>> >>> static void >>> +armpmu_disable_percpu_irq(void *data) >>> +{ >>> + disable_percpu_irq((long)data); >>> +} >> >> Given that we wait for the CPUs to finish enabling/disabling the IRQ, I >> actually meant pass the pointer to the IRQ, which removes the horrible >> casts in the caller. >> >>> + if (irq_is_percpu(irq)) { >>> + cpumask_clear(>active_irqs); >> >> Thanks for moving the mask manipulation out. It now makes it obvious that we >> don't care about the mask at all for PPIs, so that can be removed (the code >> you have is racy against hotplug anyway). >> >> I took the liberty of writing a fixup for you (see below). Can you test it >> on your platform please? > > Below fixup works fine on APM platform. > Do you want me to send this fixup as part of next revision of the > patch or will you apply it yourself? (For later case, you have my ack) Any comments? Do I need to send the fix-up in next revision of patch? 
Thanks
-Vinayak
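As an aside on the review comment about casts: the difference between smuggling the IRQ number through the void * argument and passing a pointer to it can be shown with a small user-space sketch. Everything here is a stand-in; run_on_each_cpu() and fake_disable_percpu_irq() are hypothetical and only mimic a cross-CPU call helper that returns after every callback has finished.

```c
/* Hypothetical stand-in for a cross-CPU call: runs fn(data) once per CPU
 * and returns only after all callbacks have completed. */
static void run_on_each_cpu(void (*fn)(void *), void *data, int ncpus)
{
	for (int cpu = 0; cpu < ncpus; cpu++)
		fn(data);
}

static int disabled_irq_log[8];
static int disabled_count;

/* Stand-in for disabling a per-CPU interrupt; it just records the number. */
static void fake_disable_percpu_irq(int irq)
{
	if (disabled_count < 8)
		disabled_irq_log[disabled_count++] = irq;
}

/* The callback receives a pointer to the IRQ number, so no
 * integer-to-pointer or pointer-to-integer cast is needed. */
static void disable_percpu_irq_cb(void *data)
{
	int *irq = data;

	fake_disable_percpu_irq(*irq);
}

static void disable_everywhere(int irq, int ncpus)
{
	/* Passing &irq is safe only because the helper waits for all
	 * callbacks, so the local variable outlives every use. */
	run_on_each_cpu(disable_percpu_irq_cb, &irq, ncpus);
}
```

The design point is the lifetime argument in the last comment: a pointer to a stack variable is fine exactly because the caller waits for the other CPUs.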
Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data
> > > + */
> > > +static int __init map_regions_fixed(void)
> > > +{
> > > +	int i, s, ret = 0;
> > > +	u64 end, systab;
> > > +	unsigned long size;
> > > +	efi_memory_desc_t *md;
> > > +	struct efi_setup_data *data;
> > > +
> > > +	s = sizeof(*data) + nr_efi_runtime_map * sizeof(data->map[0]);
> > > +	data = early_memremap(efi_setup, s);
> > > +	if (!data) {
> > > +		ret = -ENOMEM;
> > > +		goto out;
> > > +	}
> >
> > newline.
>
> Will remove

I misread the comment; there is no newline here. It looks like you want a
new blank line here, OK.

> > >
> > > +	for (i = 0, md = data->map; i < nr_efi_runtime_map; i++, md++) {
> > > +		efi_map_region_fixed(md); /* FIXME: add error handling */
> > > +		size = md->num_pages << PAGE_SHIFT;
> > > +		end = md->phys_addr + size;
> > > +
> > > +		systab = (u64) (unsigned long) efi_phys.systab;
> > > +		if (md->phys_addr <= systab && systab < end) {
> > > +			systab += md->virt_addr - md->phys_addr;
> > > +			efi.systab = (efi_system_table_t *)(unsigned long)systab;
> > > +		}
> > > +		ret = save_runtime_map(md, i);
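The address arithmetic in the quoted loop (check that the system table's physical address falls inside a region, then shift it by the region's virtual/physical offset) can be tried out in isolation. The sketch below is user-space C with a hypothetical, trimmed-down descriptor; struct mem_desc, phys_to_virt_in_desc() and the 4 KiB page assumption are illustrative and not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, trimmed-down memory descriptor. */
struct mem_desc {
	uint64_t phys_addr;	/* start of the region, physical */
	uint64_t virt_addr;	/* where the region is mapped */
	uint64_t num_pages;	/* region length in pages */
};

/* If phys lies inside md, store its virtual address in *virt and
 * return true; otherwise leave *virt untouched and return false. */
static bool phys_to_virt_in_desc(const struct mem_desc *md, uint64_t phys,
				 uint64_t *virt)
{
	uint64_t size = md->num_pages << SKETCH_PAGE_SHIFT;
	uint64_t end = md->phys_addr + size;

	if (phys < md->phys_addr || phys >= end)
		return false;

	/* Unsigned wrap-around keeps this correct even when the virtual
	 * base is numerically below the physical base. */
	*virt = phys + (md->virt_addr - md->phys_addr);
	return true;
}
```

For a two-page region at physical 0x2000 mapped at 0x9000, the address 0x2010 translates to 0x9010, while 0x4000 (one past the end) is rejected.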
Re: [PATCH V2 0/6] Memory compaction efficiency improvements
On Wed, Dec 11, 2013 at 11:24:31AM +0100, Vlastimil Babka wrote: > Changelog since V1 (thanks to the reviewers!) > o Included "trace compaction being and end" patch in the series (mgorman) > o Changed variable names and comments in patches 2 and 5(mgorman) > o More thorough measurements, based on v3.13-rc2 > > The broad goal of the series is to improve allocation success rates for huge > pages through memory compaction, while trying not to increase the compaction > overhead. The original objective was to reintroduce capturing of high-order > pages freed by the compaction, before they are split by concurrent activity. > However, several bugs and opportunities for simple improvements were found in > the current implementation, mostly through extra tracepoints (which are > however > too ugly for now to be considered for sending). > > The patches mostly deal with two mechanisms that reduce compaction overhead, > which is caching the progress of migrate and free scanners, and marking > pageblocks where isolation failed to be skipped during further scans. > > Patch 1 (from mgorman) adds tracepoints that allow calculate time spent in > compaction and potentially debug scanner pfn values. > > Patch 2 encapsulates the some functionality for handling deferred compactions > for better maintainability, without a functional change > type is not determined without being actually needed. > > Patch 3 fixes a bug where cached scanner pfn's are sometimes reset only after > they have been read to initialize a compaction run. > > Patch 4 fixes a bug where scanners meeting is sometimes not properly detected > and can lead to multiple compaction attempts quitting early without > doing any work. > > Patch 5 improves the chances of sync compaction to process pageblocks that > async compaction has skipped due to being !MIGRATE_MOVABLE. > > Patch 6 improves the chances of sync direct compaction to actually do anything > when called after async compaction fails during allocation slowpath. 
> > The impact of patches were validated using mmtests's stress-highalloc > benchmark > with mmtests's stress-highalloc benchmark on a x86_64 machine with 4GB memory. > > Due to instability of the results (mostly related to the bugs fixed by patches > 2 and 3), 10 iterations were performed, taking min,mean,max values for success > rates and mean values for time and vmstat-based metrics. > > First, the default GFP_HIGHUSER_MOVABLE allocations were tested with the > patches > stacked on top of v3.13-rc2. Patch 2 is OK to serve as baseline due to no > functional changes in 1 and 2. Comments below. > > stress-highalloc > 3.13-rc2 3.13-rc2 > 3.13-rc2 3.13-rc2 3.13-rc2 > 2-nothp 3-nothp > 4-nothp 5-nothp 6-nothp > Success 1 Min 9.00 ( 0.00%) 10.00 (-11.11%) 43.00 > (-377.78%) 43.00 (-377.78%) 33.00 (-266.67%) > Success 1 Mean27.50 ( 0.00%) 25.30 ( 8.00%) 45.50 > (-65.45%) 45.90 (-66.91%) 46.30 (-68.36%) > Success 1 Max 36.00 ( 0.00%) 36.00 ( 0.00%) 47.00 > (-30.56%) 48.00 (-33.33%) 52.00 (-44.44%) > Success 2 Min 10.00 ( 0.00%)8.00 ( 20.00%) 46.00 > (-360.00%) 45.00 (-350.00%) 35.00 (-250.00%) > Success 2 Mean26.40 ( 0.00%) 23.50 ( 10.98%) 47.30 > (-79.17%) 47.60 (-80.30%) 48.10 (-82.20%) > Success 2 Max 34.00 ( 0.00%) 33.00 ( 2.94%) 48.00 > (-41.18%) 50.00 (-47.06%) 54.00 (-58.82%) > Success 3 Min 65.00 ( 0.00%) 63.00 ( 3.08%) 85.00 > (-30.77%) 84.00 (-29.23%) 85.00 (-30.77%) > Success 3 Mean76.70 ( 0.00%) 70.50 ( 8.08%) 86.20 > (-12.39%) 85.50 (-11.47%) 86.00 (-12.13%) > Success 3 Max 87.00 ( 0.00%) 86.00 ( 1.15%) 88.00 ( > -1.15%) 87.00 ( 0.00%) 87.00 ( 0.00%) > > 3.13-rc23.13-rc23.13-rc23.13-rc23.13-rc2 > 2-nothp 3-nothp 4-nothp 5-nothp 6-nothp > User 6437.72 6459.76 5960.32 5974.55 6019.67 > System 1049.65 1049.09 1029.32 1031.47 1032.31 > Elapsed 1856.77 1874.48 1949.97 1994.22 1983.15 > > 3.13-rc23.13-rc23.13-rc23.13-rc2 > 3.13-rc2 >2-nothp 3-nothp 4-nothp 5-nothp > 6-nothp > Minor Faults 253952267 254581900 250030122 250507333 > 250157829 > Major Faults 420 407 
506 530 > 530 > Swap Ins 4 9 9 6 >6 > Swap Outs
linux-next: Tree for Dec 12
Hi all, Changes since 20131211: The powerpc tree still had its build failure for which I applied a supplied patch. The net-next tree gained a conflict against the net tree. The block tree gained a conflict against the f2fs tree. The usb-gadget tree still has its build failure so I used the version from next-20131206. Non-merge commits (relative to Linus' tree): 3588 4076 files changed, 171238 insertions(+), 96653 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" as mentioned in the FAQ on the wiki (see below). You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a multi_v7_defconfig for arm. After the final fixups (if any), it is also built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc, sparc64 and arm defconfig. These builds also have CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and CONFIG_DEBUG_INFO disabled when necessary. Below is a summary of the state of the merge. I am currently merging 209 trees (counting Linus' and 29 trees of patches pending for Linus' tree). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. 
Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. There is a wiki covering stuff to do with linux-next at http://linux.f-seidel.de/linux-next/pmwiki/ . Thanks to Frank Seidel. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (9538e10086bd Merge git://www.linux-watchdog.org/linux-watchdog) Merging fixes/master (8ae516aa8b81 Merge tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace) Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux) Merging arc-current/for-curr (da990a4f2d5a ARC: [perf] Fix a few thinkos) Merging arm-current/fixes (b31459adeab0 ARM: 7917/1: cacheflush: correctly limit range of memory region being flushed) Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated IRQF_DISABLED) Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2) Merging powerpc-merge/merge (e641eb03ab2b powerpc: Fix up the kdump base cap to 128M) Merging sparc/master (1de425c7b271 sparc64: Fix build regression) Merging net/master (9508fdde4d53 Revert "8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature") Merging ipsec/master (239c78db9c41 net: clear local_df when passing skb between namespaces) Merging sound-current/for-linus (3690739b0135 ALSA: hda - Add static DAC/pin mapping for AD1986A codec) Merging pci-current/for-linus (4fc9bbf98fd6 PCI: Disable Bus Master only on kexec reboot) Merging wireless/master (bbf807bc0697 ath9k: fix duration calculation for non-aggregated packets) Merging driver-core.current/driver-core-linus (a8b14744429f sysfs: give different locking key to regular and bin files) Merging tty.current/tty-linus (39434abd942c n_tty: Fix missing newline echo) Merging usb.current/usb-linus (8820784203ac phy: kconfig: add depends on "USB_PHY" to OMAP_USB2 and TWL4030_USB) Merging staging.current/staging-linus (55ef003e4ae6 Merge tag 
'iio-fixes-for-3.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus) Merging char-misc.current/char-misc-linus (76a9635979e5 mei: add 9 series PCH mei device ids) Merging input-current/for-linus (241ecf1ce528 Input: adxl34x - Fix bug in definition of ADXL346_2D_ORIENT) Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe) Merging crypto-current/master (389a5390583a crypto: scatterwalk - Use sg_chain_ptr on chain entries) Merging ide/master (c2f7d1e103ef ide: pmac: remove unnecessary pci_set_drvdata()) Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff) Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions) Merging devicetree-current/devicetree/merge (1931ee143b0a Revert "drivers: of: add initialization code for dma reserved memory") Merging rr-fi
Re: 50 Watt idle power regression bisected to Linux-3.10
On Thu, 2013-12-12 at 06:57 +0100, Mike Galbraith wrote:
> On Wed, 2013-12-11 at 21:45 -0800, H. Peter Anvin wrote:
> > As in it hangs at that point?
>
> Nope, it's still going.
>
> [1567.578340] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1064 MHz, 2266 MHz
>
> Funny, continents move faster :)  Maybe missing a write or two.

When I get back it may be done booting. I'm gonna let it try for grins
while I'm away, then take a peek, see if I can spot it.

[ 1567.578340] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1064 MHz, 2266 MHz
done
Starting HAL daemon done
Setting up (localfs) network interfaces:
    lo
    lo        IP address: 127.0.0.1/8
              IP address: 127.0.0.2/8                  done
    eth0      device: Broadcom Corporation NetXtreme II BCM5709 Gig
              No configuration found for eth0          unused
    eth1      device: Broadcom Corporation NetXtreme II BCM5709 Gig
              No configuration found for eth1          unused
    eth2      device: NetXen Incorporated NX3031 Multifunction 1/10
[ 2457.114007] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready
[ 2457.114455] netxen_nic: eth2 NIC Link is up
[ 2457.223582] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
    eth2      IP address: 0.0.0.0/32
Re: [PATCH -tip v4 0/6] kprobes: introduce NOKPROBE_SYMBOL() and fixes crash bugs
(2013/12/11 22:34), Ingo Molnar wrote:
>
> * Masami Hiramatsu wrote:
>
>>> So why are annotations needed at all? What can happen if an
>>> annotation is missing and a piece of code is probed which is also
>>> used by the kprobes code internally - do we crash, lock up,
>>> misbehave or handle it safely?
>>
>> The kprobe has recursion detector, [...]
>
> It's the 'current_kprobe' percpu variable, checked via
> kprobe_running(), right?

Right. :)

>> [...] but it is detected in the kprobe exception(int3) handler, this
>> means that if we put a probe before detecting the recursion, we'll
>> do an infinite recursion.
>
> So only the (presumably rather narrow) code path leading to the
> recursion detection code has to be annotated, correct?

Yes, correct.

>> And also, even if we can detect the recursion, we can't stop the
>> kernel, we need to skip the probe. This means that we need to
>> recover to the main execution path by doing single step. As you may
>> know, since the single stepping involves the debug exception, we
>> have to avoid probing on that path too. Or we'll have an infinite
>> recursion again.
>
> I don't see why this is needed: if a "probing is disabled" recursion
> flag is set the moment the first probe fires, and if it's only cleared
> once all processing is finished, then any intermediate probes should
> simply return early from int3 and not fire.

No, because the int3 has already replaced the original instruction.
This means that you cannot skip single-stepping (or emulating) the
instruction that was copied to the execution buffer (ainsn->insn), even
if you have such a flag.
So, kprobes requires the annotations on the single-step path as well.

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
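The argument in this exchange can be illustrated with a toy user-space model. This is not kernel code and none of the names are kernel APIs: probe_running stands in for the per-CPU "probe in progress" marker, and single_step_original_insn() stands in for executing the instruction that the breakpoint displaced. The model only shows why an early return on re-entry is not enough on its own: the displaced instruction still has to run, so that path is exercised even for skipped probes.

```c
/* Toy user-space model; none of these names are kernel APIs. */
static int probe_running;	/* stand-in for a per-CPU "probe in progress" marker */
static int hits;		/* probes whose handler actually ran */
static int missed;		/* probes skipped because of re-entry */
static int steps;		/* displaced instructions that were executed */

static void breakpoint_entry(int nested);

/* Executing the displaced original instruction is needed on every hit. */
static void single_step_original_insn(void)
{
	steps++;
}

/* A handler that itself runs into `nested` further probed locations. */
static void user_handler(int nested)
{
	if (nested > 0)
		breakpoint_entry(nested - 1);
}

static void breakpoint_entry(int nested)
{
	if (probe_running) {
		/* Re-entry: skip the handler, but the instruction that the
		 * breakpoint replaced still has to be executed. */
		missed++;
		single_step_original_insn();
		return;
	}

	probe_running = 1;
	hits++;
	user_handler(nested);
	single_step_original_insn();
	probe_running = 0;
}
```

In this model, one probe whose handler trips over a second probe yields one hit, one miss and two single steps. If either the re-entry check or the single-step helper were itself "probed", the model would recurse without bound, which is the case the annotations are meant to rule out.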
Re: [PATCH] cgroup: fix fail path in cgroup_load_subsys()
> @@ -4861,10 +4861,8 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
>  	 */
>  	css = ss->css_alloc(cgroup_css(cgroup_dummy_top, ss));
>  	if (IS_ERR(css)) {
> -		/* failure case - need to deassign the cgroup_subsys[] slot. */
> -		cgroup_subsys[ss->subsys_id] = NULL;
> -		mutex_unlock(&cgroup_mutex);
> -		return PTR_ERR(css);
> +		ret = PTR_ERR(css);
> +		goto out_err;
>  	}
>
>  	list_add(&ss->sibling, &cgroup_dummy_root.subsys_list);
> @@ -4873,6 +4871,10 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
>  	/* our new subsystem will be attached to the dummy hierarchy. */
>  	init_css(css, ss, cgroup_dummy_top);
>
> +	ret = online_css(css);
> +	if (ret)
> +		goto free_css;
> +
>  	/*
>  	 * Now we need to entangle the css into the existing css_sets. unlike
>  	 * in cgroup_init_subsys, there are now multiple css_sets, so each one
> @@ -4896,18 +4898,17 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
>  	}
>  	write_unlock(&css_set_lock);
>
> -	ret = online_css(css);
> -	if (ret)
> -		goto err_unload;
> -

Moving online_css() upwards should be fine.

Acked-by: Li Zefan

>  	/* success! */
>  	mutex_unlock(&cgroup_mutex);
>  	return 0;
>
> -err_unload:
> +free_css:
> +	list_del(&ss->sibling);
> +	ss->css_free(css);
> +out_err:
> +	/* failure case - need to deassign the cgroup_subsys[] slot. */
> +	cgroup_subsys[ss->subsys_id] = NULL;
>  	mutex_unlock(&cgroup_mutex);
> -	/* @ss can't be mounted here as try_module_get() would fail */
> -	cgroup_unload_subsys(ss);
>  	return ret;
>  }
>  EXPORT_SYMBOL_GPL(cgroup_load_subsys);
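The shape of the reworked fail path (labels stacked so that each failure point undoes exactly what has been set up so far, in reverse order) can be sketched in plain user-space C. The struct and function names below are hypothetical stand-ins rather than the cgroup code: `registered` plays the role of the cgroup_subsys[] slot, `state` the allocated css, and `fail_online` is a test knob for the step that can fail late.

```c
#include <errno.h>
#include <stdlib.h>

struct subsys {
	int registered;		/* stands in for the cgroup_subsys[] slot */
	int *state;		/* stands in for the allocated css */
	int fail_online;	/* test knob: make the "online" step fail */
};

static int bring_online(struct subsys *ss)
{
	return ss->fail_online ? -EBUSY : 0;
}

static int load_subsys(struct subsys *ss)
{
	int ret;

	ss->registered = 1;

	ss->state = malloc(sizeof(*ss->state));
	if (!ss->state) {
		ret = -ENOMEM;
		goto out_err;
	}

	/* Do the step that can fail before anything harder to undo. */
	ret = bring_online(ss);
	if (ret)
		goto free_state;

	return 0;

free_state:
	free(ss->state);
	ss->state = NULL;
out_err:
	/* failure case: give the slot back */
	ss->registered = 0;
	return ret;
}
```

Moving the fallible step early, as the patch does with online_css(), keeps the unwind short: only the allocation and the slot assignment need to be reverted.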
Re: 50 Watt idle power regression bisected to Linux-3.10
On Wed, 2013-12-11 at 21:45 -0800, H. Peter Anvin wrote:
> As in it hangs at that point?

Nope, it's still going.

[1567.578340] pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1064 MHz, 2266 MHz

Funny, continents move faster :)  Maybe missing a write or two.

-Mike
Re: Re: [PATCH 16/17] uprobes: Allocate ->utask before handler_chain() for tracing handlers
(2013/12/12 3:11), Oleg Nesterov wrote:
> On 12/11, Masami Hiramatsu wrote:
>>
>> (2013/12/11 0:57), Oleg Nesterov wrote:
>>> On 12/10, Masami Hiramatsu wrote:
>>>> and isn't it better to increment miss-hit counter of the uprobe?
>>>
>>> What do you mean? This is not miss-hit and ->utask == NULL is quite normal.
>>
>> But it could skip the handler_chain silently. It could confuse users
>> why their probe doesn't hit as expected.
>
> No, we will restart the same (probed) instruction, handle_swbp()
> will be called again, get_utask() will be called again.

Hmm, in that case, how would you avoid infinite recursive loop??
Would you repeat it until get_utask() != NULL?

> Not to mention that (in practice) if GFP_KERNEL fails the task is
> already killed.
>
>>> For example, on ppc it can be always NULL because ppc likely emulates the
>>> probed insn.
>>
>> Hmm, in that case, should uprobes handlers never be called on ppc with
>> this change?
>
> Why? With this change ppc will have ->utask != NULL even if it doesn't
> need it at all.

Ah, I see. This changes that.

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
RE: [PATCH v2 0/4] X86/KVM: enable Intel MPX for KVM
Paolo Bonzini wrote:
> On 11/12/2013 09:31, Liu, Jinsong wrote:
>> Paolo, comments for version 2?
>
> I think I commented that it's fine, I'm just waiting for a rebase on
> top of the generic patches.
>
> Paolo
>

Thanks! The common MPX definition patches have been checked into the tip tree
(both Qiaowei and I use those definitions):
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=191f57c137bcce0e3e9313acb77b2f114d15afbb
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=e7d820a5e549b3eb6c3f9467507566565646a669

Jinsong

>>
>> Liu, Jinsong wrote:
>>> These patches are version 2 to enable Intel MPX for KVM.
>>>
>>> Version 1:
>>> * Add some Intel MPX definitions
>>> * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features
>>>   enable/disable
>>> * vmx and msr handle for MPX support at KVM
>>> * enable MPX feature for guest
>>>
>>> Version 2:
>>> * remove generic MPX definitions, kernel side has added the
>>>   definitions
>>> * add MSR_IA32_BNDCFGS to msrs_to_save
>>>
>>> Thanks,
>>> Jinsong
>>>
>>> Liu Jinsong (4):
>>>   KVM/X86: Fix xsave cpuid exposing bug
>>>   KVM/X86: Intel MPX vmx and msr handle
>>>   KVM/X86: add MSR_IA32_BNDCFGS to msrs_to_save
>>>   KVM/X86: Enable Intel MPX for guest.
>>>
>>>  arch/x86/include/asm/vmx.h            |  4
>>>  arch/x86/include/asm/xsave.h          |  2 ++
>>>  arch/x86/include/uapi/asm/msr-index.h |  1 +
>>>  arch/x86/kvm/cpuid.c                  |  8
>>>  arch/x86/kvm/vmx.c                    | 18 --
>>>  arch/x86/kvm/x86.c                    | 12 +---
>>>  arch/x86/kvm/x86.h                    |  3 ++-
>>>  7 files changed, 38 insertions(+), 10 deletions(-)
Re: 50 Watt idle power regression bisected to Linux-3.10
As in it hangs at that point? Mike Galbraith wrote: >On Wed, 2013-12-11 at 20:49 -0800, H. Peter Anvin wrote: >> On 12/11/2013 08:25 PM, Mike Galbraith wrote: >> > arch/x86/include/asm/mwait.h |4 ++-- >> > arch/x86/kernel/cpu/common.c |7 --- >> > arch/x86/kernel/setup_percpu.c |1 + >> > 3 files changed, 7 insertions(+), 5 deletions(-) >> > >> > Index: linux-2.6/arch/x86/kernel/cpu/common.c >> > === >> > --- linux-2.6.orig/arch/x86/kernel/cpu/common.c >> > +++ linux-2.6/arch/x86/kernel/cpu/common.c >> > @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void) >> > } >> > >> > /* allocate percpu area for mwait doorbell */ >> > -char __percpu *mwait_doorbell; >> > +DEFINE_PER_CPU(char *, mwait_doorbell); >> > +EXPORT_PER_CPU_SYMBOL(mwait_doorbell); >> > >> >> Sorry, this is wrong. This is NOT a percpu variable, it is a pointer >to >> a percpu allocation, but the variable itself is not a percpu >variable. >> This explains your boom. > >With that fixed, it boots, but is not quite perfect. > >... >[ 258.560079] fbcon: radeondrmfb (fb0) is primary device >[ 258.722483] Console: switching to colour frame buffer device 128x48 >[ 258.847076] radeon :01:03.0: fb0: radeondrmfb frame buffer >device >[ 258.911991] radeon :01:03.0: registered panic notifier >[ 258.968772] [drm] Initialized radeon 2.35.0 20080528 for >:01:03.0 on minor 0 >... >[ 469.738604] netxen_nic :04:00.3: using msi-x interrupts >[ 469.739078] netxen_nic :04:00.3: eth5: GbE port initialized >[ 469.830512] ipmi_si 00:01: Found new BMC (man_id: 0x0b, prod_id: >0x2000, dev_id: 0x13) >[ 469.830524] ipmi_si 00:01: IPMI kcs interface initialized >[ 473.729862] iTCO_wdt: unable to reset NO_REBOOT flag, device >disabled by hardware/BIOS >... >[ 711.636741] fuse init (API version 7.22) > >... ok box, doctor appointment is in an hour away. > >-Mike -- Sent from my mobile phone. Please pardon brevity and lack of formatting. 
Re: [patch 7/8] mm, memcg: allow processes handling oom notifications to access reserves
The immediate problem I see with setting aside reserves "off the top" is that we don't really know a priori how much memory the kernel itself is going to use, which could still land us in an overcommitted state. In other words, if I have your 128 MB machine, and I set aside 8 MB for OOM handling, and give 120 MB for jobs, I have not accounted for the kernel. So I set aside 8 MB for OOM and 100 MB for jobs, leaving 20 MB for the kernel. That should be enough, right? Hell if I know, and nothing ensures that. On Wed, Dec 11, 2013 at 4:42 AM, Tejun Heo wrote: > Yo, > > On Tue, Dec 10, 2013 at 03:55:48PM -0800, David Rientjes wrote: >> > Well, the gotcha there is that you won't be able to do that with >> > system level OOM handler either unless you create a separately >> > reserved memory, which, again, can be achieved using hierarchical >> > memcg setup already. Am I missing something here? >> >> System oom conditions would only arise when the usage of memcgs A + B >> above cause the page allocator to not be able to allocate memory without >> oom killing something even though the limits of both A and B may not have >> been reached yet. No userspace oom handler can allocate memory with >> access to memory reserves in the page allocator in such a context; it's >> vital that if we are to handle system oom conditions in userspace that we >> give them access to memory that other processes can't allocate. You >> could attach a userspace system oom handler to any memcg in this scenario >> with memory.oom_reserve_in_bytes and since it has PF_OOM_HANDLER it would >> be able to allocate in reserves in the page allocator and overcharge in >> its memcg to handle it. This isn't possible only with a hierarchical >> memcg setup unless you ensure the sum of the limits of the top level >> memcgs do not equal or exceed the sum of the min watermarks of all memory >> zones, and we exceed that. > > Yes, exactly. 
If system memory is 128M, create top level memcgs w/ > 120M and 8M each (well, with some slack of course) and then overcommit > the descendants of 120M while putting OOM handlers and friends under > 8M without overcommitting. > > ... >> The stronger rationale is that you can't handle system oom in userspace >> without this functionality and we need to do so. > > You're giving yourself an unreasonable precondition - overcommitting > at root level and handling system OOM from userland - and then trying > to contort everything to fit that. How can possibly "overcommitting > at root level" be a goal of and in itself? Please take a step back > and look at and explain the *problem* you're trying to solve. You > haven't explained why that *need*s to be the case at all. > > I wrote this at the start of the thread but you're still doing the > same thing. You're trying to create a hidden memcg level inside a > memcg. At the beginning of this thread, you were trying to do that > for !root memcgs and now you're arguing that you *need* that for root > memcg. Because there's no other limit we can make use of, you're > suggesting the use of kernel reserve memory for that purpose. It > seems like an absurd thing to do to me. It could be that you might > not be able to achieve exactly the same thing that way, but the right > thing to do would be improving memcg in general so that it can instead > of adding yet more layer of half-baked complexity, right? > > Even if there are some inherent advantages of system userland OOM > handling with a separate physical memory reserve, which AFAICS you > haven't succeeded at showing yet, this is a very invasive change and, > as you said before, something with an *extremely* narrow use case. > Wouldn't it be a better idea to improve the existing mechanisms - be > that memcg in general or kernel OOM handling - to fit the niche use > case better? I mean, just think about all the corner cases. 
How are > you gonna handle priority inversion through locked pages or > allocations given out to other tasks through slab? You're suggesting > opening a giant can of worms for extremely narrow benefit which > doesn't even seem like actually needing opening the said can. > > Thanks. > > -- > tejun > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
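The 128 MB example at the top of this message comes down to simple subtraction. A minimal sketch of that arithmetic follows; `kernel_headroom_mb` is a hypothetical helper introduced only for illustration, and nothing in memcg computes or enforces this number, which is exactly the objection being raised.

```c
/*
 * Hypothetical helper (not a kernel or memcg interface): given the total
 * RAM, the OOM-handler reserve, and the limit handed to jobs (all in MB),
 * return what is implicitly left over for the kernel itself.
 * Zero or a negative value means the kernel was not accounted for.
 */
static int kernel_headroom_mb(int total_mb, int oom_reserve_mb, int jobs_mb)
{
	return total_mb - oom_reserve_mb - jobs_mb;
}
```

With 8 MB reserved and 120 MB for jobs the headroom is 0; with 100 MB for jobs it is 20 MB, and whether 20 MB is sufficient is not something the split itself can guarantee.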
Re: 50 Watt idle power regression bisected to Linux-3.10
On Wed, 2013-12-11 at 20:49 -0800, H. Peter Anvin wrote: > On 12/11/2013 08:25 PM, Mike Galbraith wrote: > > arch/x86/include/asm/mwait.h |4 ++-- > > arch/x86/kernel/cpu/common.c |7 --- > > arch/x86/kernel/setup_percpu.c |1 + > > 3 files changed, 7 insertions(+), 5 deletions(-) > > > > Index: linux-2.6/arch/x86/kernel/cpu/common.c > > === > > --- linux-2.6.orig/arch/x86/kernel/cpu/common.c > > +++ linux-2.6/arch/x86/kernel/cpu/common.c > > @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void) > > } > > > > /* allocate percpu area for mwait doorbell */ > > -char __percpu *mwait_doorbell; > > +DEFINE_PER_CPU(char *, mwait_doorbell); > > +EXPORT_PER_CPU_SYMBOL(mwait_doorbell); > > > > Sorry, this is wrong. This is NOT a percpu variable, it is a pointer to > a percpu allocation, but the variable itself is not a percpu variable. > This explains your boom. With that fixed, it boots, but is not quite perfect. ... [ 258.560079] fbcon: radeondrmfb (fb0) is primary device [ 258.722483] Console: switching to colour frame buffer device 128x48 [ 258.847076] radeon :01:03.0: fb0: radeondrmfb frame buffer device [ 258.911991] radeon :01:03.0: registered panic notifier [ 258.968772] [drm] Initialized radeon 2.35.0 20080528 for :01:03.0 on minor 0 ... [ 469.738604] netxen_nic :04:00.3: using msi-x interrupts [ 469.739078] netxen_nic :04:00.3: eth5: GbE port initialized [ 469.830512] ipmi_si 00:01: Found new BMC (man_id: 0x0b, prod_id: 0x2000, dev_id: 0x13) [ 469.830524] ipmi_si 00:01: IPMI kcs interface initialized [ 473.729862] iTCO_wdt: unable to reset NO_REBOOT flag, device disabled by hardware/BIOS ... [ 711.636741] fuse init (API version 7.22) ... ok box, doctor appointment is in an hour away. -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: process 'stuck' at exit.
On Wed, 2013-12-11 at 23:26 -0500, Dave Jones wrote:
> On Tue, Dec 10, 2013 at 02:48:52PM -0800, Linus Torvalds wrote:
>
> > Dave, can you re-create that trinity run and test that patch? I think
> > we've got this
>
> 24 hours later, all is well. I think we can call this one done.
>
> Tested-by: Dave Jones

Thank you again for a fine preemptive bug catch Dave!

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel
[PATCH CFT] ARM:S5P64X0: Enable ARM_PATCH_PHYS_VIRT and AUTO_ZRELADDR by default
ARM_PATCH_PHYS_VIRT and AUTO_ZRELADDR have been enabled as default configs to S5P64X0 platforms. Introduction of PHYS_VIRT config as default would enable phy-to-virt and virt-to-phy translation function at boot and module loading time and enforce dynamic reallocation of memory. AUTO_ZRELADDR config would enable calculation of kernel load address at run time. PHYS_VIRT config is mutually exclusive to XIP_KERNEL, XIP_KERNEL is used in systems with NOR flash devices, and ZRELADDR config is mutually exclusive to ZBOOT_ROM. CFT::Call For Testing Requesting maintainers of S5P64X0 platforms to evaluate the changes on the board and comment, as I dont have the board for testing and also requesting an ACK Signed-off-by: panchaxari Cc: Kukjin Kim Cc: Tomasz Figa Cc: Sylwester Nawrocki Cc: Heiko Stuebner Cc: Russell King Cc: Linus Walleij Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- The samsung S5P64X0 vega has an average performing CPU with max speed 667 Mhz. This SOC has two variants S5P6440 and S5P6450. It has one core based on ARM1176JZF-S instruction set, and has 16KB data and instruction cache each. SOC has a memory subsystem with support to NAND Flash interface with x8 data bus, with 1/4/8/12/16 bit hardware ECC circuit and 4KB Page mode. It has Mobile DDR interface with x16 or x32 data bus, and DDR2 interface with x16 or x32 data bus it also supports eMMC4.4. 
Below lkml link is a quoting by Russell which clears the concept of PHYS_VIRT and ZRELADDR - https://lkml.org/lkml/2011/10/14/434 - --- arch/arm/Kconfig |2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 934e26c..8986335 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -759,6 +759,8 @@ config ARCH_S3C64XX config ARCH_S5P64X0 bool "Samsung S5P6440 S5P6450" + select ARM_PATCH_PHYS_VIRT + select AUTO_ZRELADDR select CLKDEV_LOOKUP select CLKSRC_SAMSUNG_PWM select CPU_V6 -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] tools/perf: Fix cross compilation
Commit b6aa997 "Add feature check core code" added feature checking logic
in config/feature-checks/Makefile but didn't use the CROSS_COMPILE value.
Fix it by prefixing $(CC), as is done in Makefile.perf.

Signed-off-by: Michael Ellerman
---
 tools/perf/config/feature-checks/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/config/feature-checks/Makefile b/tools/perf/config/feature-checks/Makefile
index bc86462..f3946db 100644
--- a/tools/perf/config/feature-checks/Makefile
+++ b/tools/perf/config/feature-checks/Makefile
@@ -28,7 +28,7 @@ FILES=\
 	test-stackprotector-all \
 	test-timerfd
 
-CC := $(CC) -MD
+CC := $(CROSS_COMPILE)$(CC) -MD
 
 all: $(FILES)
 
-- 
1.8.3.2
RE: [PATCH] MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers
> -Original Message- > From: Jingoo Han [mailto:jg1@samsung.com] > Sent: Thursday, December 12, 2013 3:55 AM > To: 'Bjorn Helgaas'; linux-...@vger.kernel.org > Cc: linux-kernel@vger.kernel.org; linux-arm-ker...@lists.infradead.org; > linux-te...@vger.kernel.org; linux...@vger.kernel.org; linux-samsung- > s...@vger.kernel.org; 'Shawn Guo'; 'Jason Cooper'; 'Thierry Reding'; 'Simon > Horman'; 'Magnus Damm'; 'Valentine Barshak'; 'Wei Yongjun'; 'Wei Yongjun'; > 'Kuninori Morimoto'; Mohit KUMAR DCG; Pratyush ANAND; 'Jingoo Han' > Subject: Re: [PATCH] MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car > PCI host maintainers > > On Thursday, December 12, 2013 3:43 AM, Bjorn Helgaas wrote: > > On Wed, Dec 11, 2013 at 11:32:37AM -0700, Bjorn Helgaas wrote: > > > If this looks reasonable, I'll merge it via the PCI tree for v3.13. > > > > And I see Mohit's patch [1] to update the DesignWare entry: > > > > +PCIE DRIVER FOR SYNOPSIS DESIGNWARE CONTROLLER > > +M: Mohit Kumar > > +M: Jingoo Han > > +L: linux-...@vger.kernel.org > > +S: Maintained > > +F: drivers/pci/host/pcie-designware.c > > > > I can fold in that update too if Jingoo acks it. > > > > [1] http://patchwork.ozlabs.org/patch/299905/ > > Hi Bjorn, > > I agree with this. :-) > Acked-by: Jingoo Han > - Thanks Bjorn and Jingoo. I will remove this patch from my v2 patches. Regards Mohit -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel BUG in munlock_vma_pages_range
On 12/12/2013 11:16 AM, Sasha Levin wrote: > On 12/11/2013 05:59 PM, Vlastimil Babka wrote: >> On 12/09/2013 09:26 PM, Sasha Levin wrote: >>> On 12/09/2013 12:12 PM, Vlastimil Babka wrote: On 12/09/2013 06:05 PM, Sasha Levin wrote: > On 12/09/2013 04:34 AM, Vlastimil Babka wrote: >> Hello, I will look at it, thanks. >> Do you have specific reproduction instructions? > > Not really, the fuzzer hit it once and I've been unable to trigger > it again. Looking at > the piece of code involved it might have had something to do with > hugetlbfs, so I'll crank > up testing on that part. Thanks. Do you have trinity log and the .config file? I'm currently unable to even boot linux-next with my config/setup due to a GPF. Looking at code I wouldn't expect that it could encounter a tail page, without first encountering a head page and skipping the whole huge page. At least in THP case, as TLB pages should be split when a vma is split. As for hugetlbfs, it should be skipped for mlock/munlock operations completely. One of these assumptions is probably failing here... >>> >>> If it helps, I've added a dump_page() in case we hit a tail page >>> there and got: >>> >>> [ 980.172299] page:ea003e5e8040 count:0 mapcount:1 >>> mapping: (null) index:0 >>> x0 >>> [ 980.173412] page flags: 0x2f80008000(tail) >>> >>> I can also add anything else in there to get other debug output if >>> you think of something else useful. >> >> Please try the following. Thanks in advance. 
> > [ 428.499889] page:ea003e5c0040 count:0 mapcount:4 > mapping: (null) index:0x0 > [ 428.499889] page flags: 0x2f80008000(tail) > [ 428.499889] start=140117131923456 pfn=16347137 > orig_start=140117130543104 page_increm > =1 vm_start=140117130543104 vm_end=140117134688256 vm_flags=135266419 > [ 428.499889] first_page pfn=16347136 > [ 428.499889] page:ea003e5c count:204 mapcount:44 > mapping:880fb5c466c1 inde > x:0x7f6f8fe00 > [ 428.499889] page flags: > 0x2f80084068(uptodate|lru|active|head|swapbacked) >From this print, it looks like the page is still a huge page. One situation I guess is a huge page which isn't PageMlocked and passed to munlock_vma_page(). I'm not sure whether this will happen. Please take a try this patch. Thanks, -Bob diff --git a/mm/mlock.c b/mm/mlock.c index d480cd6..f7066d2 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -466,6 +466,22 @@ void munlock_vma_pages_range(struct vm_area_struct *vma, * the page_mask here. */ page_mask = munlock_vma_page(page); + + /* +* There are two possibilities when munlock_vma_page() return 0. +* 1. The THP page was split. +* 2. The THP page was not PageMlocked before and +*it didn't get split. +* +* In case 2 we have to reset page_mask to +* 'HPAGE_PMD_NR - 1' becuase this page is still +* huge page, else PageTransHuge may receive a +* tail page and trigger VM_BUG_ON on next loop. +*/ + if (!page_mask) + if (PageTransHuge(page)) + page_mask = HPAGE_PMD_NR - 1; + unlock_page(page); put_page(page); /* follow_page_mask() */ } else { -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
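The comment in the patch above can be checked with plain arithmetic. The following is a minimal user-space sketch, not kernel code: it assumes HPAGE_PMD_NR is 512 (2 MiB huge pages made of 4 KiB base pages) and that the munlock loop advances by `1 + page_mask` pages from a huge-page-aligned address. `next_pfn` and `is_thp_tail` are hypothetical helpers introduced only for illustration.

```c
#include <stdbool.h>

/* Assumption: 2 MiB transparent huge pages built from 4 KiB base pages. */
#define HPAGE_PMD_NR 512UL

/*
 * Hypothetical helper: the pfn the loop visits next, given the page_mask
 * returned for the current page. A page_mask of 0 means "advance by one
 * base page"; HPAGE_PMD_NR - 1 means "skip the rest of the huge page".
 */
static unsigned long next_pfn(unsigned long pfn, unsigned long page_mask)
{
	return pfn + 1 + page_mask;
}

/*
 * Hypothetical helper: true if pfn lies inside the huge page whose head
 * page is head_pfn, but is not the head page itself.
 */
static bool is_thp_tail(unsigned long pfn, unsigned long head_pfn)
{
	return pfn > head_pfn && pfn < head_pfn + HPAGE_PMD_NR;
}
```

Using the values from the debug output: the head page is pfn 16347136, and with page_mask 0 the next page visited is pfn 16347137, a tail page of the same huge page. With page_mask reset to HPAGE_PMD_NR - 1 the loop would move on to pfn 16347648, past the huge page.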
Re: 50 Watt idle power regression bisected to Linux-3.10
On Wed, 2013-12-11 at 20:49 -0800, H. Peter Anvin wrote:
> On 12/11/2013 08:25 PM, Mike Galbraith wrote:
> >  arch/x86/include/asm/mwait.h   |4 ++--
> >  arch/x86/kernel/cpu/common.c   |7 ---
> >  arch/x86/kernel/setup_percpu.c |1 +
> >  3 files changed, 7 insertions(+), 5 deletions(-)
> >
> > Index: linux-2.6/arch/x86/kernel/cpu/common.c
> > ===
> > --- linux-2.6.orig/arch/x86/kernel/cpu/common.c
> > +++ linux-2.6/arch/x86/kernel/cpu/common.c
> > @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void)
> >  }
> >
> >  /* allocate percpu area for mwait doorbell */
> > -char __percpu *mwait_doorbell;
> > +DEFINE_PER_CPU(char *, mwait_doorbell);
> > +EXPORT_PER_CPU_SYMBOL(mwait_doorbell);
> >
>
> Sorry, this is wrong. This is NOT a percpu variable, it is a pointer to
> a percpu allocation, but the variable itself is not a percpu variable.
> This explains your boom.

Yeah, I know, I already slapped myself upside the head.

(what were you thinking mikie...la la la la la:)

-Mike
[PATCH] Driver core: Fix device_add_attrs() error code path
From: Rafael J. Wysocki

If the addition of dev_attr_online fails, device_add_attrs() should
remove device attribute groups as well as type and class attribute
groups before returning an error code. Make that happen.

Signed-off-by: Rafael J. Wysocki
---
 drivers/base/core.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/base/core.c
===
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -491,11 +491,13 @@ static int device_add_attrs(struct devic
 	if (device_supports_offline(dev) && !dev->offline_disabled) {
 		error = device_create_file(dev, &dev_attr_online);
 		if (error)
-			goto err_remove_type_groups;
+			goto err_remove_dev_groups;
 	}
 
 	return 0;
 
+ err_remove_dev_groups:
+	device_remove_groups(dev, dev->groups);
  err_remove_type_groups:
 	if (type)
 		device_remove_groups(dev, type->groups);
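The fix follows the usual goto-based unwind ladder: each failure jumps to the label that undoes everything set up so far, in reverse order. A minimal user-space sketch of that pattern follows. `struct fake_dev`, `add_step` and `fake_add_attrs` are hypothetical stand-ins, and the class/type/device ordering is an assumption made for illustration rather than a copy of drivers/base/core.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device and its attribute groups. */
struct fake_dev {
	bool class_added, type_added, dev_added, online_added;
	int fail_step;	/* step (1-4) that should fail, or 0 for none */
};

/* Pretend to add one group; fails when told to via fail_step. */
static int add_step(struct fake_dev *d, int step, bool *flag)
{
	if (d->fail_step == step)
		return -1;
	*flag = true;
	return 0;
}

static int fake_add_attrs(struct fake_dev *d)
{
	int error;

	error = add_step(d, 1, &d->class_added);
	if (error)
		return error;
	error = add_step(d, 2, &d->type_added);
	if (error)
		goto err_remove_class;
	error = add_step(d, 3, &d->dev_added);
	if (error)
		goto err_remove_type;
	error = add_step(d, 4, &d->online_added);
	if (error)
		goto err_remove_dev;	/* the kind of label the patch adds */
	return 0;

	/* Labels fall through, undoing earlier steps in reverse order. */
err_remove_dev:
	d->dev_added = false;
err_remove_type:
	d->type_added = false;
err_remove_class:
	d->class_added = false;
	return error;
}
```

Jumping to `err_remove_type` from the last step instead would leave `dev_added` set, which is the leak the patch description refers to.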
Re: 50 Watt idle power regression bisected to Linux-3.10
On 12/11/2013 08:25 PM, Mike Galbraith wrote:
>  arch/x86/include/asm/mwait.h   |4 ++--
>  arch/x86/kernel/cpu/common.c   |7 ---
>  arch/x86/kernel/setup_percpu.c |1 +
>  3 files changed, 7 insertions(+), 5 deletions(-)
>
> Index: linux-2.6/arch/x86/kernel/cpu/common.c
> ===
> --- linux-2.6.orig/arch/x86/kernel/cpu/common.c
> +++ linux-2.6/arch/x86/kernel/cpu/common.c
> @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void)
>  }
>
>  /* allocate percpu area for mwait doorbell */
> -char __percpu *mwait_doorbell;
> +DEFINE_PER_CPU(char *, mwait_doorbell);
> +EXPORT_PER_CPU_SYMBOL(mwait_doorbell);
>

Sorry, this is wrong. This is NOT a percpu variable, it is a pointer to
a percpu allocation, but the variable itself is not a percpu variable.
This explains your boom.

>  void __init setup_mwait_doorbell(void)
>  {
>      if (boot_cpu_has(X86_FEATURE_MWAIT)) {
> -        mwait_doorbell = __alloc_percpu(boot_cpu_data.clflush_size,
> -                                        boot_cpu_data.clflush_size);
> +        mwait_doorbell = __alloc_percpu(boot_cpu_data.x86_clflush_size,
> +                                        boot_cpu_data.x86_clflush_size);
>
>          if (!mwait_doorbell) {
>              /* This should never happen... */

    -hpa
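The distinction drawn above can be modelled in user space. This is only an analogy under stated assumptions: the kernel's per-CPU allocator does not work like a flat two-dimensional array, and every name below (`fake_percpu_area`, `fake_alloc_percpu`, `fake_per_cpu_ptr`, `doorbell_handle`, `doorbell_percpu_var`) is hypothetical. It shows one ordinary global holding a handle to a per-CPU allocation, next to a variable that itself exists once per CPU.

```c
#include <stddef.h>

#define NR_FAKE_CPUS 4
#define FAKE_LINE    64	/* assumed cache-line size in bytes */

/* Backing store standing in for a per-CPU allocation. */
static char fake_percpu_area[NR_FAKE_CPUS][FAKE_LINE];

/*
 * Case 1: a single, ordinary global that holds a handle to the per-CPU
 * allocation (the role played by "char __percpu *mwait_doorbell").
 */
static char *doorbell_handle;

static void fake_alloc_percpu(void)
{
	doorbell_handle = &fake_percpu_area[0][0];
}

/* Resolve the one handle to a given CPU's copy of the allocation. */
static char *fake_per_cpu_ptr(char *handle, int cpu)
{
	return handle + (size_t)cpu * FAKE_LINE;
}

/*
 * Case 2: a variable that is itself per-CPU, i.e. one separate pointer
 * per CPU (the shape DEFINE_PER_CPU(char *, ...) would create). Storing
 * the handle "into" this as if it were one global is the mistake.
 */
static char *doorbell_percpu_var[NR_FAKE_CPUS];
```

In case 1 there is one pointer and each CPU derives its own address from it; in case 2 there are as many pointers as CPUs, and none of them is set by a single assignment to the name.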
Re: [PATCH -tip v4 6/6] [RFC] kprobes/x86: Call exception handlers directly from do_int3/do_debug
(2013/12/11 22:31), Jiri Kosina wrote: > On Tue, 3 Dec 2013, Steven Rostedt wrote: > >>> To avoid a kernel crash by probing on lockdep code, call >>> kprobe_int3_handler and kprobe_debug_handler directly >>> from do_int3 and do_debug. Since there is a locking code >>> in notify_die, lockdep code can be invoked. And because >>> the lockdep involves printk() related things, theoretically, >>> we need to prohibit probing on much more code... >>> >>> Anyway, most of the int3 handlers in the kernel are already >>> called from do_int3 directly, e.g. ftrace_int3_handler, >>> poke_int3_handler, kgdb_ll_trap. Actually only >>> kprobe_exceptions_notify is on the notifier_call_chain. >>> >>> So I think this is not a crazy thing. >> >> What? Oh, yeah. No, using notifiers in int3 handler is the crazy >> thing ;-) > > Yeah, it's broken. Obviously, if you happen to trigger int3 before the > notifier has been registered, it'd cause int3 exception to be unhandled. > See > > commit 17f41571bb2c4a398785452ac2718a6c5d77180e > Author: Jiri Kosina > Date: Tue Jul 23 10:09:28 2013 +0200 > > kprobes/x86: Call out into INT3 handler directly instead of using > notifier > > for one such issue that happened with jump labels. > >> Hmm, if there's no users of the int3 notifier, should we just remove it? > > Hmm, there are still uprobes, right? Right, uprobes still use it, however, since it only handles user-space breakpoint, there is no problem. Thank you! -- Masami HIRAMATSU IT Management Research Dept. Linux Technology Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/10] net: stmmac: Enable stmmac main clock when probing hardware
Hi,

On Wed, Dec 11, 2013 at 4:05 AM, Maxime Ripard wrote:
> Hi,
>
> On Mon, Dec 09, 2013 at 10:43:29AM +0800, Chen-Yu Tsai wrote:
>> >> @@ -2759,15 +2760,18 @@ struct stmmac_priv *stmmac_dvr_probe(struct device *device,
>> >>  }
>> >>  }
>> >>
>> >> + clk_disable_unprepare(priv->stmmac_clk);
>> >> +
>> >
>> > Hu? Why do you disable the clock? don't you need it afterwards?
>>
>> The clock is enabled in *_open (when the network interface is used),
>> and disabled in *_close.
>
> Maybe it is the real issue then.
>
> Why don't you move the clk_disable to _remove then?

I wasn't sure this was the proper way. However, looking around, it seems
other drivers enable the clock in _probe and disable it in _remove.
I will modify stmmac to do so as well.

ChenYu
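The probe/remove pairing agreed on above can be sketched with a mock clock. This is a user-space illustration under stated assumptions, not stmmac code: `struct fake_clk` and the `fake_*` functions are hypothetical stand-ins for the clock framework calls named in the thread, and the enable count only exists to make the balance visible.

```c
/* Hypothetical mock of a clock with a simple enable count. */
struct fake_clk {
	int enable_count;
};

static int fake_clk_prepare_enable(struct fake_clk *clk)
{
	clk->enable_count++;
	return 0;
}

static void fake_clk_disable_unprepare(struct fake_clk *clk)
{
	clk->enable_count--;
}

/* Hypothetical driver private data holding the main clock. */
struct fake_priv {
	struct fake_clk clk;
};

/* Enable the clock once at probe time so the hardware can be accessed. */
static int fake_probe(struct fake_priv *priv)
{
	int ret = fake_clk_prepare_enable(&priv->clk);

	if (ret)
		return ret;
	/* Register accesses are safe from here until fake_remove(). */
	return 0;
}

/* Balance the enable from probe when the device goes away. */
static void fake_remove(struct fake_priv *priv)
{
	fake_clk_disable_unprepare(&priv->clk);
}
```

The point of the pairing is that the clock stays on for the whole bound lifetime of the device, rather than only between open and close.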
Re: process 'stuck' at exit.
On Tue, Dec 10, 2013 at 02:48:52PM -0800, Linus Torvalds wrote:

> Dave, can you re-create that trinity run and test that patch? I think
> we've got this

24 hours later, all is well. I think we can call this one done.

Tested-by: Dave Jones

	Dave
Re: 50 Watt idle power regression bisected to Linux-3.10
On Wed, 2013-12-11 at 16:52 -0800, H. Peter Anvin wrote: > On 12/11/2013 03:14 PM, Borislav Petkov wrote: > > On Wed, Dec 11, 2013 at 03:08:35PM -0800, H. Peter Anvin wrote: > >> So I would like to propose that we switch to using a percpu variable > >> which is a single cache line of nothing at all. It would only ever > >> be touched by MONITOR and for explicit wakeup. Hopefully that will > >> resolve this problem without the need for the CLFLUSH. > > > > Yep, makes a lot of sense to me to have an exclusive (overloaded meaning > > here :-)) cacheline only for that. And, if it works, we'll save us the > > penalty from the CLFLUSH too, cool. > > > > Here is a POC patch... anyone willing to test it out? Got it built, but it went boom on boot. Off to rummage. [0.00] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:64 nr_node_ids:8 [0.00] PERCPU: Embedded 26 pages/cpu @88027ee0 s75904 r8192 d22400 u131072 [0.00] pcpu-alloc: s75904 r8192 d22400 u131072 alloc=1*2097152 [0.00] pcpu-alloc: [0] 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 [0.00] pcpu-alloc: [0] 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 [0.00] pcpu-alloc: [0] 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 [0.00] pcpu-alloc: [0] 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 [0.00] BUG: unable to handle kernel paging request at b8a0 [0.00] IP: [] setup_mwait_doorbell+0x20/0x38 [0.00] PGD 0 [0.00] Oops: 0002 [#1] SMP [0.00] Modules linked in: [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 3.13.0-master #185 [0.00] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 07/07/2010 [0.00] task: 81a10460 ti: 81a0 task.ti: 81a0 [0.00] RIP: 0010:[] [] setup_mwait_doorbell+0x20/0x38 [0.00] RSP: :81a01f28 EFLAGS: 00010002 [0.00] RAX: 00014880 RBX: 0040 RCX: [0.00] RDX: 0040 RSI: 0040 RDI: 81a38e60 [0.00] RBP: 81a01f28 R08: 0040 R09: [0.00] R10: 88027f5f4880 R11: 0001 R12: b850 [0.00] R13: b026 R14: b024 R15: b020 [0.00] FS: () GS:88027ee0() knlGS: [0.00] CS: 0010 DS: ES: CR0: 80050033 [0.00] CR2: 
b8a0 CR3: 01a0b000 CR4: 00b0 [0.00] Stack: [0.00] 81a01f78 81aa3641 81a01f98 cd48 [0.00] 88027ee0 [0.00] 81a01fa8 81a96d89 [0.00] Call Trace: [0.00] [] setup_per_cpu_areas+0x233/0x242 [0.00] [] start_kernel+0x84/0x370 [0.00] [] x86_64_start_reservations+0x1b/0x35 [0.00] [] x86_64_start_kernel+0x12e/0x135 [0.00] Code: 40 8f a7 81 e8 f6 fe ff ff c9 c3 55 48 8b 05 0a bf fd ff 48 89 e5 a8 08 75 02 c9 c3 0f b7 3d 84 bf fd ff 48 89 fe e8 fe dc 64 ff <48> 89 05 27 e8 56 7e 48 85 c0 75 e3 48 c7 c7 f0 83 78 81 e8 55 [0.00] RIP [] setup_mwait_doorbell+0x20/0x38 [0.00] RSP [0.00] CR2: b8a0 [0.00] ---[ end trace f6e32c58e0729292 ]--- [0.00] Kernel panic - not syncing: Attempted to kill the idle task! Build delta. --- arch/x86/include/asm/mwait.h |4 ++-- arch/x86/kernel/cpu/common.c |7 --- arch/x86/kernel/setup_percpu.c |1 + 3 files changed, 7 insertions(+), 5 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/common.c === --- linux-2.6.orig/arch/x86/kernel/cpu/common.c +++ linux-2.6/arch/x86/kernel/cpu/common.c @@ -65,13 +65,14 @@ void __init setup_cpu_local_masks(void) } /* allocate percpu area for mwait doorbell */ -char __percpu *mwait_doorbell; +DEFINE_PER_CPU(char *, mwait_doorbell); +EXPORT_PER_CPU_SYMBOL(mwait_doorbell); void __init setup_mwait_doorbell(void) { if (boot_cpu_has(X86_FEATURE_MWAIT)) { - mwait_doorbell = __alloc_percpu(boot_cpu_data.clflush_size, - boot_cpu_data.clflush_size); + mwait_doorbell = __alloc_percpu(boot_cpu_data.x86_clflush_size, + boot_cpu_data.x86_clflush_size); if (!mwait_doorbell) { /* This should never happen... */ Index: linux-2.6/arch/x86/kernel/setup_percpu.c === --- linux-2.6.orig/arch/x86/kernel/setup_percpu.c +++ linux-2.6/arch/x86/kernel/setup_percpu.c @@ -20,6 +20,7 @@ #include #include #include +#include DEFINE_PER_CPU_READ_MOSTLY(int,
Re: Re: [PATCH -tip v5.1 12/18] ftrace/kprobes: Use NOKPROBE_SYMBOL macro in ftrace
(2013/12/12 10:34), Steven Rostedt wrote:
> On Tue, 10 Dec 2013 09:57:14 +
> Masami Hiramatsu wrote:
>
>
>> --- a/kernel/trace/trace_kprobe.c
>> +++ b/kernel/trace/trace_kprobe.c
>> @@ -51,45 +51,45 @@ struct event_file_link {
>> (sizeof(struct probe_arg) * (n)))
>>
>>
>> -static __kprobes bool trace_probe_is_return(struct trace_probe *tp)
>> +static __always_inline bool trace_probe_is_return(struct trace_probe *tp)
>
> I wonder if we should have a comment somewhere explaining why we are
> using __always_inline. Maybe we should add a new annotation:
>
> #define kprobes_inline __always_inline
>
> ?
>
> The above would be self documenting, and we can also include a comment
> with the define that states why it is there. Otherwise 10 years from
> now, someone is going to see these and say "WTF!" and remove them.

Hm, agreed, and I think nokprobe_inline is better since it is similar
to NOKPROBE_SYMBOL(). :)

[...]
>> @@ -755,8 +755,8 @@ static const struct file_operations kprobe_profile_ops =
>> {
>> };
>>
>> /* Sum up total data length for dynamic arraies (strings) */
>> -static __kprobes int __get_data_size(struct trace_probe *tp,
>> - struct pt_regs *regs)
>> +static __always_inline
>> +int __get_data_size(struct trace_probe *tp, struct pt_regs *regs)
>
> This function is used 4 times within the file and is not that small. I
> think it's a bit too big for an inline, and qualifies for a normal
> function with a NOKPROBE_SYMBOL() attached.

OK, I'll do so.

>> @@ -771,9 +771,9 @@ static __kprobes int __get_data_size(struct trace_probe
>> *tp,
>> }
>>
>> /* Store the value of each argument */
>> -static __kprobes void store_trace_args(int ent_size, struct trace_probe *tp,
>> - struct pt_regs *regs,
>> - u8 *data, int maxlen)
>> +static __always_inline
>> +void store_trace_args(int ent_size, struct trace_probe *tp,
>> + struct pt_regs *regs, u8 *data, int maxlen)
>
> Same here (even more so!)

OK.

>> {
>> int i;
>> u32 end = tp->size;
>> @@ -803,7 +803,7 @@ static __kprobes void store_trace_args(int ent_size,
>> struct trace_probe *tp,
>> }
>>
>> /* Kprobe handler */
>> -static __kprobes void
>> +static __always_inline void
>> __kprobe_trace_func(struct trace_probe *tp, struct pt_regs *regs,
>> struct ftrace_event_file *ftrace_file)
>
> OK, this one is big, but it's only used once.

Right, at least in my build binary, it is inlined.

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
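For readers following the annotation idea in the message above, here is a minimal userspace sketch of a self-documenting inline macro. It assumes GCC or Clang. The name nokprobe_inline comes from the thread, but the definition shown, the struct, the flag, and the helper are hypothetical stand-ins and not the kernel's actual code.

```c
#include <stdbool.h>

/*
 * Assumed definition, for illustration only: force inlining and name the
 * macro after its purpose, so a later reader can see why the attribute is
 * there instead of removing it as noise.
 */
#define nokprobe_inline inline __attribute__((always_inline))

/* Hypothetical flag and structure, standing in for struct trace_probe. */
#define PROBE_FLAG_RETURN 0x1u

struct probe {
	unsigned int flags;
};

/* Small helper: cheap enough to inline at every call site. */
static nokprobe_inline bool probe_is_return(const struct probe *p)
{
	return (p->flags & PROBE_FLAG_RETURN) != 0;
}
```

Larger helpers that are called from several places would, per the review comments above, stay as normal functions and be marked individually instead.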
Re: [PATCH v2 tip/core/rcu 0/4] Documentation changes for 3.14
On Wed, Dec 11, 2013 at 03:08:23PM -0800, Paul E. McKenney wrote:
> Hello!
>
> This series once again attempts to improve rcu_assign_pointer()'s
> relationship with sparse.
>
> 1.	Add a comment indicating that despite appearances,
>	rcu_assign_pointer() really only evaluates its arguments once,
>	as a cpp macro should.
>
> 2.	Replace rcu_assign_pointer() of NULL with RCU_INIT_POINTER() to
>	silence a sparse warning.
>
> 3.	Apply ACCESS_ONCE() to rcu_assign_pointer()'s target to prevent
>	compiler mischief. Also require that the source pointer be from
>	the kernel address space. Sometimes it can be from the RCU address
>	space, which necessitates the remaining patches in this series.
>	Which, it must be admitted, apply to a very small fraction of
>	the rcu_assign_pointer() invocations in the kernel. This commit
>	courtesy of Josh Triplett.
>
> 4.	Add an RCU_INITIALIZER() for compile-time initialization of
>	global RCU-protected pointers.

For all the patches (other than the one I wrote, for obvious reasons):
Reviewed-by: Josh Triplett
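To illustrate what "evaluates its arguments once" means for a publishing macro, here is a minimal userspace sketch. It is not the kernel's rcu_assign_pointer() or RCU_INIT_POINTER(); the macro names publish_pointer and init_pointer are hypothetical, and the code assumes the GCC/Clang __typeof__ and __atomic builtins.

```c
#include <stddef.h>

/*
 * Publish a pointer with release semantics. Each macro argument is
 * captured in a local first, so it is evaluated exactly once even if
 * the caller passes an expression with side effects.
 */
#define publish_pointer(p, v)                                    \
	do {                                                     \
		__typeof__(&(p)) _slot = &(p);                   \
		__typeof__(p) _val = (v);                        \
		__atomic_store_n(_slot, _val, __ATOMIC_RELEASE); \
	} while (0)

/*
 * Plain store for the NULL case: readers cannot dereference NULL,
 * so there is no pointed-to data whose initialization must be ordered.
 */
#define init_pointer(p, v) do { (p) = (v); } while (0)

static int eval_count;
static int value = 42;
static int *shared;

/* Side-effecting argument, used to count how often it is evaluated. */
static int *next_value(void)
{
	eval_count++;
	return &value;
}

/* Publish once and report how many times the argument was evaluated. */
static int demo_publish(void)
{
	eval_count = 0;
	publish_pointer(shared, next_value());
	return eval_count;
}
```

A macro that expanded its value argument twice would call next_value() twice here; capturing it in a local is the usual way to avoid that.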
[PATCH 2/2] drivers: net: cpsw: fix for cpsw crash when build as modules
From: Mugunthan V N

When CPSW and Davinci MDIO are built as modules, CPSW crashes when
accessing CPSW registers in the CPSW probe. The same works when built
in, because the CPSW clocks are enabled in the Davinci MDIO probe. So
enable the clocks before accessing the version register, and move the
other register accesses to cpsw device open.

Signed-off-by: Mugunthan V N
Signed-off-by: Felipe Balbi
---
 drivers/net/ethernet/ti/cpsw.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index a91f0c9..5120d9c 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1151,6 +1151,12 @@ static int cpsw_ndo_open(struct net_device *ndev)
 		 * receive descs
 		 */
 		cpsw_info(priv, ifup, "submitted %d rx descriptors\n", i);
+
+		if (cpts_register(&priv->pdev->dev, priv->cpts,
+				  priv->data.cpts_clock_mult,
+				  priv->data.cpts_clock_shift))
+			dev_err(priv->dev, "error registering cpts device\n");
+
 	}
 
 	/* Enable Interrupt pacing if configured */
@@ -1197,6 +1203,7 @@ static int cpsw_ndo_stop(struct net_device *ndev)
 	netif_carrier_off(priv->ndev);
 
 	if (cpsw_common_res_usage_state(priv) <= 1) {
+		cpts_unregister(priv->cpts);
 		cpsw_intr_disable(priv);
 		cpdma_ctlr_int_ctrl(priv->dma, false);
 		cpdma_ctlr_stop(priv->dma);
@@ -1985,9 +1992,15 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_runtime_disable_ret;
 	}
 	priv->regs = ss_regs;
-	priv->version = __raw_readl(&priv->regs->id_ver);
 	priv->host_port = HOST_PORT_NUM;
 
+	/* Need to enable clocks with runtime PM api to access module
+	 * registers
+	 */
+	pm_runtime_get_sync(&pdev->dev);
+	priv->version = readl(&priv->regs->id_ver);
+	pm_runtime_put_sync(&pdev->dev);
+
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
 	priv->wr_regs = devm_ioremap_resource(&pdev->dev, res);
 	if (IS_ERR(priv->wr_regs)) {
@@ -2157,8 +2170,6 @@ static int cpsw_remove(struct platform_device *pdev)
 	unregister_netdev(cpsw_get_slave_ndev(priv, 1));
 	unregister_netdev(ndev);
 
-	cpts_unregister(priv->cpts);
-
 	cpsw_ale_destroy(priv->ale);
 	cpdma_chan_destroy(priv->txch);
 	cpdma_chan_destroy(priv->rxch);
--
1.8.4.GIT
[PATCH 1/2] drivers: net: cpsw: fix dt probe for one port ethernet
From: Mugunthan V N

When only one of the two ports is pinned out, the DT probe fails
because the second port's PHY is not found. Fix this by checking the
number of slaves and breaking out of the loop.

Signed-off-by: Mugunthan V N
Signed-off-by: Felipe Balbi
---

both patches were taken from TI's 3.12 tree [1] and have been tested
on am335x, am437x and dra7xx.

Mugunthan, I took the patches because I got bug reports on v3.13-rc
which these patches fix. Let me know if you prefer to send another
version of them for whatever reason.

cheers

 drivers/net/ethernet/ti/cpsw.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 7536a4c..a91f0c9 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1816,6 +1816,8 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
 		}
 
 		i++;
+		if (i == data->slaves)
+			break;
 	}
 
 	return 0;
--
1.8.4.GIT
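The effect of the two added lines can be sketched outside the driver. The following is a simplified userspace model, not the cpsw code: struct child_node and probe_slaves() are hypothetical names, and a missing PHY is reduced to a flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for one slave child node in the device tree. */
struct child_node {
	int has_phy;	/* nonzero if the node describes a usable PHY */
};

/*
 * Walk the child nodes, but stop once `slaves` ports are configured,
 * so nodes for ports that are not pinned out are never examined.
 * Returns 0 on success, -1 if a needed PHY is missing.
 */
static int probe_slaves(const struct child_node *nodes, size_t n_nodes,
			size_t slaves)
{
	size_t i = 0;
	size_t n;

	for (n = 0; n < n_nodes; n++) {
		if (!nodes[n].has_phy)
			return -1;
		i++;
		if (i == slaves)
			break;	/* mirrors the check added by the patch */
	}
	return 0;
}
```

With two nodes where only the first has a PHY, a slave count of 1 succeeds, while a count of 2 still reports the missing PHY.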
Re: [PATCH tty-next 0/4] tty: Fix ^C echo
On 12/04/2013 07:13 PM, One Thousand Gnomes wrote: Not so much confused as simply merged. Input processing is inherently single-threaded; it makes sense to rely on that at the highest level possible. I would disagree entirely. You want to minimise the areas affected by a given lock. You also want to lock data not code. Correctness comes before speed. You optimise it when its right, otherwise you end up in a nasty mess when you discover you've optimised to assumptions that are flawed. Sorry for the delayed reply, Alan; what little free time I had was spent snuffing out regressions :/ Sure, I understand that ideally locks protect data, not operations. But I think maybe you're missing my point. Almost every lock, even at inception, is somewhat optimized; otherwise, every datum would have its own lock. Eliminating overlapping locks is a common optimization in stable code. In this case, an already broken bit of code is just only still broken. buf->lock is also fairly simple to break apart (although I don't want to because of the performance hit) which is not characteristic of locks which protect operations. Firewire, which is capable of sustained throughput in excess of 40MB/sec, struggles to get over 5MB/sec through the tty layer. [And drm output is orders-of-magnitude slower than that, which is just sad...] And what protocols do you care about 5MB/second - n_tty - no ? For the high speed protocols you are trying to fix a lost cause. By the time we've gone piddling around with tty buffers and serialized tty queues firing bytes through tasks and the like you already lost. For drm I assume you mean the framebuffer console logic ? Last time I benched that except for the Poulsbo it was bottlenecked on the GPU - not that I can type at 5MB/second anyway. Not that fixing the performance of the various bits wouldn't be a good thing too especially on the output end. For drm, I actually mean GEM object deletion, which is typically fenced and thus appears to be GPU-bound. 
What's really needed there is deferred deletion, like kfree_rcu(), with partial synchronization on allocation failures only. I mostly care about output speed; unfortunately, that's the input side at the other end :) While that would work, it's expensive extra locking in a path that 99.999% of the time doesn't need it. I'd rather explore other solutions. How about getting the high speed paths out of the whole tty buffer layer ? Almost every line discipline can be a fastpath directly to the network layer. If optimisation is the new obsession then we can cut the crap entirely by optimising for networking not making it a slave of n_tty. Starting at the beginning we have locks on rx because - we want serialized rx - we have buffer lifetimes - we have buffer queues - we have loads of flow control parameters Only n_tty needs the buffers (maybe some of irda but irda hasn't worked for years afaik). IRQ receive paths are serialized (and as a bonus can be pinned to a CPU). Flow control is n_tty stuff, everyone else simply fires it at their network layer as fast as possible and net already does the work. Keep a single tty_buf in the tty for batching at any given time, and private so no locks at all Have a wrapper via ld->receive(tty, buf) which fires the tty_buf at the ldisc and allocates a new empty one tty_queue_bytes(tty, buf, flags, len) which adds to the buffer, and if full calls ld->queue and then carries on the copying cycle and ld->receive_direct(tty, buf, flags, len) which allows block mode devices to blast bytes directly at the queue (ie all the USB 3G stuff, firewire, etc) without going via any additional copies. 
For almost all ldiscs ld->receive would be

	ld->receive_direct(tty, buf->buf, buf->flags, buf->len);
	free buffer

For n_tty type stuff ld->receive is basically much of
tty_flip_buffer_push

ld->receive_direct allocates tty_buffers and copies into it

We may even be able to optimise some of the n_tty cases into the
fastpath afterwards (notably raw, no echo)

For anything receiving in blocks that puts us close to (but not quite
at) ethernet kinds of cleanness for network buffer delivery.

Worth me looking into ?

I have to give this a lot more thought. The universality of n_tty is
important, and costs real cycles on servers and such. It's not just
about typing speed.

The clock/generation method seems like it might yield a lockless
solution for this problem, but maybe creates another one because the
driver-side would need to stamp the buffer (in essence, a flush could
affect data that has not yet been copied from the driver).

But it has arrived in the driver so might not matter. That requires a
little thought!

This is my next experiment.

Regards,
Peter Hurley
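The batching wrapper proposed in this thread (one private buffer per tty, filled by a queue-bytes call and handed to the line discipline when full) can be sketched in userspace as follows. This is only an illustration under stated assumptions: every name here (batch_tty, batch_queue_bytes, batch_flush, counting_receive) is hypothetical, the flags argument is omitted, and nothing below is an existing tty-layer API.

```c
#include <stddef.h>
#include <string.h>

#define BATCH_SIZE 8	/* tiny on purpose, to make flushes visible */

struct batch_tty;

/* Hypothetical line-discipline hook: consumes one full or partial batch. */
typedef void (*receive_fn)(struct batch_tty *tty,
			   const unsigned char *buf, size_t len);

struct batch_tty {
	unsigned char buf[BATCH_SIZE];	/* the single private batching buffer */
	size_t used;
	receive_fn receive;
	size_t delivered;		/* bytes handed to the ldisc so far */
	size_t calls;			/* number of receive() invocations */
};

/* Hand the current batch to the ldisc and start an empty one. */
static void batch_flush(struct batch_tty *tty)
{
	if (tty->used == 0)
		return;
	tty->receive(tty, tty->buf, tty->used);
	tty->used = 0;
}

/* Append bytes; whenever the buffer fills, flush and carry on copying. */
static void batch_queue_bytes(struct batch_tty *tty,
			      const unsigned char *data, size_t len)
{
	while (len > 0) {
		size_t room = BATCH_SIZE - tty->used;
		size_t n = len < room ? len : room;

		memcpy(tty->buf + tty->used, data, n);
		tty->used += n;
		data += n;
		len -= n;
		if (tty->used == BATCH_SIZE)
			batch_flush(tty);
	}
}

/* Example ldisc: just counts what it receives. */
static void counting_receive(struct batch_tty *tty,
			     const unsigned char *buf, size_t len)
{
	(void)buf;
	tty->delivered += len;
	tty->calls++;
}
```

The sketch takes no locks, which matches the thread's premise that the buffer is private to a single serialized receive path. A block-mode device would bypass batch_queue_bytes() entirely and call the receive hook with its own buffer.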
[PATCH -next 1/2] seq_file: Rename static bool seq_overflow to public bool seq_is_buf_full
The return values of seq_printf/puts/putc are frequently misused. Start down a path to remove all the return value uses of these functions. Make the static bool seq_overflow public along with a rename of the function to seq_is_buf_full. Rename the still static seq_set_overflow to seq_set_buf_full. Update the documentation to not show return types for seq_printf et al. Add a description of seq_is_buf_full. Signed-off-by: Joe Perches --- Documentation/filesystems/seq_file.txt | 28 fs/seq_file.c | 28 ++-- include/linux/seq_file.h | 8 3 files changed, 38 insertions(+), 26 deletions(-) diff --git a/Documentation/filesystems/seq_file.txt b/Documentation/filesystems/seq_file.txt index a1e2e0d..794dbde 100644 --- a/Documentation/filesystems/seq_file.txt +++ b/Documentation/filesystems/seq_file.txt @@ -171,27 +171,23 @@ output must be passed to the seq_file code. Some utility functions have been defined which make this task easy. Most code will simply use seq_printf(), which works pretty much like -printk(), but which requires the seq_file pointer as an argument. It is -common to ignore the return value from seq_printf(), but a function -producing complicated output may want to check that value and quit if -something non-zero is returned; an error return means that the seq_file -buffer has been filled and further output will be discarded. +printk(), but which requires the seq_file pointer as an argument. For straight character output, the following functions may be used: - int seq_putc(struct seq_file *m, char c); - int seq_puts(struct seq_file *m, const char *s); - int seq_escape(struct seq_file *m, const char *s, const char *esc); + seq_putc(struct seq_file *m, char c); + seq_puts(struct seq_file *m, const char *s); + seq_escape(struct seq_file *m, const char *s, const char *esc); The first two output a single character and a string, just like one would expect. 
seq_escape() is like seq_puts(), except that any character in s which is in the string esc will be represented in octal form in the output. -There is also a pair of functions for printing filenames: +There are also a pair of functions for printing filenames: - int seq_path(struct seq_file *m, struct path *path, char *esc); - int seq_path_root(struct seq_file *m, struct path *path, - struct path *root, char *esc) + seq_path(struct seq_file *m, struct path *path, char *esc); + seq_path_root(struct seq_file *m, struct path *path, + struct path *root, char *esc) Here, path indicates the file of interest, and esc is a set of characters which should be escaped in the output. A call to seq_path() will output @@ -200,6 +196,14 @@ root is desired, it can be used with seq_path_root(). Note that, if it turns out that path cannot be reached from root, the value of root will be changed in seq_file_root() to a root which *does* work. +A function producing complicated output may want to check + bool seq_is_buf_full(struct seq_file *m); +and avoid further seq_ calls if true is returned. + +A true return from seq_is_buf_full means that the seq_file buffer is full +and further output will be discarded. The seq_show function will attempt +to allocate a larger buffer and retry printing. + Making it all work diff --git a/fs/seq_file.c b/fs/seq_file.c index 1d641bb..2fda3a1 100644 --- a/fs/seq_file.c +++ b/fs/seq_file.c @@ -14,18 +14,18 @@ #include #include - /* - * seq_files have a buffer which can may overflow. When this happens a larger + * seq_files have a buffer which may overflow. When this happens a larger * buffer is reallocated and all the data will be printed again. * The overflow state is true when m->count == m->size. 
*/ -static bool seq_overflow(struct seq_file *m) +bool seq_is_buf_full(struct seq_file *m) { return m->count == m->size; } +EXPORT_SYMBOL(seq_is_buf_full); -static void seq_set_overflow(struct seq_file *m) +static void seq_set_buf_full(struct seq_file *m) { m->count = m->size; } @@ -112,7 +112,7 @@ static int traverse(struct seq_file *m, loff_t offset) error = 0; m->count = 0; } - if (seq_overflow(m)) + if (seq_is_buf_full(m)) goto Eoverflow; if (pos + m->count > offset) { m->from = offset - pos; @@ -255,7 +255,7 @@ Fill: break; } err = m->op->show(m, p); - if (seq_overflow(m) || err) { + if (seq_is_buf_full(m) || err) { m->count = offs; if (likely(err <= 0)) break; @@ -384,7 +384,7 @@ int seq_escape(struct seq_file *m, const char *s, const char
[PATCH -next 2/2] netfilter: Convert print_tuple functions to return void
Since adding a new function to seq_file (seq_is_buf_full) there isn't any value for functions called from seq_show to return anything. Remove the int returns of the various print_tuple/_print_tuple functions. Signed-off-by: Joe Perches --- include/net/netfilter/nf_conntrack_core.h| 2 +- include/net/netfilter/nf_conntrack_l3proto.h | 4 ++-- include/net/netfilter/nf_conntrack_l4proto.h | 4 ++-- net/netfilter/nf_conntrack_l3proto_generic.c | 5 ++--- net/netfilter/nf_conntrack_proto_dccp.c | 10 +- net/netfilter/nf_conntrack_proto_generic.c | 5 ++--- net/netfilter/nf_conntrack_proto_gre.c | 10 +- net/netfilter/nf_conntrack_proto_sctp.c | 10 +- net/netfilter/nf_conntrack_proto_tcp.c | 10 +- net/netfilter/nf_conntrack_proto_udp.c | 10 +- net/netfilter/nf_conntrack_proto_udplite.c | 10 +- net/netfilter/nf_conntrack_standalone.c | 15 +++ 12 files changed, 46 insertions(+), 49 deletions(-) diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h index 15308b8..7b8f18c 100644 --- a/include/net/netfilter/nf_conntrack_core.h +++ b/include/net/netfilter/nf_conntrack_core.h @@ -72,7 +72,7 @@ static inline int nf_conntrack_confirm(struct sk_buff *skb) return ret; } -int +void print_tuple(struct seq_file *s, const struct nf_conntrack_tuple *tuple, const struct nf_conntrack_l3proto *l3proto, const struct nf_conntrack_l4proto *proto); diff --git a/include/net/netfilter/nf_conntrack_l3proto.h b/include/net/netfilter/nf_conntrack_l3proto.h index 3efab70..9e349db 100644 --- a/include/net/netfilter/nf_conntrack_l3proto.h +++ b/include/net/netfilter/nf_conntrack_l3proto.h @@ -38,8 +38,8 @@ struct nf_conntrack_l3proto { const struct nf_conntrack_tuple *orig); /* Print out the per-protocol part of the tuple. */ - int (*print_tuple)(struct seq_file *s, - const struct nf_conntrack_tuple *); + void (*print_tuple)(struct seq_file *s, + const struct nf_conntrack_tuple *); /* * Called before tracking. 
diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h index 4c8d573..fead8ee 100644 --- a/include/net/netfilter/nf_conntrack_l4proto.h +++ b/include/net/netfilter/nf_conntrack_l4proto.h @@ -56,8 +56,8 @@ struct nf_conntrack_l4proto { u_int8_t pf, unsigned int hooknum); /* Print out the per-protocol part of the tuple. Return like seq_* */ - int (*print_tuple)(struct seq_file *s, - const struct nf_conntrack_tuple *); + void (*print_tuple)(struct seq_file *s, + const struct nf_conntrack_tuple *); /* Print out the private part of the conntrack. */ int (*print_conntrack)(struct seq_file *s, struct nf_conn *); diff --git a/net/netfilter/nf_conntrack_l3proto_generic.c b/net/netfilter/nf_conntrack_l3proto_generic.c index e7eb807..cf9ace7 100644 --- a/net/netfilter/nf_conntrack_l3proto_generic.c +++ b/net/netfilter/nf_conntrack_l3proto_generic.c @@ -49,10 +49,9 @@ static bool generic_invert_tuple(struct nf_conntrack_tuple *tuple, return true; } -static int generic_print_tuple(struct seq_file *s, - const struct nf_conntrack_tuple *tuple) +static void generic_print_tuple(struct seq_file *s, + const struct nf_conntrack_tuple *tuple) { - return 0; } static int generic_get_l4proto(const struct sk_buff *skb, unsigned int nhoff, diff --git a/net/netfilter/nf_conntrack_proto_dccp.c b/net/netfilter/nf_conntrack_proto_dccp.c index a99b6c3..d357f11 100644 --- a/net/netfilter/nf_conntrack_proto_dccp.c +++ b/net/netfilter/nf_conntrack_proto_dccp.c @@ -618,12 +618,12 @@ out_invalid: return -NF_ACCEPT; } -static int dccp_print_tuple(struct seq_file *s, - const struct nf_conntrack_tuple *tuple) +static void dccp_print_tuple(struct seq_file *s, +const struct nf_conntrack_tuple *tuple) { - return seq_printf(s, "sport=%hu dport=%hu ", - ntohs(tuple->src.u.dccp.port), - ntohs(tuple->dst.u.dccp.port)); + seq_printf(s, "sport=%hu dport=%hu ", + ntohs(tuple->src.u.dccp.port), + ntohs(tuple->dst.u.dccp.port)); } static int 
dccp_print_conntrack(struct seq_file *s, struct nf_conn *ct) diff --git a/net/netfilter/nf_conntrack_proto_generic.c b/net/netfilter/nf_conntrack_proto_generic.c index d25f2937..0a3ded1 100644 --- a/net/netfilter/nf_conntrack_proto_generic.c +++ b/net/netfilter/nf_conntrack_proto_generic.c @@ -39,10 +39,9 @@ static bool generic_invert_tuple(struct nf_conntrack_tuple *tuple, } /* Print out the per-protocol part of the tuple.
[PATCH -next 0/2] seq_file/netfilter: Start removing returns from seq_
The return values from seq_printf/puts/putc/etc are frequently misused.

Start removing the uses of the return values.

Joe Perches (2):
  seq_file: Rename static bool seq_overflow to public bool seq_is_buf_full
  netfilter: Convert print_tuple functions to return void

 Documentation/filesystems/seq_file.txt       | 28
 fs/seq_file.c                                | 28 ++--
 include/linux/seq_file.h                     |  8
 include/net/netfilter/nf_conntrack_core.h    |  2 +-
 include/net/netfilter/nf_conntrack_l3proto.h |  4 ++--
 include/net/netfilter/nf_conntrack_l4proto.h |  4 ++--
 net/netfilter/nf_conntrack_l3proto_generic.c |  5 ++---
 net/netfilter/nf_conntrack_proto_dccp.c      | 10 +-
 net/netfilter/nf_conntrack_proto_generic.c   |  5 ++---
 net/netfilter/nf_conntrack_proto_gre.c       | 10 +-
 net/netfilter/nf_conntrack_proto_sctp.c      | 10 +-
 net/netfilter/nf_conntrack_proto_tcp.c       | 10 +-
 net/netfilter/nf_conntrack_proto_udp.c       | 10 +-
 net/netfilter/nf_conntrack_proto_udplite.c   | 10 +-
 net/netfilter/nf_conntrack_standalone.c      | 15 +++
 15 files changed, 84 insertions(+), 75 deletions(-)

--
1.8.1.2.459.gbcd45b4.dirty
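As a rough illustration of the calling pattern this series moves toward (print without looking at return values, then ask whether the buffer is full), here is a userspace mock. The mock_ names and show_items() are hypothetical; only the count == size test and the overflow marking are modeled on the seq_file patch in this series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimal userspace mock of the seq_file fields the check relies on. */
struct mock_seq {
	char *buf;
	size_t size;
	size_t count;
};

/* Mirrors the patch: the buffer counts as full when count == size. */
static bool mock_seq_is_buf_full(const struct mock_seq *m)
{
	return m->count == m->size;
}

/* Append a string; on overflow, mark the buffer full and drop the output. */
static void mock_seq_puts(struct mock_seq *m, const char *s)
{
	size_t len = strlen(s);

	if (m->count + len < m->size) {
		memcpy(m->buf + m->count, s, len);
		m->count += len;
		return;
	}
	m->count = m->size;	/* same idea as seq_set_buf_full() */
}

/* A show-style function that stops early once output is being discarded. */
static int show_items(struct mock_seq *m, const char *const *items, size_t n)
{
	size_t i;
	int printed = 0;

	for (i = 0; i < n; i++) {
		mock_seq_puts(m, items[i]);
		if (mock_seq_is_buf_full(m))
			break;
		printed++;
	}
	return printed;
}
```

Stopping early is only an optimization in this model: once the buffer is marked full, further output is discarded anyway, and the caller is expected to retry with a larger buffer.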
Re: [PATCH 3/3] rcu: use simple wait queues where possible in rcutree
On Wed, Dec 11, 2013 at 08:06:39PM -0500, Paul Gortmaker wrote: > From: Thomas Gleixner > > As of commit dae6e64d2bcfd4b06304ab864c7e3a4f6b5fedf4 ("rcu: Introduce > proper blocking to no-CBs kthreads GP waits") the rcu subsystem started > making use of wait queues. > > Here we convert all additions of rcu wait queues to use simple wait queues, > since they don't need the extra overhead of the full wait queue features. > > Originally this was done for RT kernels, since we would get things like... > > BUG: sleeping function called from invalid context at kernel/rtmutex.c:659 > in_atomic(): 1, irqs_disabled(): 1, pid: 8, name: rcu_preempt > Pid: 8, comm: rcu_preempt Not tainted > Call Trace: >[] __might_sleep+0xd0/0xf0 >[] rt_spin_lock+0x24/0x50 >[] __wake_up+0x36/0x70 >[] rcu_gp_kthread+0x4d2/0x680 >[] ? __init_waitqueue_head+0x50/0x50 >[] ? rcu_gp_fqs+0x80/0x80 >[] kthread+0xdb/0xe0 >[] ? finish_task_switch+0x52/0x100 >[] kernel_thread_helper+0x4/0x10 >[] ? __init_kthread_worker+0x60/0x60 >[] ? gs_change+0xb/0xb > > ...and hence simple wait queues were deployed on RT out of necessity > (as simple wait uses a raw lock), but mainline might as well take > advantage of the more streamline support as well. > > Signed-off-by: Thomas Gleixner > Cc: Paul E. McKenney > Signed-off-by: Sebastian Andrzej Siewior > Signed-off-by: Steven Rostedt > [PG: adapt from multiple v3.10-rt patches and add a commit log.] > Signed-off-by: Paul Gortmaker You got the swake_up_all() this time, so: Reviewed-by: Paul E. 
McKenney ;-) > --- > kernel/rcu/tree.c| 16 > kernel/rcu/tree.h| 7 --- > kernel/rcu/tree_plugin.h | 14 +++--- > 3 files changed, 19 insertions(+), 18 deletions(-) > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > index dd08198..b35babb 100644 > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -1550,9 +1550,9 @@ static int __noreturn rcu_gp_kthread(void *arg) > trace_rcu_grace_period(rsp->name, > ACCESS_ONCE(rsp->gpnum), > TPS("reqwait")); > - wait_event_interruptible(rsp->gp_wq, > - ACCESS_ONCE(rsp->gp_flags) & > - RCU_GP_FLAG_INIT); > + swait_event_interruptible(rsp->gp_wq, > + ACCESS_ONCE(rsp->gp_flags) & > + RCU_GP_FLAG_INIT); > if (rcu_gp_init(rsp)) > break; > cond_resched(); > @@ -1576,7 +1576,7 @@ static int __noreturn rcu_gp_kthread(void *arg) > trace_rcu_grace_period(rsp->name, > ACCESS_ONCE(rsp->gpnum), > TPS("fqswait")); > - ret = wait_event_interruptible_timeout(rsp->gp_wq, > + ret = swait_event_interruptible_timeout(rsp->gp_wq, > ((gf = ACCESS_ONCE(rsp->gp_flags)) & >RCU_GP_FLAG_FQS) || > (!ACCESS_ONCE(rnp->qsmask) && > @@ -1625,7 +1625,7 @@ static void rsp_wakeup(struct irq_work *work) > struct rcu_state *rsp = container_of(work, struct rcu_state, > wakeup_work); > > /* Wake up rcu_gp_kthread() to start the grace period. */ > - wake_up(>gp_wq); > + swake_up(>gp_wq); > } > > /* > @@ -1701,7 +1701,7 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, > unsigned long flags) > { > WARN_ON_ONCE(!rcu_gp_in_progress(rsp)); > raw_spin_unlock_irqrestore(_get_root(rsp)->lock, flags); > - wake_up(>gp_wq); /* Memory barrier implied by wake_up() path. */ > + swake_up(>gp_wq); /* Memory barrier implied by swake_up() path. */ > } > > /* > @@ -2271,7 +2271,7 @@ static void force_quiescent_state(struct rcu_state *rsp) > } > rsp->gp_flags |= RCU_GP_FLAG_FQS; > raw_spin_unlock_irqrestore(_old->lock, flags); > - wake_up(>gp_wq); /* Memory barrier implied by wake_up() path. */ > + swake_up(>gp_wq); /* Memory barrier implied by swake_up() path. 
*/ > } > > /* > @@ -3304,7 +3304,7 @@ static void __init rcu_init_one(struct rcu_state *rsp, > } > > rsp->rda = rda; > - init_waitqueue_head(>gp_wq); > + init_swaitqueue_head(>gp_wq); > init_irq_work(>wakeup_work, rsp_wakeup); > rnp = rsp->level[rcu_num_lvls - 1]; > for_each_possible_cpu(i) { > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h > index 52be957..01476e1 100644 > --- a/kernel/rcu/tree.h > +++ b/kernel/rcu/tree.h > @@ -28,6 +28,7 @@ > #include > #include > #include > +#include > > /* > * Define shape of hierarchy based on NR_CPUS, CONFIG_RCU_FANOUT,
Re: kernel BUG in munlock_vma_pages_range
On 12/11/2013 05:59 PM, Vlastimil Babka wrote: On 12/09/2013 09:26 PM, Sasha Levin wrote: On 12/09/2013 12:12 PM, Vlastimil Babka wrote: On 12/09/2013 06:05 PM, Sasha Levin wrote: On 12/09/2013 04:34 AM, Vlastimil Babka wrote: Hello, I will look at it, thanks. Do you have specific reproduction instructions? Not really, the fuzzer hit it once and I've been unable to trigger it again. Looking at the piece of code involved it might have had something to do with hugetlbfs, so I'll crank up testing on that part. Thanks. Do you have trinity log and the .config file? I'm currently unable to even boot linux-next with my config/setup due to a GPF. Looking at code I wouldn't expect that it could encounter a tail page, without first encountering a head page and skipping the whole huge page. At least in THP case, as TLB pages should be split when a vma is split. As for hugetlbfs, it should be skipped for mlock/munlock operations completely. One of these assumptions is probably failing here... If it helps, I've added a dump_page() in case we hit a tail page there and got: [ 980.172299] page:ea003e5e8040 count:0 mapcount:1 mapping: (null) index:0 x0 [ 980.173412] page flags: 0x2f80008000(tail) I can also add anything else in there to get other debug output if you think of something else useful. Please try the following. Thanks in advance. 
[ 428.499889] page:ea003e5c0040 count:0 mapcount:4 mapping: (null) index:0x0
[ 428.499889] page flags: 0x2f80008000(tail)
[ 428.499889] start=140117131923456 pfn=16347137 orig_start=140117130543104 page_increm=1 vm_start=140117130543104 vm_end=140117134688256 vm_flags=135266419
[ 428.499889] first_page pfn=16347136
[ 428.499889] page:ea003e5c count:204 mapcount:44 mapping:880fb5c466c1 index:0x7f6f8fe00
[ 428.499889] page flags: 0x2f80084068(uptodate|lru|active|head|swapbacked)
[ 428.499889] pc:880fcfb7 pc->flags:2 pc->mem_cgroup:c90006034000
[ 428.374171]
[ 428.374171]
[ 428.374171] Call Trace:
[ 428.374171] [] exit_mmap+0x59/0x170
[ 428.374171] [] ? __khugepaged_exit+0xe0/0x150
[ 428.374171] [] ? kmem_cache_free+0x26b/0x370
[ 428.374171] [] ? __khugepaged_exit+0xe0/0x150
[ 428.374171] [] mmput+0x70/0xe0
[ 428.374171] [] exit_mm+0x18d/0x1a0
[ 428.374171] [] ? acct_collect+0x175/0x1b0
[ 428.374171] [] do_exit+0x26f/0x520
[ 428.374171] [] do_group_exit+0xa9/0xe0
[ 428.374171] [] get_signal_to_deliver+0x4e2/0x570
[ 428.374171] [] do_signal+0x4b/0x120
[ 428.374171] [] ? vtime_account_user+0x96/0xb0
[ 428.374171] [] ? _raw_spin_unlock+0x35/0x60
[ 428.374171] [] ? vtime_account_user+0x96/0xb0
[ 428.374171] [] ? context_tracking_user_exit+0xb8/0x1d0
[ 428.374171] [] ? trace_hardirqs_on+0xd/0x10
[ 428.374171] [] do_notify_resume+0x5a/0xe0
[ 428.374171] [] int_signal+0x12/0x17
[ 428.374171] Code: 46 85 31 c0 e8 f9 60 12 03 48 8b 5b 30 48 c7 c7 b0 92 46 85 4a 8d 34 33 31 c0 48 c1 fe 06 e8 df 60 12 03 48 89 df e8 97 e1 fc ff <0f> 0b 0f 1f 44 00 00 eb fe 66 0f 1f 44 00 00 48 8b 03 66 85 c0
[ 428.374171] RIP [] munlock_vma_pages_range+0x109/0x240
[ 428.374171] RSP

Thanks,
Sasha
Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data
On 12/11/13 at 11:20pm, Borislav Petkov wrote: > On Mon, Dec 09, 2013 at 05:42:22PM +0800, Dave Young wrote: > > Add a new setup_data type SETUP_EFI for kexec use. > > Passing the saved fw_vendor, runtime, config tables and efi runtime > > mappings. > > > > When entering virtual mode, directly mapping the efi runtime ragions which > > we passed in previously. And skip the step to call SetVirtualAddressMap. > > > > Specially for HP z420 workstation we need save the smbios physical address. > > The kernel boot sequence proceeds in the following order. Step 2 > > requires efi.smbios to be the physical address. However, I found that on > > HP z420 EFI system table has a virtual address of SMBIOS in step 1. Hence, > > we need set it back to the physical address with the smbios in > > efi_setup_data. (When it is still the physical address, it simply sets > > the same value.) > > > > 1. efi_init() - Set efi.smbios from EFI system table > > 2. dmi_scan_machine() - Temporary map efi.smbios to access SMBIOS table > > 3. efi_enter_virtual_mode() - Map EFI ranges > > > > Tested on ovmf+qemu, lenovo thinkpad, a dell laptop and an > > HP z420 workstation. > > > > v2: refresh based on previous patch changes, code cleanup. > > v3: use ioremap instead of phys_to_virt for efi_setup > > v5: improve some code structure per comments from Matt > > Boris: improve code structure, spell fix, etc. > > Improve changelog from Toshi. > > change the variable efi_setup to the physical address of efi setup_data > > instead of the ioremapped virt address > > > > Signed-off-by: Dave Young > > --- > > arch/x86/include/asm/efi.h| 11 ++ > > arch/x86/include/uapi/asm/bootparam.h | 1 + > > arch/x86/kernel/setup.c | 3 + > > arch/x86/platform/efi/efi.c | 195 > > ++ > > 4 files changed, 187 insertions(+), 23 deletions(-) > > ... 
> > > @@ -115,6 +116,25 @@ static int __init setup_storage_paranoia(char *arg) > > } > > early_param("efi_no_storage_paranoia", setup_storage_paranoia); > > > > +void __init parse_efi_setup(u64 phys_addr) > > +{ > > + struct setup_data *sd; > > + > > + if (!efi_enabled(EFI_64BIT)) { > > + pr_warn("SETUP_EFI not supported on 32-bit\n"); > > + return; > > + } > > Shouldn't this function be in two versions in efi_64.c and efi_32.c? > This way you don't need this check with cryptic printk message. Ok, will update. > > > + > > + sd = early_memremap(phys_addr, sizeof(struct setup_data)); > > + if (!sd) { > > + pr_warn("efi: early_memremap setup_data failed\n"); > > + return; > > + } > > + efi_setup = phys_addr + sizeof(struct setup_data); > > + nr_efi_runtime_map = (sd->len - sizeof(struct efi_setup_data)) / > > +sizeof(efi_memory_desc_t); > > + early_memunmap(sd, sizeof(struct setup_data)); > > +} > > > > static efi_status_t virt_efi_get_time(efi_time_t *tm, efi_time_cap_t *tc) > > { > > @@ -494,18 +514,28 @@ static int __init efi_systab_init(void *phys) > > { > > if (efi_enabled(EFI_64BIT)) { > > efi_system_table_64_t *systab64; > > + struct efi_setup_data *data = NULL; > > u64 tmp = 0; > > > > + if (efi_setup) { > > + data = early_memremap(efi_setup, sizeof(*data)); > > + if (!data) > > + return -ENOMEM; > > + } > > systab64 = early_memremap((unsigned long)phys, > > sizeof(*systab64)); > > if (systab64 == NULL) { > > pr_err("Couldn't map the system table!\n"); > > + if (data) > > + early_memunmap(data, sizeof(*data)); > > return -ENOMEM; > > } > > > > efi_systab.hdr = systab64->hdr; > > - efi_systab.fw_vendor = systab64->fw_vendor; > > - tmp |= systab64->fw_vendor; > > + > > + efi_systab.fw_vendor = data ? 
(unsigned long)data->fw_vendor : > > + systab64->fw_vendor; > > + tmp |= efi_systab.fw_vendor; > > efi_systab.fw_revision = systab64->fw_revision; > > efi_systab.con_in_handle = systab64->con_in_handle; > > tmp |= systab64->con_in_handle; > > @@ -519,15 +549,20 @@ static int __init efi_systab_init(void *phys) > > tmp |= systab64->stderr_handle; > > efi_systab.stderr = systab64->stderr; > > tmp |= systab64->stderr; > > - efi_systab.runtime = (void *)(unsigned long)systab64->runtime; > > - tmp |= systab64->runtime; > > + efi_systab.runtime = data ? > > +(void *)(unsigned long)data->runtime : > > +(void *)(unsigned long)systab64->runtime; > > + tmp |= (unsigned long)efi_systab.runtime; >
linux-next: manual merge of the block tree with the f2fs tree
Hi Jens, Today's linux-next merge of the block tree got a conflict in fs/f2fs/data.c between commit 8758e549e105 ("f2fs: add unlikely() macro for compiler more aggressively") from the f2fs tree and commit 2c30c71bd653 ("block: Convert various code to bio_for_each_segment()") from the block tree. I fixed it up (I think - see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc fs/f2fs/data.c index 15956fa584de,a2c8de8ba6ce.. --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@@ -25,203 -25,6 +25,199 @@@ #include /* + * Low-level block read/write IO operations. + */ +static struct bio *__bio_alloc(struct block_device *bdev, int npages) +{ + struct bio *bio; + + /* No failure on bio allocation */ + bio = bio_alloc(GFP_NOIO, npages); + bio->bi_bdev = bdev; + bio->bi_private = NULL; + return bio; +} + +static void f2fs_read_end_io(struct bio *bio, int err) +{ - const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags); - struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1; ++ struct bio_vec *bvec; ++ int i; + - do { ++ bio_for_each_segment_all(bvec, bio, i) { + struct page *page = bvec->bv_page; + - if (--bvec >= bio->bi_io_vec) - prefetchw(&bvec->bv_page->flags); - - if (unlikely(!uptodate)) { ++ if (unlikely(err)) { + ClearPageUptodate(page); + SetPageError(page); + } else { + SetPageUptodate(page); + } + unlock_page(page); - } while (bvec >= bio->bi_io_vec); ++ } + + bio_put(bio); +} + +static void f2fs_write_end_io(struct bio *bio, int err) +{ - const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags); - struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1; - struct f2fs_sb_info *sbi = F2FS_SB(bvec->bv_page->mapping->host->i_sb); ++ struct f2fs_sb_info *sbi = NULL; ++ struct bio_vec *bvec; ++ int i; + - do { ++ bio_for_each_segment_all(bvec, bio, i) { + struct page *page = bvec->bv_page; + - if (--bvec >= bio->bi_io_vec) - prefetchw(&bvec->bv_page->flags); - - if (unlikely(!uptodate)) { ++ if (!sbi) ++ sbi =
F2FS_SB(bvec->bv_page->mapping->host->i_sb); ++ if (unlikely(err)) { + SetPageError(page); + set_bit(AS_EIO, &page->mapping->flags); + set_ckpt_flags(sbi->ckpt, CP_ERROR_FLAG); + sbi->sb->s_flags |= MS_RDONLY; + } + end_page_writeback(page); + dec_page_count(sbi, F2FS_WRITEBACK); - } while (bvec >= bio->bi_io_vec); ++ } + + if (bio->bi_private) + complete(bio->bi_private); + + if (!get_pages(sbi, F2FS_WRITEBACK) && + !list_empty(&sbi->cp_wait.task_list)) + wake_up(&sbi->cp_wait); + + bio_put(bio); +} + +static void __submit_merged_bio(struct f2fs_bio_info *io) +{ + struct f2fs_io_info *fio = &io->fio; + int rw; + + if (!io->bio) + return; + + rw = fio->rw | fio->rw_flag; + + if (is_read_io(rw)) { + trace_f2fs_submit_read_bio(io->sbi->sb, rw, fio->type, io->bio); + submit_bio(rw, io->bio); + io->bio = NULL; + return; + } + trace_f2fs_submit_write_bio(io->sbi->sb, rw, fio->type, io->bio); + + /* + * META_FLUSH is only from the checkpoint procedure, and we should wait + * this metadata bio for FS consistency. + */ + if (fio->type == META_FLUSH) { + DECLARE_COMPLETION_ONSTACK(wait); + io->bio->bi_private = &wait; + submit_bio(rw, io->bio); + wait_for_completion(&wait); + } else { + submit_bio(rw, io->bio); + } + io->bio = NULL; +} + +void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi, + enum page_type type, int rw) +{ + enum page_type btype = PAGE_TYPE_OF_BIO(type); + struct f2fs_bio_info *io; + + io = is_read_io(rw) ? &sbi->read_io : &sbi->write_io[btype]; + + mutex_lock(&io->io_mutex); + + /* change META to META_FLUSH in the checkpoint procedure */ + if (type >= META_FLUSH) { + io->fio.type = META_FLUSH; + io->fio.rw = WRITE_FLUSH_FUA; + } + __submit_merged_bio(io); + mutex_unlock(&io->io_mutex); +} + +/* + * Fill the locked page with data located in the block address. + * Return unlocked page. + */ +int f2fs_submit_page_bio(struct f2fs_sb_info *sbi, struct page *page, + block_t blk_addr, int rw) +{ +
Re: [RFC PATCH tip 0/5] tracing filters with BPF
On Tue, Dec 10, 2013 at 7:35 PM, Masami Hiramatsu wrote: > (2013/12/11 11:32), Alexei Starovoitov wrote: >> On Tue, Dec 10, 2013 at 7:47 AM, Ingo Molnar wrote: >>> >>> * Alexei Starovoitov wrote: >>> > I'm fine if it becomes a requirement to have a vmlinux built with > DEBUG_INFO to use BPF and have a tool like perf to translate the > filters. But it that must not replace what the current filters do > now. That is, it can be an add on, but not a replacement. Of course. tracing filters via bpf is an additional tool for kernel debugging. bpf by itself has use cases beyond tracing. >>> >>> Well, Steve has a point: forcing DEBUG_INFO is a big showstopper for >>> most people. >> >> there is a misunderstanding here. >> I was saying 'of course' to 'not replace current filter infra'. >> >> bpf does not depend on debug info. >> That's the key difference between 'perf probe' approach and bpf filters. >> >> Masami is right that what I was trying to achieve with bpf filters >> is similar to 'perf probe': insert a dynamic probe anywhere >> in the kernel, walk pointers, data structures, print interesting stuff. >> >> 'perf probe' does it via scanning vmlinux with debug info. >> bpf filters don't need it. >> tools/bpf/trace/*_orig.c examples only depend on linux headers >> in /lib/modules/../build/include/ >> Today bpf compiler struct layout is the same as x86_64. >> >> Tomorrow bpf compiler will have flags to adjust endianness, pointer size, etc >> of the front-end. Similar to -m32/-m64 and -m*-endian flags. >> Neat part is that I don't need to do any work, just enable it properly in >> the bpf backend. From gcc/llvm point of view, bpf is yet another 'hw' >> architecture that compiler is emitting code for. >> So when C code of filter_ex1_orig.c does 'skb->dev', compiler determines >> field offset by looking at /lib/modules/.../include/skbuff.h >> whereas for 'perf probe' 'skb->dev' means walk debug info. > > Right, the offset of the data structure can get from the header etc. 
> > However, how would the bpf get the register or stack assignment of > skb itself? In the tracepoint macro, it will be able to get it from > function parameters (it needs a trick, like jprobe does). > I doubt you can do that on kprobes/uprobes without any debuginfo > support. :( the 4/5 diff actually shows how it's working ;) for kprobes it works at the function entry, since arguments are still in the registers and walks the pointers further down. It cannot do func+line_number as perf-probe does, of course. for tracepoints it's the same trick: call no-inline func with traceprobe args and call inlined crash_setup_regs() that stores the regs. Of course, there are limitations. Like 7th func argument goes into stack and requires more work to get out. If struct is not defined in .h, it would need to be redefined in filter.c Corner cases as you said. Today user of bpf filter needs to know that arg1 goes into %rdi and so on. that is easy to cleanup. >> Another use case is to optimize fetch sequences of dynamic probes >> as Masami suggested, but backward compatibility requirement >> would preserve to ways of doing it as well. > > The backward compatibility issue is only for the interface, but not > for the implementation, I think. :) The fetch method and filter > pred do already parse the argument into a syntax tree. IMHO, bpf > can optimize that tree to just a simple opcode stream. ahh. yes. that's doable. Thanks Alexei -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
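For readers following the "arg1 goes into %rdi" remark above, here is a minimal sketch of the x86-64 System V integer argument mapping that a filter author currently has to keep in mind. The helper name arg_register() is hypothetical and is not part of the posted patches; only the register order itself comes from the ABI.

```c
#include <stddef.h>

/*
 * x86-64 System V calling convention: the first six integer/pointer
 * arguments are passed in these registers, in this order.
 */
static const char *const arg_regs[] = {
	"rdi", "rsi", "rdx", "rcx", "r8", "r9",
};

/*
 * Return the register holding integer argument number @n (1-based) at
 * function entry, or NULL when the argument is passed on the stack
 * (the "7th func argument" case mentioned in the thread).
 */
static const char *arg_register(unsigned int n)
{
	if (n == 0 || n > sizeof(arg_regs) / sizeof(arg_regs[0]))
		return NULL;
	return arg_regs[n - 1];
}
```

This only covers integer and pointer arguments at function entry; floating-point arguments and anything past the sixth slot would need separate handling.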
[PATCH] perf symbols: symbol-minimal.c causes random fd to be closed
I hit a cryptic failure when testing a recent version of perf:

# perf report
write failure on standard output: Bad file descriptor

The issue is in commit b68e2f91 (perf symbols: Introduce symsrc structure). symsrc__destroy() does a close(ss->fd) but ss->fd is only initialised in the symbol-elf.c case and not for symbol-minimal.c.

The issue has been around for a while however most people will build with libelf which won't use the symbol-minimal.c code.

Cc: sta...@vger.kernel.org # v3.8+
Signed-off-by: Anton Blanchard
---
diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c
index 2d2dd05..3528204 100644
--- a/tools/perf/util/symbol-minimal.c
+++ b/tools/perf/util/symbol-minimal.c
@@ -254,6 +254,7 @@ int symsrc__init(struct symsrc *ss, struct dso *dso __maybe_unused,
 		goto out_close;
 
 	ss->type = type;
+	ss->fd = fd;
 
 	return 0;
 out_close:
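The bug class described above can be shown with a small userspace sketch. This is not perf code: struct src, src_init(), src_destroy() and fd_is_open() are hypothetical stand-ins. The point is only that the init path has to record the descriptor it opened; otherwise the destroy path closes whatever value happens to sit in the field.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for perf's struct symsrc; only the fd matters. */
struct src {
	int fd;
};

/*
 * Open @path and record the descriptor so that src_destroy() closes the
 * right one. Returns 0 on success, -1 on failure.
 */
static int src_init(struct src *s, const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0) {
		s->fd = -1;
		return -1;
	}

	s->fd = fd;	/* the kind of assignment the patch adds */
	return 0;
}

/* Close the recorded descriptor, if any, and mark the source unused. */
static void src_destroy(struct src *s)
{
	if (s->fd >= 0)
		close(s->fd);
	s->fd = -1;
}

/* Returns 1 if @fd refers to an open descriptor, 0 otherwise. */
static int fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}
```

If the assignment in src_init() were missing, src_destroy() would act on an uninitialised field, which matches the "random fd" symptom in the subject line.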
Re: [PATCH v5 08/14] efi: export efi runtime memory mapping to sysfs
> > +++ b/Documentation/ABI/testing/sysfs-firmware-efi-runtime-map > > @@ -0,0 +1,36 @@ > > +What: /sys/firmware/efi/runtime-map/ > > +Date: December 2013 > > +Contact: Dave Young > > +Description: > > This could start at the same line as Description Ok. [snip] > > + > > + Above values are all hexadecimal numbers with the '0x' prefix. > > + > > Superfluous newline. Will remove > > > +Users: Kexec > > diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c > > index 3e8b760..8289e0c 100644 > > --- a/arch/x86/platform/efi/efi.c > > +++ b/arch/x86/platform/efi/efi.c > > @@ -76,6 +76,9 @@ static __initdata efi_config_table_type_t arch_tables[] = > > { > > {NULL_GUID, NULL, NULL}, > > }; > > > > +void *efi_runtime_map; > > +int nr_efi_runtime_map; > > + > > /* > > * Returns 1 if 'facility' is enabled, 0 otherwise. > > */ > > @@ -810,6 +813,19 @@ static void __init efi_merge_regions(void) > > } > > } > > > > +static int __init save_runtime_map(efi_memory_desc_t *md, int idx) > > +{ > > + void *p; > > + p = krealloc(efi_runtime_map, (idx + 1) * memmap.desc_size, GFP_KERNEL); > > + if (!p) > > + return -ENOMEM; > > + > > + efi_runtime_map = p; > > + memcpy(efi_runtime_map + idx * memmap.desc_size, md, memmap.desc_size); > > + > > + return 0; > > +} > > + > > /* > > * Map efi memory ranges for runtime serivce and update new_memmap with > > virtual > > * addresses. 
> > @@ -820,6 +836,7 @@ static void * __init efi_map_regions(int *count)
> >  	void *p, *tmp, *new_memmap = NULL;
> >  	unsigned long size;
> >  	u64 end, systab;
> > +	int err = 0;
> >
> >  	for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
> >  		md = p;
> > @@ -848,10 +865,21 @@ static void * __init efi_map_regions(int *count)
> >  		new_memmap = tmp;
> >  		memcpy(new_memmap + (*count * memmap.desc_size), md,
> >  		       memmap.desc_size);
> > +		if (md->type != EFI_BOOT_SERVICES_CODE &&
> > +		    md->type != EFI_BOOT_SERVICES_DATA) {
> > +			err = save_runtime_map(md, nr_efi_runtime_map);
> > +			if (err)
> > +				goto out_save_runtime;
> > +			nr_efi_runtime_map++;
> > +		}
>
> So why don't you move that code to save_runtime_map?
>
>
> It would looks like this:
>
> ...
> 	new_memmap = tmp;
> 	memcpy(new_memmap + (*count * memmap.desc_size), md,
> 	       memmap.desc_size);
>
> 	save_runtime_map(md);
> 	(*count)++;
>
> [nr_efi_runtime_map is global, no need to pass it to save_runtime_map() ]

nr_efi_runtime_map is handled in a different way for the 1st kernel and the kexec kernel. For the 1st kernel (boot from firmware) it's increased one by one in the above function. But for the kexec kernel it is directly calculated from the setup_data array len.

And increasing nr_efi_runtime_map in save_runtime_map is not ok; the main reason is that I need the value first, as the loop counter max value, like below:

static int __init map_regions_fixed(void)
{
	[snip]
	for (i = 0, md = data->map; i < nr_efi_runtime_map; i++, md++) {
		[snip]
		save_runtime_map(md, i);
		[snip]
	}

>
> and the EFI_BOOT* tests can be done in save_runtime_map and also the
> error handling can happen there. This way efi_map_regions() won't
> need to know about anything. This way, you can later move the whole
> save_runtime_map() function to efi-kexec.c just by taking it without any
> need for untangling.
>
> > +out_save_runtime:
> > +	kfree(efi_runtime_map);
> > +	nr_efi_runtime_map = 0;
> > +	efi_runtime_map = NULL;
>
> This can go there too.
This section can go the save_runtime_map but it looks clearer to put them here. > > > out_krealloc: > > kfree(new_memmap); > > return NULL; > > diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig > > index 3150aa4..3d8d6f6 100644 > > --- a/drivers/firmware/efi/Kconfig > > +++ b/drivers/firmware/efi/Kconfig > > @@ -39,4 +39,15 @@ config EFI_VARS_PSTORE_DEFAULT_DISABLE > > config UEFI_CPER > > def_bool n > > > > +config EFI_RUNTIME_MAP > > + bool "Export efi runtime maps to sysfs" if EXPERT > > What's with the EXPERT? It depends on KEXEC already. EXPERT can be removed safely, will do. > > > + depends on X86 && EFI && KEXEC > > + default y > > + help > > + Export efi runtime memory maps to /sys/firmware/efi/runtime-map. > > + That memory map is used for example by kexec to set up efi virtual > > + mapping the 2nd kernel, but can also be used for debugging purposes. > > + > > + See also Documentation/ABI/testing/sysfs-firmware-efi-runtime-map. > > + > > endmenu > > ... > > > +static int __init efi_runtime_map_init(void) > > +{ > > + int i, j, ret = 0; > > + struct
Re: [PATCH v5 07/14] efi: export more efi table variable to sysfs
On 12/11/13 at 07:32pm, Borislav Petkov wrote: > On Mon, Dec 09, 2013 at 05:42:20PM +0800, Dave Young wrote: > > Export fw_vendor, runtime and config table physical addresses to > > /sys/firmware/efi/fw_vendor, /sys/firmware/efi/runtime and > > /sys/firmware/efi/config_table because kexec kernel will need them. > > you might wanna shorten: > > ... sys/firmware/efi/{fw_vendor,runtime,config_table} ... Ok, will do. > > > > > From EFI spec these 3 variables will be updated to > > virtual address after entering virtual mode. But > > kernel startup code will need the physical address. > > > > changelog: > > Greg: add standalone sysfs files instead of add lines to systab > > Document them as testing ABI > > Greg: use group attrs and is_visible > > Boris: align comments lines > > Boris: add macros for _show functions > > Matt: Documentation fixes. > > > > Signed-off-by: Dave Young > > --- > > Documentation/ABI/testing/sysfs-firmware-efi | 24 + > > arch/x86/platform/efi/efi.c | 4 +++ > > drivers/firmware/efi/efi.c | 39 > > > > include/linux/efi.h | 3 +++ > > 4 files changed, 70 insertions(+) > > create mode 100644 Documentation/ABI/testing/sysfs-firmware-efi > > > > diff --git a/Documentation/ABI/testing/sysfs-firmware-efi > > b/Documentation/ABI/testing/sysfs-firmware-efi > > new file mode 100644 > > index 000..8c6e460 > > --- /dev/null > > +++ b/Documentation/ABI/testing/sysfs-firmware-efi > > @@ -0,0 +1,24 @@ > > +What: /sys/firmware/efi/fw_vendor > > +Date: December 2013 > > +Contact: Dave Young > > +Description: > > + It shows the physical address of firmware vendor field in the > > Why doesn't this start at the same line as "Description:"? It can, just in 1st version I copied the format from some template, I have found it's better so updated the Users line but missed the Description. Will do in next version. > > > + EFI system table. > > + > > Superfluous newline. Will remove it. 
[snip]
> > +
> >  static struct attribute *efi_subsys_attrs[] = {
> >  	&efi_attr_systab.attr,
> > +	&efi_attr_fw_vendor.attr,
> > +	&efi_attr_runtime.attr,
> > +	&efi_attr_config_table.attr,
> >  	NULL,	/* maybe more in the future? */
>               ^
>
> Now that there's more, you can drop that wise guy comment :)

Ok.

--
Thanks for review
Dave
[PATCH 1/1] tty: Fix hang at ldsem_down_read()
When a controlling tty is being hung up and the hang up is waiting for a just-signalled tty reader or writer to exit, and a new tty reader/writer tries to acquire an ldisc reference concurrently with the ldisc reference release from the signalled reader/writer, the hangup can hang. The new reader/writer is sleeping in ldsem_down_read() and the hangup is sleeping in ldsem_down_write() [1]. The new reader/writer fails to wakeup the waiting hangup because the wrong lock count value is checked (the old lock count rather than the new lock count) to see if the lock is unowned. Change helper function to return the new lock count if the cmpxchg was successful; document this behavior. [1] edited dmesg log from reporter SysRq : Show Blocked State taskPC stack pid father systemd D 88040c4f 0 1 0 0x 88040c49fbe0 0046 88040c4a 88040c49ffd8 001d3980 001d3980 88040c4a 88040593d840 88040c49fb40 810a4cc0 0006 0023 Call Trace: [] ? sched_clock_cpu+0x9f/0xe4 [] ? sched_clock_cpu+0x9f/0xe4 [] ? sched_clock_cpu+0x9f/0xe4 [] ? sched_clock_cpu+0x9f/0xe4 [] schedule+0x24/0x5e [] schedule_timeout+0x15b/0x1ec [] ? sched_clock_cpu+0x9f/0xe4 [] ? _raw_spin_unlock_irq+0x24/0x26 [] down_read_failed+0xe3/0x1b9 [] ldsem_down_read+0x8b/0xa5 [] ? tty_ldisc_ref_wait+0x1b/0x44 [] tty_ldisc_ref_wait+0x1b/0x44 [] tty_write+0x7d/0x28a [] redirected_tty_write+0x8d/0x98 [] ? tty_write+0x28a/0x28a [] do_loop_readv_writev+0x56/0x79 [] do_readv_writev+0x1b0/0x1ff [] ? do_vfs_ioctl+0x32a/0x489 [] ? final_putname+0x1d/0x3a [] vfs_writev+0x2e/0x49 [] SyS_writev+0x47/0xaa [] system_call_fastpath+0x16/0x1b bashD 81c104c0 0 5469 5302 0x0082 8800cf817ac0 0046 8804086b22a0 8800cf817fd8 001d3980 001d3980 8804086b22a0 8800cf817a48 b9a0 8800cf817a78 81004675 8800cf817a44 Call Trace: [] ? dump_trace+0x165/0x29c [] ? sched_clock_cpu+0x9f/0xe4 [] ? save_stack_trace+0x26/0x41 [] schedule+0x24/0x5e [] schedule_timeout+0x15b/0x1ec [] ? sched_clock_cpu+0x9f/0xe4 [] ? down_write_failed+0xa3/0x1c9 [] ? 
_raw_spin_unlock_irq+0x24/0x26 [] down_write_failed+0xab/0x1c9 [] ldsem_down_write+0x79/0xb1 [] ? tty_ldisc_lock_pair_timeout+0xa5/0xd9 [] tty_ldisc_lock_pair_timeout+0xa5/0xd9 [] tty_ldisc_hangup+0xc4/0x218 [] __tty_hangup+0x2e2/0x3ed [] disassociate_ctty+0x63/0x226 [] do_exit+0x79f/0xa11 [] ? get_signal_to_deliver+0x206/0x62f [] ? lock_release_holdtime.part.8+0xf/0x16e [] do_group_exit+0x47/0xb5 [] get_signal_to_deliver+0x241/0x62f [] do_signal+0x43/0x59d [] ? __audit_syscall_exit+0x21a/0x2a8 [] ? lock_release_holdtime.part.8+0xf/0x16e [] do_notify_resume+0x54/0x6c [] int_signal+0x12/0x17

Reported-by: Sami Farin
Cc: # 3.12.x
Signed-off-by: Peter Hurley
---
 drivers/tty/tty_ldsem.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
index 22fad8a..d8a55e8 100644
--- a/drivers/tty/tty_ldsem.c
+++ b/drivers/tty/tty_ldsem.c
@@ -86,11 +86,21 @@ static inline long ldsem_atomic_update(long delta, struct ld_semaphore *sem)
 	return atomic_long_add_return(delta, (atomic_long_t *)&sem->count);
 }
 
+/*
+ * ldsem_cmpxchg() updates @*old with the last-known sem->count value.
+ * Returns 1 if count was successfully changed; @*old will have @new value.
+ * Returns 0 if count was not changed; @*old will have most recent sem->count
+ */
 static inline int ldsem_cmpxchg(long *old, long new, struct ld_semaphore *sem)
 {
-	long tmp = *old;
-	*old = atomic_long_cmpxchg(&sem->count, *old, new);
-	return *old == tmp;
+	long tmp = atomic_long_cmpxchg(&sem->count, *old, new);
+	if (tmp == *old) {
+		*old = new;
+		return 1;
+	} else {
+		*old = tmp;
+		return 0;
+	}
 }
 
 /*
--
1.8.1.2
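The contract documented in the new helper comment can be tried out in userspace. The following is a sketch using C11 atomics rather than the kernel's atomic_long_cmpxchg(); count_cmpxchg() is a hypothetical analogue, not the kernel function.

```c
#include <stdatomic.h>

/*
 * Hypothetical userspace analogue of the fixed helper.
 * Returns 1 if *count was changed from *old to new_val; *old then holds
 * new_val. Returns 0 if *count was not changed; *old then holds the value
 * that was actually observed in *count.
 */
static int count_cmpxchg(long *old, long new_val, atomic_long *count)
{
	long expected = *old;

	if (atomic_compare_exchange_strong(count, &expected, new_val)) {
		*old = new_val;		/* success: report the value now stored */
		return 1;
	}

	*old = expected;		/* failure: report what was really there */
	return 0;
}
```

Either way the caller ends up with the most recent known count in *old, which is what a later "is the lock unowned?" check needs.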
[PATCH 0/1] Fix hang report
Greg,

I know it's late in the -rc cycle but I'd like to get this fix into 3.13. Although it's only likely to happen at shutdown/reboot, the hang frequency could be as often as 1 in 1.

The patch fixes the count value returned when the cmpxchg() has successfully changed the count. Only one code path checks the returned count when the cmpxchg() is successful: down_read_failed(). After the failed down_read attempt is reversed but before the reader waits for the lock, the new count is checked to ensure _someone_ has the lock:

	/* if there are no active locks, wake the new lock owner(s) */
	if ((count & LDSEM_ACTIVE_MASK) == 0)
		__ldsem_wake(sem);

Because ldsem_cmpxchg() was returning the _old_ value on success, this was checking the wrong count value. No other code paths are impacted by the patch.

The equivalent diff below also fixes the problem; however, I feel the intent is less clear.

| diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
| index 22fad8a..29d9e7c 100644
| --- a/drivers/tty/tty_ldsem.c
| +++ b/drivers/tty/tty_ldsem.c
| @@ -222,7 +222,7 @@ down_read_failed(struct ld_semaphore *sem, long count, long timeout)
|  	get_task_struct(tsk);
|
|  	/* if there are no active locks, wake the new lock owner(s) */
| -	if ((count & LDSEM_ACTIVE_MASK) == 0)
| +	if ((count + adjust & LDSEM_ACTIVE_MASK) == 0)
|  		__ldsem_wake(sem);
|
|  	raw_spin_unlock_irq(&sem->wait_lock);

Regards,

Peter Hurley (1):
  tty: Fix hang at ldsem_down_read()

 drivers/tty/tty_ldsem.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

--
1.8.1.2
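A small worked example may help show why testing the old count misses the wake-up. The constants below are illustrative only and the bit layout is an assumption; the real LDSEM_* values are defined in tty_ldsem.c. Take an old count in which the only active reference is the reader's own failed attempt, then apply the adjustment that reverses it and registers a waiter.

```c
/* Illustrative layout only; the real LDSEM_* constants are in tty_ldsem.c. */
#define ACTIVE_MASK	0xffffL
#define ACTIVE_BIAS	1L
#define WAIT_BIAS	(-ACTIVE_MASK - 1)

/* Returns 1 when @count shows no active lock holder, so a wake-up is due. */
static int nobody_holds_lock(long count)
{
	return (count & ACTIVE_MASK) == 0;
}
```

With old = ACTIVE_BIAS and adjust = -ACTIVE_BIAS + WAIT_BIAS, nobody_holds_lock(old) is 0 while nobody_holds_lock(old + adjust) is 1: the stale value hides the fact that no one owns the lock any more, so the wake-up is skipped.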
[PATCH] f2fs: introduce sysfs entry to control in-place-update policy
This patch introduces new sysfs entries for users to control the policy of in-place-updates, namely IPU, in f2fs.

Sometimes f2fs suffers from performance degradation due to its out-of-place update policy that produces many additional node block writes. If the storage performance is very dependent on the amount of data writes instead of IO patterns, we'd better drop this out-of-place update policy.

This patch suggests 5 policies and their triggering conditions as follows.

[sysfs entry name = ipu_policy]

0: F2FS_IPU_FORCE       all the time,
1: F2FS_IPU_SSR         if SSR mode is activated,
2: F2FS_IPU_UTIL        if FS utilization is over threshold,
3: F2FS_IPU_SSR_UTIL    if SSR mode is activated and FS utilization is over threshold,
4: F2FS_IPU_DISABLE     disable IPU. (=default option)

[sysfs entry name = min_ipu_util]

This parameter controls the threshold to trigger in-place-updates. The number indicates percentage of the filesystem utilization, and is used by F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.

For more details, see need_inplace_update() in segment.h.

Signed-off-by: Jaegeuk Kim
---
 Documentation/filesystems/f2fs.txt | 11 ++
 fs/f2fs/f2fs.h | 3 +++
 fs/f2fs/segment.c | 2 ++
 fs/f2fs/segment.h | 44 --
 fs/f2fs/super.c| 4
 5 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt index a3fe811..4f9b146 100644 --- a/Documentation/filesystems/f2fs.txt +++ b/Documentation/filesystems/f2fs.txt @@ -171,6 +171,17 @@ Files in /sys/fs/f2fs/ conduct checkpoint to reclaim the prefree segments to free segments. By default, 100 segments, 200MB. + ipu_policy This parameter controls the policy of in-place + updates in f2fs. There are five policies: + 0: F2FS_IPU_FORCE, 1: F2FS_IPU_SSR, + 2: F2FS_IPU_UTIL, 3: F2FS_IPU_SSR_UTIL, + 4: F2FS_IPUT_DISABLE. + + min_ipu_util This parameter controls the threshold to trigger + in-place-updates.
The number indicates percentage + of the filesystem utilization, and used by + F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies. + USAGE diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 022ce32..1b05a62 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -325,6 +325,9 @@ struct f2fs_sm_info { struct list_head discard_list; /* 4KB discard list */ int nr_discards;/* # of discards in the list */ int max_discards; /* max. discards to be issued */ + + unsigned int ipu_policy;/* in-place-update policy */ + unsigned int min_ipu_util; /* in-place-update threshold */ }; /* diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 0b2e8ce..5b890ce 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -1799,6 +1799,8 @@ int build_segment_manager(struct f2fs_sb_info *sbi) sm_info->main_segments = le32_to_cpu(raw_super->segment_count_main); sm_info->ssa_blkaddr = le32_to_cpu(raw_super->ssa_blkaddr); sm_info->rec_prefree_segments = DEF_RECLAIM_PREFREE_SEGMENTS; + sm_info->ipu_policy = F2FS_IPU_DISABLE; + sm_info->min_ipu_util = DEF_MIN_IPU_UTIL; INIT_LIST_HEAD(_info->discard_list); sm_info->nr_discards = 0; diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h index ea56376..e9a10bd 100644 --- a/fs/f2fs/segment.h +++ b/fs/f2fs/segment.h @@ -476,19 +476,51 @@ static inline int utilization(struct f2fs_sb_info *sbi) /* * Sometimes f2fs may be better to drop out-of-place update policy. - * So, if fs utilization is over MIN_IPU_UTIL, then f2fs tries to write - * data in the original place likewise other traditional file systems. - * But, currently set 100 in percentage, which means it is disabled. - * See below need_inplace_update(). + * And, users can control the policy through sysfs entries. + * There are five policies with triggering conditions as follows. 
+ * F2FS_IPU_FORCE - all the time, + * F2FS_IPU_SSR - if SSR mode is activated, + * F2FS_IPU_UTIL - if FS utilization is over threashold, + * F2FS_IPU_SSR_UTIL - if SSR mode is activated and FS utilization is over + * threashold, + * F2FS_IPUT_DISABLE - disable IPU. (=default option) */ -#define MIN_IPU_UTIL 100 +#define DEF_MIN_IPU_UTIL 70 + +enum { + F2FS_IPU_FORCE, + F2FS_IPU_SSR, + F2FS_IPU_UTIL, + F2FS_IPU_SSR_UTIL, +
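The five policies and their triggering conditions, as listed in the patch description, can be sketched as a standalone decision function. This is not the need_inplace_update() from the patch, whose body is not shown here: want_inplace_update() and its parameters are hypothetical, and "over threshold" is assumed to mean strictly greater than min_ipu_util.

```c
#include <stdbool.h>

/* Policy values in the order given in the patch description (0..4). */
enum ipu_policy {
	F2FS_IPU_FORCE,
	F2FS_IPU_SSR,
	F2FS_IPU_UTIL,
	F2FS_IPU_SSR_UTIL,
	F2FS_IPU_DISABLE,
};

/*
 * Hypothetical decision helper: @ssr_active and @util (in percent) stand
 * in for whatever the filesystem computes at runtime.
 */
static bool want_inplace_update(enum ipu_policy policy, bool ssr_active,
				unsigned int util, unsigned int min_ipu_util)
{
	switch (policy) {
	case F2FS_IPU_FORCE:
		return true;
	case F2FS_IPU_SSR:
		return ssr_active;
	case F2FS_IPU_UTIL:
		return util > min_ipu_util;
	case F2FS_IPU_SSR_UTIL:
		return ssr_active && util > min_ipu_util;
	case F2FS_IPU_DISABLE:
	default:
		return false;
	}
}
```

With the default threshold of 70 from the patch, a filesystem at 80% utilization would trigger in-place updates under the UTIL policy, and under SSR_UTIL only while SSR mode is active.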
Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data
On 12/11/13 at 12:13pm, Matt Fleming wrote: > On Mon, 09 Dec, at 05:42:22PM, Dave Young wrote: > > Add a new setup_data type SETUP_EFI for kexec use. > > Passing the saved fw_vendor, runtime, config tables and efi runtime > > mappings. > > > > When entering virtual mode, directly mapping the efi runtime ragions which > > we passed in previously. And skip the step to call SetVirtualAddressMap. > > > > Specially for HP z420 workstation we need save the smbios physical address. > > The kernel boot sequence proceeds in the following order. Step 2 > > requires efi.smbios to be the physical address. However, I found that on > > HP z420 EFI system table has a virtual address of SMBIOS in step 1. Hence, > > we need set it back to the physical address with the smbios in > > efi_setup_data. (When it is still the physical address, it simply sets > > the same value.) > > > > 1. efi_init() - Set efi.smbios from EFI system table > > 2. dmi_scan_machine() - Temporary map efi.smbios to access SMBIOS table > > 3. efi_enter_virtual_mode() - Map EFI ranges > > > > Tested on ovmf+qemu, lenovo thinkpad, a dell laptop and an > > HP z420 workstation. > > > > v2: refresh based on previous patch changes, code cleanup. > > v3: use ioremap instead of phys_to_virt for efi_setup > > v5: improve some code structure per comments from Matt > > Boris: improve code structure, spell fix, etc. > > Improve changelog from Toshi. > > change the variable efi_setup to the physical address of efi setup_data > > instead of the ioremapped virt address > > > > Signed-off-by: Dave Young > > --- > > arch/x86/include/asm/efi.h| 11 ++ > > arch/x86/include/uapi/asm/bootparam.h | 1 + > > arch/x86/kernel/setup.c | 3 + > > arch/x86/platform/efi/efi.c | 195 > > ++ > > 4 files changed, 187 insertions(+), 23 deletions(-) > > [...] 
> > > @@ -115,6 +116,25 @@ static int __init setup_storage_paranoia(char *arg) > > } > > early_param("efi_no_storage_paranoia", setup_storage_paranoia); > > > > +void __init parse_efi_setup(u64 phys_addr) > > +{ > > + struct setup_data *sd; > > + > > + if (!efi_enabled(EFI_64BIT)) { > > + pr_warn("SETUP_EFI not supported on 32-bit\n"); > > + return; > > + } > > + > > + sd = early_memremap(phys_addr, sizeof(struct setup_data)); > > + if (!sd) { > > + pr_warn("efi: early_memremap setup_data failed\n"); > > You shouldn't need the "efi:" prefix in the message. Hmm, remove efi: looks better, will update. > > > @@ -676,6 +766,8 @@ void __init efi_init(void) > > efi.systab->hdr.revision >> 16, > > efi.systab->hdr.revision & 0x, vendor); > > > > + efi_reuse_config(efi.systab->tables, efi.systab->nr_tables); > > + > > Please check the return value. I missed this one, will update. > > > if (efi_config_init(arch_tables)) > > return; > > > > @@ -886,6 +978,50 @@ out_krealloc: > > } > > > > /* > > + * Map efi regions which was passed via setup_data. The virt_addr is a > > fixed > > + * addr which was used in first kernel in case kexec boot. > > + */ > > +static int __init map_regions_fixed(void) > > +{ > > + int i, s, ret = 0; > > + u64 end, systab; > > + unsigned long size; > > + efi_memory_desc_t *md; > > + struct efi_setup_data *data; > > + > > + s = sizeof(*data) + nr_efi_runtime_map * sizeof(data->map[0]); > > + data = early_memremap(efi_setup, s); > > + if (!data) { > > + ret = -ENOMEM; > > + goto out; > > + } > > + for (i = 0, md = data->map; i < nr_efi_runtime_map; i++, md++) { > > + efi_map_region_fixed(md); /* FIXME: add error handling */ > > Oops. Please fix this ;-) Have discussed this with Boris, he will take care of this after he added error handling in his __map_region function. 
-- Thanks for review Dave
Re: [PATCH v5 09/14] efi: passing kexec necessary efi data via setup_data
On 12/11/13 at 03:05pm, Borislav Petkov wrote:
> On Wed, Dec 11, 2013 at 12:13:52PM +, Matt Fleming wrote:
> > > +	for (i = 0, md = data->map; i < nr_efi_runtime_map; i++, md++) {
> > > +		efi_map_region_fixed(md); /* FIXME: add error handling */
> >
> > Oops. Please fix this ;-)
>
> Yeah, this is on my TODO as it wraps around __map_region, the latter
> needing to propagate error codes.
>
> I'll take care of it once you merge Dave's patchset.

Thanks Boris..
Re: [RFC PATCH] drivers: char: Add a dynamic clock for the trace clock
On Wed, 11 Dec 2013 18:06:06 -0800
Sonny Rao wrote:

> > ftrace has several clocks that it uses:
> >
> > o local - basically sched_clock()
> > o global - something like hpet that is monotonic across CPUs but slower
> > o counter - a simple atomic counter (no time associated to it)
> > o uptime - jiffy counter
> > o perf - trace_clock, which is what perf uses
> > o x86_tsc - the raw tsc counter.
> >
> > # cat /sys/kernel/debug/trace_clock
> > [local] global counter uptime perf x86-tsc
> >
>
> Ah ok sorry for the incorrect info there, thanks for clarifying.
> So, If I wanted to make sure everything is synced up between ftrace
> events and perf events I should say perf here instead of local.

Correct, that's why I created that clock.

-- Steve
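In the trace_clock listing quoted above, the clock in square brackets is the one currently selected. A tool that wants to confirm that ftrace is on the "perf" clock could parse that line along these lines. This is a sketch only; selected_clock() is a hypothetical helper and takes the line as a string rather than reading the file itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (currently selected) clock name from a trace_clock
 * style line into @out. Returns 0 on success, -1 if no well-formed
 * "[name]" is found or @out is too small to hold it.
 */
static int selected_clock(const char *line, char *out, size_t outlen)
{
	const char *lb = strchr(line, '[');
	const char *rb;
	size_t len;

	if (!lb)
		return -1;
	rb = strchr(lb + 1, ']');
	if (!rb)
		return -1;

	len = (size_t)(rb - lb - 1);
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, lb + 1, len);
	out[len] = '\0';
	return 0;
}
```

Given the line shown in the quote, this yields "local"; after switching clocks as discussed, the same helper would be expected to report "perf".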
Re: [PATCH 1/3] wait-simple: Introduce the simple waitqueue implementation
> --- /dev/null
> +++ b/kernel/swait.c
> @@ -0,0 +1,118 @@
> +/*
> + * Simple waitqueues without fancy flags and callbacks

We should probably have a more detailed description of when to use
simple wait queues versus normal wait queues. These are obviously much
lighter weight, and should be the preferred method unless you need a
feature of the more heavy weight wait queues.

-- Steve

"weight wait" Ha! Don't get to use that very often ;-)

> + *
> + * (C) 2011 Thomas Gleixner
> + *
> + * Based on kernel/wait.c
> + *
> + * For licencing details see kernel-base/COPYING
> + */
> +#include
> +#include
> +#include
> +#include
Re: [PATCH v5 01/14] x86/mm: sparse warning fix for early_memremap
On 12/11/13 at 12:12pm, Borislav Petkov wrote:
> On Wed, Dec 11, 2013 at 10:20:25AM +, Matt Fleming wrote:
> > This needs reviewing by at least one of the x86 folks, but it
> > certainly makes sense to me.
>
> Ingo told me yesterday, it makes sense too. I'd guess we can try it.
> FWIW, all callers of early_memremap use the memory they get remapped as
> normal memory so we should be safe.
>
> Maybe this whole discussion should be noted down in the commit message
> so that people know.

Thanks for the info. Will add.
Re: [RFC PATCH] drivers: char: Add a dynamic clock for the trace clock
On Wed, Dec 11, 2013 at 5:49 PM, Steven Rostedt wrote:
> On Wed, 11 Dec 2013 17:17:30 -0800
> Sonny Rao wrote:
>
>> On Wed, Dec 11, 2013 at 11:30 AM, Stephane Eranian
>> wrote:
>> > Sonny,
>> >
>> > Your patch has a couple of problems for me:
>> > - requires CONFIG_TRACING
>> >
>> > You should directly invoke getrawmonotonic()
>> > and inline the code from trace_clock_getres().
>> >
>> > That's how I managed to compile your kernel module on my system.
>>
>> You need the changes in kernel/trace/trace.c which is why it's
>> dependent on CONFIG_TRACING.
>> If we put those functions elsewhere we could remove that dependency,
>> but it sounds like people want to just fix the clock that perf uses so
>> that it's exportable and not handle this with something like this
>> patch, which is better.
>
> I have no issue moving the trace_clock.c code into lib/ and we can add
> a CONFIG_TRACE_CLOCK option that can be set by perf and ftrace.
>

That sounds like a good idea to me, regardless of what we end up doing.

>>
>> Also, we should ensure that perf and ftrace are guaranteed to use the
>> same clock, I think it just happens to be the same right now.
>
> ftrace has several clocks that it uses:
>
> o local - basically sched_clock()
> o global - something like hpet that is monotonic across CPUs but slower
> o counter - a simple atomic counter (no time associated to it)
> o uptime - jiffy counter
> o perf - trace_clock, which is what perf uses
> o x86_tsc - the raw tsc counter.
>
> # cat /sys/kernel/debug/trace_clock
> [local] global counter uptime perf x86-tsc
>

Ah ok sorry for the incorrect info there, thanks for clarifying.
So, If I wanted to make sure everything is synced up between ftrace
events and perf events I should say perf here instead of local.

> -- Steve
Re: [PATCH v5 02/14] efi: use early_memremap and early_memunmap
On 12/11/13 at 10:39am, Matt Fleming wrote:
> (Cc'ing Leif and Mark for the ARM-side of things)
>
> On Mon, 09 Dec, at 05:42:15PM, Dave Young wrote:
> > In arch/x86/platform/efi/efi.c and drivers/firmware/efi/efi.c turn to use
> > early_memremap/early_memunmap instead of early_ioremap/early_iounmap so
> > sparse will be happy.
> >
> > Signed-off-by: Dave Young
> > ---
> > arch/x86/platform/efi/efi.c | 20 ++--
> > drivers/firmware/efi/efi.c | 4 ++--
> > 2 files changed, 12 insertions(+), 12 deletions(-)
>
> This looks like a rather nice cleanup but the commit log could use a
> little bit of tweaking...
>
> - Please start your commit title (the part after the subsystem tag)
>   with a capital letter, e.g.
>
>   efi: Use early_memremap...
>
> - You need to explain in the commit title that you're fixing a sparse
>   warning. Anyone reading the patch subject will have no idea _why_
>   you're using early_memremap() and early_memunmap().
>
> - In the commit message body explain why sparse is currently unhappy.

Sure, will do.

--
Thanks for review
Dave
Re: [PATCH 1/3] wait-simple: Introduce the simple waitqueue implementation
On Wed, 11 Dec 2013 20:06:37 -0500 Paul Gortmaker wrote:
> From: Thomas Gleixner
>
> The wait_queue is a swiss army knife and in most of the cases the
> full complexity is not needed. Here we provide a slim version, as
> it lowers memory consumption and runtime overhead.
>
> The concept originated from RT, where waitqueues are a constant
> source of trouble, as we can't convert the head lock to a raw
> spinlock due to fancy and long lasting callbacks.
>
> The smp_mb() was added (by Steven Rostedt) to fix a race condition
> with swait wakeups vs. adding items to the list.

For this part, you can also add my:

Signed-off-by: Steven Rostedt

I'll also look at these and test them a bit against mainline.

Thanks for doing this!

-- Steve

>
> Signed-off-by: Thomas Gleixner
> Signed-off-by: Sebastian Andrzej Siewior
> Cc: Steven Rostedt
> [PG: carry forward from multiple v3.10-rt patches to mainline, align
> function names with "normal" wait queue names, update commit log.]
> Signed-off-by: Paul Gortmaker
> ---
Re: [PATCH] Staging: TIDSPBRIDGE: Use vm_iomap_memory for mmap-ing instead of remap_pfn_range
On Wed, Dec 11, 2013 at 01:27:17PM +0300, Dan Carpenter wrote:
> On Wed, Dec 11, 2013 at 11:57:04AM +0200, Ivaylo Dimitrov wrote:
> > On 11.12.2013 10:33, Dan Carpenter wrote:
> > >On Wed, Dec 11, 2013 at 09:45:52AM +0200, Ivajlo Dimitrov wrote:
> > >>I can pick your changes and re-send the original patch with them
> > >>incorporated if there are no objections. Are you fine with that?
> > >>
> > >Do it on top of staging-next, don't redo the original.
> > >
> > >regards,
> > >dan carpenter
> >
> > I don't see the original patch in the staging-next tree [0], how to
> > proceed? Isn't it better to resend the original patch with Steven's
> > changes included?
> >
> > [0]
> > http://git.kernel.org/cgit/linux/kernel/git/gregkh/staging.git/log/drivers/staging/tidspbridge?h=staging-next
> >
>
> Oops. It's in staging-linus not staging-next. I don't know how Greg
> handles that tree.

The same way I do my others:
  *-next : for the "next" kernel merge window
  *-linus : for Linus's tree now before the -final release comes out.

The original patch here went to Linus, so it was in staging-linus and
it's already in Linus's tree.

thanks,

greg k-h
Re: [RFC PATCH] drivers: char: Add a dynamic clock for the trace clock
On Wed, 11 Dec 2013 17:17:30 -0800 Sonny Rao wrote:
> On Wed, Dec 11, 2013 at 11:30 AM, Stephane Eranian wrote:
> > Sonny,
> >
> > Your patch has a couple of problems for me:
> > - requires CONFIG_TRACING
> >
> > You should directly invoke getrawmonotonic()
> > and inline the code from trace_clock_getres().
> >
> > That's how I managed to compile your kernel module on my system.
>
> You need the changes in kernel/trace/trace.c which is why it's
> dependent on CONFIG_TRACING.
> If we put those functions elsewhere we could remove that dependency,
> but it sounds like people want to just fix the clock that perf uses so
> that it's exportable and not handle this with something like this
> patch, which is better.

I have no issue moving the trace_clock.c code into lib/ and we can add
a CONFIG_TRACE_CLOCK option that can be set by perf and ftrace.

>
> Also, we should ensure that perf and ftrace are guaranteed to use the
> same clock, I think it just happens to be the same right now.

ftrace has several clocks that it uses:

 o local - basically sched_clock()
 o global - something like hpet that is monotonic across CPUs but slower
 o counter - a simple atomic counter (no time associated to it)
 o uptime - jiffy counter
 o perf - trace_clock, which is what perf uses
 o x86_tsc - the raw tsc counter.

 # cat /sys/kernel/debug/trace_clock
 [local] global counter uptime perf x86-tsc

-- Steve
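In the trace_clock listing above, the clock in square brackets is the one currently selected. As a rough userspace illustration, the sketch below pulls that name out of such a listing. It assumes the single-line format shown in the message and nothing more; active_trace_clock() is a hypothetical helper name, not a kernel or library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (currently selected) clock name from a trace_clock
 * listing such as "[local] global counter uptime perf x86-tsc" into out.
 * Returns 0 on success, -1 if no complete "[name]" is found or if out
 * is too small to hold the name plus its terminating NUL.
 */
int active_trace_clock(const char *listing, char *out, size_t outlen)
{
	const char *lb;
	const char *rb;
	size_t len;

	assert(listing && out);

	lb = strchr(listing, '[');
	if (!lb)
		return -1;
	rb = strchr(lb + 1, ']');
	if (!rb)
		return -1;

	len = (size_t)(rb - lb - 1);
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, lb + 1, len);
	out[len] = '\0';
	return 0;
}
```

A tool that wants ftrace and perf timestamps to line up could read the file, call this helper, and compare the result against "perf" before starting a trace.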