random: Wake up writers when random pools are zapped
As it is when the pool is zapped with RNDCLEARPOOL writers are not woken up and therefore the pool may remain in the empty state indefinitely. This patch wakes them up unless the write threshold is set to zero. Signed-off-by: Herbert Xu diff --git a/drivers/char/random.c b/drivers/char/random.c index e027e7f..32b7010 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1874,6 +1874,10 @@ static long random_ioctl(struct file *f, unsigned int cmd, unsigned long arg) return -EPERM; input_pool.entropy_count = 0; blocking_pool.entropy_count = 0; + if (random_write_wakeup_bits) { + wake_up_interruptible(&random_write_wait); + kill_fasync(&fasync, SIGIO, POLL_OUT); + } return 0; default: return -EINVAL; -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
htmldocs: include/net/mac80211.h:950: warning: Function parameter or member 'control.rates' not described in 'ieee80211_tx_info'
Hi Mauro, FYI, the error/warning still remains. tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 3acf4e395260e3bd30a6fa29ba7eada4bf7566ca commit: d404d57955a6f67365423f9d0b89ad1881799087 docs: kernel-doc: fix parsing of arrays date: 7 weeks ago reproduce: make htmldocs All warnings (new ones prefixed by >>): WARNING: convert(1) not found, for SVG to PDF conversion install ImageMagick (https://www.imagemagick.org) include/linux/crypto.h:477: warning: Function parameter or member 'cra_u.ablkcipher' not described in 'crypto_alg' include/linux/crypto.h:477: warning: Function parameter or member 'cra_u.blkcipher' not described in 'crypto_alg' include/linux/crypto.h:477: warning: Function parameter or member 'cra_u.cipher' not described in 'crypto_alg' include/linux/crypto.h:477: warning: Function parameter or member 'cra_u.compress' not described in 'crypto_alg' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.ibss' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.connect' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.keys' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.ie' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.ie_len' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.bssid' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.ssid' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.default_key' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.default_mgmt_key' not described in 'wireless_dev' include/net/cfg80211.h:4129: warning: Function parameter or member 'wext.prev_bssid_valid' not described in 'wireless_dev' include/net/mac80211.h:2259: warning: Function parameter or member 'radiotap_timestamp.units_pos' not described in 'ieee80211_hw' include/net/mac80211.h:2259: warning: Function parameter or member 'radiotap_timestamp.accuracy' not described in 'ieee80211_hw' >> include/net/mac80211.h:950: warning: Function parameter or member >> 'control.rates' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.rts_cts_rate_idx' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.use_rts' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.use_cts_prot' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.short_preamble' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.skip_table' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.jiffies' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.vif' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.hw_key' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.flags' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'control.enqueue_time' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'ack' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'ack.cookie' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'status.rates' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'status.ack_signal' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'status.ampdu_ack_len' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'status.ampdu_len' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'status.antenna' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'status.tx_time' not described in 'ieee80211_tx_info' >> include/net/mac80211.h:950: warning: Function parameter or member >> 'status.status_driver_data' not described in 'ieee80211_tx_info' include/net/mac80211.h:950: warning: Function parameter or member 'driver_rates' n
Re: [PATCH v3 1/6] x86/stacktrace: do not unwind after user regs
* Jiri Slaby wrote: > Josh pointed out, that there is no way a frame can be after user regs. > So remove the last unwind and the check. > > Signed-off-by: Jiri Slaby > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: "H. Peter Anvin" > Cc: x...@kernel.org > Cc: Josh Poimboeuf Josh: an Acked-by or Reviewed-by for the whole series from you would be nice. Thanks, Ingo
Re: [PATCH] ALSA: usb: stream: fix potential memory leak during uac3 interface parsing
On Fri, 18 May 2018 00:08:59 +0200, Ruslan Bilovol wrote: > > UAC3 channel map is created during interface parsing, > and in some cases was not freed in failure paths. > > Reported-by: Dan Carpenter > Signed-off-by: Ruslan Bilovol Applied, thanks. Takashi
Re: [PATCHv7] gpio: Remove VLA from gpiolib
Hi Laura, On Fri, May 18, 2018 at 12:32 AM, Laura Abbott wrote: > The new challenge is to remove VLAs from the kernel > (see https://lkml.org/lkml/2018/3/7/621) to eventually > turn on -Wvla. > > Using a kmalloc array is the easy way to fix this but kmalloc is still > more expensive than stack allocation. Introduce a fast path with a > fixed size stack array to cover most chip with gpios below some fixed > amount. The slow path dynamically allocates an array to cover those > chips with a large number of gpios. > > Reviewed-by: Phil Reid > Reviewed-and-tested-by: Lukas Wunner > Signed-off-by: Lukas Wunner > Signed-off-by: Laura Abbott > --- > v7: Tweaked the Kconfig text to clarify the wording. Also fixed a few > other comments from Geert, including the earlier suggestion to reduce > the zeroing since I was wrong about that. Thanks for the update! With the minor nit below resolved: Reviewed-by: Geert Uytterhoeven > --- a/drivers/gpio/Kconfig > +++ b/drivers/gpio/Kconfig > @@ -22,6 +22,18 @@ menuconfig GPIOLIB > > if GPIOLIB > > +config GPIOLIB_FASTPATH_LIMIT > + int "Maximum number of GPIOs for fast path" > + range 16 512 While you can indeed fit 2 * 16 bits in a long, a lower limit of 16 doesn't make much sense, as you will use 2 longs (mask + bits) anyway. So please increase it to 32. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: Some questions about the spi mem framework
On Fri, 18 May 2018 13:50:00 +0800 Xiangsheng Hou wrote: > Hi Boris, > > On Thu, 2018-05-17 at 09:42 +0200, Boris Brezillon wrote: > > On Thu, 17 May 2018 15:35:04 +0800 > > Xiangsheng Hou wrote: > > > > > On Thu, 2018-05-17 at 09:13 +0200, Boris Brezillon wrote: > > > > On Thu, 17 May 2018 14:58:24 +0800 > > > > Xiangsheng Hou wrote: > > > > > > > > > Hi Boris, > > > > > > > > > > On Wed, 2018-05-16 at 14:42 +0200, Boris Brezillon wrote: > > > > > > On Wed, 16 May 2018 20:11:39 +0800 > > > > > > Xiangsheng Hou wrote: > > > > > > > > > > > > > Hi Boris, > > > > > > > > > > > > > > On Tue, 2018-05-15 at 17:25 +0200, Boris Brezillon wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > On Tue, 15 May 2018 11:43:20 +0800 > > > > > > > > Xiangsheng Hou wrote: > > > > > > > > > > > > > > > > > Hello Boris, > > > > > > > > > > > > > > > > > > I have seen you are working on extend the framework to > > > > > > > > > generically > > > > > > > > > support spi memory devices. > > > > > > > > > And, I am working on upstream SPI Nand driver of Mediatek SPI > > > > > > > > > NAND > > > > > > > > > controller based on your branch[1]. > > > > > > > > > > > > > > > > Great! > > > > > > > > > > > > > > > > > I have some questions need your comment. > > > > > > > > > > > > > > > > > > 1) There is a difference between different SPI NAND Flash > > > > > > > > > when using the > > > > > > > > > Quad SPI command,for example Macronix,Etron and GigaDevice, > > > > > > > > > Quad SPI commands require the Quad Enable bit in Status > > > > > > > > > Register(B0H) to > > > > > > > > > be set. > > > > > > > > > However, current spi-mem framework does not have this > > > > > > > > > operation, > > > > > > > > > do you have a plan to support it? > > > > > > > > > > > > > > > > I added support for the QE bit in the v7 I sent just a few > > > > > > > > minutes ago > > > > > > > > [1]. > > > > > > > > > > > > > > Ok,I have studied v7. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2) I see that current spi-mem framework doesn't support ECC, > > > > > > > > > But we need ECC, and we use Mediatek controller's HW ECC > > > > > > > > > instead of spi nand on-chip ECC, > > > > > > > > > maybe other companies also have this behavior, > > > > > > > > > So the ECC part must be implemented in controller's driver. > > > > > > > > > Will you abstract ECC interface in future? > > > > > > > > > > > > > > > > Well, I added support for on-die ECC in my v7 since all chips > > > > > > > > seem to > > > > > > > > provide this feature. I was initially planning on abstracting > > > > > > > > ECC > > > > > > > > engines, but I decided to go for a simpler approach and only > > > > > > > > support > > > > > > > > on-die ECC. That does not mean we shouldn't work on this "ECC > > > > > > > > engine > > > > > > > > abstraction", just that I wanted to get something out and > > > > > > > > didn't have > > > > > > > > time to spend on this topic. > > > > > > > > > > > > > > > > I'd be happy if someone else could work on that aspect though. > > > > > > > > BTW, do > > > > > > > > you plan to use this engine [2], or is this yet another ECC > > > > > > > > engine? > > > > > > > > > > > > > > Yes,I plan to use this ecc engine[2]. > > > > > > > > > > > > Cool. That probably means we'll have to move the driver one level up > > > > > > (in drivers/mtd/nand) and work on this ECC engine interface I was > > > > > > talking about. > > > > > > > > > > > > > > > 3) You know, some nand controller need configure their > > > > > > > > > registers when > > > > > > > > > getting some information(page size, spare size) of nand flash, > > > > > > > > > But the spi-mem framework doesn't has an interface for > > > > > > > > > scanning NAND > > > > > > > > > flash, when controller driver initialization. > > > > > > > > > > > > > > > > You seem to mix 2 different things: > > > > > > > > - spi-mem: this is generic interface provided by the SPI > > > > > > > > framework to > > > > > > > > send spi_mem_op. There's nothing NOR or NAND specific in > > > > > > > > there, and > > > > > > > > I'd like it to stay like that as much as possible > > > > > > > > - spinand: this the spi-mem driver that is dealing with SPI NAND > > > > > > > > devices, and this is where all the code related to SPI NAND > > > > > > > > support > > > > > > > > should end up. > > > > > > > > > > > > > > > > Can you tell me exactly why your SPI controller needs such a > > > > > > > > detailed > > > > > > > > description? Is it able to program/read pages or erase blocks > > > > > > > > on its > > > > > > > > own? Do you have a spec of this controller publicly available? > > > > > > > > > > > > > > > > > > > > > > For Mediatek SPI Nand controller,I have to configure registers > > > > > > > for ECC > > > > > > > engine,page
[PATCH v3 3/6] x86/stacktrace: clarify the reliable success paths
Make clear which path is for user tasks and for kthreads and idle tasks. This will allow easier plug-in of the ORC unwinder in the next patches. Note that we added a check for unwind error to the top of the loop, so that an error is returned also for user tasks (the 'goto success' would skip the check after the loop otherwise). [v3] check for unwind_error in the loop Signed-off-by: Jiri Slaby Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: x...@kernel.org Cc: Josh Poimboeuf --- arch/x86/kernel/stacktrace.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c index f9dacf6d4667..6acf1d5ca832 100644 --- a/arch/x86/kernel/stacktrace.c +++ b/arch/x86/kernel/stacktrace.c @@ -89,21 +89,24 @@ __save_stack_trace_reliable(struct stack_trace *trace, struct pt_regs *regs; unsigned long addr; - for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state); + for (unwind_start(&state, task, NULL, NULL); +!unwind_done(&state) && !unwind_error(&state); unwind_next_frame(&state)) { regs = unwind_get_entry_regs(&state, NULL); if (regs) { + /* Success path for user tasks */ + if (user_mode(regs)) + goto success; + /* * Kernel mode registers on the stack indicate an * in-kernel interrupt or exception (e.g., preemption * or a page fault), which can make frame pointers * unreliable. */ - if (!user_mode(regs)) - return -EINVAL; - break; + return -EINVAL; } addr = unwind_get_return_address(&state); @@ -124,6 +127,11 @@ __save_stack_trace_reliable(struct stack_trace *trace, if (unwind_error(&state)) return -EINVAL; + /* Success path for non-user tasks, i.e. kthreads and idle tasks */ + if (!(task->flags & (PF_KTHREAD | PF_IDLE))) + return -EINVAL; + +success: if (trace->nr_entries < trace->max_entries) trace->entries[trace->nr_entries++] = ULONG_MAX; -- 2.16.3
[PATCH v3 1/6] x86/stacktrace: do not unwind after user regs
Josh pointed out, that there is no way a frame can be after user regs. So remove the last unwind and the check. Signed-off-by: Jiri Slaby Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: x...@kernel.org Cc: Josh Poimboeuf --- arch/x86/kernel/stacktrace.c | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c index 093f2ea5dd56..8948b7d9c064 100644 --- a/arch/x86/kernel/stacktrace.c +++ b/arch/x86/kernel/stacktrace.c @@ -113,15 +113,6 @@ __save_stack_trace_reliable(struct stack_trace *trace, if (!user_mode(regs)) return -EINVAL; - /* -* The last frame contains the user mode syscall -* pt_regs. Skip it and finish the unwind. -*/ - unwind_next_frame(&state); - if (!unwind_done(&state)) { - STACKTRACE_DUMP_ONCE(task); - return -EINVAL; - } break; } -- 2.16.3
[PATCH v3 4/6] x86/stacktrace: do not fail for ORC with regs on stack
save_stack_trace_reliable now returns "non reliable" when there are kernel pt_regs on stack. This means an interrupt or exception happened somewhere down the route. It is a problem for the frame pointer unwinder, because the frame might not have been set up yet when the irq happened, so the unwinder might fail to unwind from the interrupted function. With ORC, this is not a problem, as ORC has out-of-band data. We can find ORC data even for the IP in the interrupted function and always unwind one level up reliably. So lift the check to apply only when CONFIG_FRAME_POINTER is enabled. [v2] - rewrite the code in favor of Josh's suggestions Signed-off-by: Jiri Slaby Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: x...@kernel.org Cc: Josh Poimboeuf --- arch/x86/kernel/stacktrace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c index 6acf1d5ca832..7627455047c2 100644 --- a/arch/x86/kernel/stacktrace.c +++ b/arch/x86/kernel/stacktrace.c @@ -106,7 +106,8 @@ __save_stack_trace_reliable(struct stack_trace *trace, * unreliable. */ - return -EINVAL; + if (IS_ENABLED(CONFIG_FRAME_POINTER)) + return -EINVAL; } addr = unwind_get_return_address(&state); -- 2.16.3
[PATCH v3 5/6] x86/unwind/orc: Detect the end of the stack
From: Josh Poimboeuf The existing UNWIND_HINT_EMPTY annotations happen to be good indicators of where entry code calls into C code for the first time. So also use them to mark the end of the stack for the ORC unwinder. Use that information to set unwind->error if the ORC unwinder doesn't unwind all the way to the end. This will be needed for enabling HAVE_RELIABLE_STACKTRACE for the ORC unwinder so we can use it with the livepatch consistency model. Thanks to Jiri Slaby for teaching the ORCs about the unwind hints. [v2] this patch is new in v2 Signed-off-by: Josh Poimboeuf Signed-off-by: Jiri Slaby --- arch/x86/entry/entry_64.S | 1 + arch/x86/include/asm/orc_types.h | 2 + arch/x86/include/asm/unwind_hints.h| 16 +--- arch/x86/kernel/unwind_orc.c | 52 +++--- tools/objtool/arch/x86/include/asm/orc_types.h | 2 + tools/objtool/check.c | 1 + tools/objtool/check.h | 2 +- tools/objtool/orc_dump.c | 3 +- tools/objtool/orc_gen.c| 2 + 9 files changed, 52 insertions(+), 29 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 702ff719443b..e40ec8549fdc 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -410,6 +410,7 @@ SYM_CODE_START(ret_from_fork) 1: /* kernel thread */ + UNWIND_HINT_EMPTY movq%r12, %rdi CALL_NOSPEC %rbx /* diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h index 9c9dc579bd7d..46f516dd80ce 100644 --- a/arch/x86/include/asm/orc_types.h +++ b/arch/x86/include/asm/orc_types.h @@ -88,6 +88,7 @@ struct orc_entry { unsignedsp_reg:4; unsignedbp_reg:4; unsignedtype:2; + unsignedend:1; } __packed; /* @@ -101,6 +102,7 @@ struct unwind_hint { s16 sp_offset; u8 sp_reg; u8 type; + u8 end; }; #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h index bae46fc6b9de..0bcdb1279361 100644 --- a/arch/x86/include/asm/unwind_hints.h +++ b/arch/x86/include/asm/unwind_hints.h @@ -26,7 +26,7 @@ * the debuginfo as necessary. It will also warn if it sees any * inconsistencies. */ -.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL +.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL end=0 #ifdef CONFIG_STACK_VALIDATION .Lunwind_hint_ip_\@: .pushsection .discard.unwind_hints @@ -35,12 +35,14 @@ .short \sp_offset .byte \sp_reg .byte \type + .byte \end + .balign 4 .popsection #endif .endm .macro UNWIND_HINT_EMPTY - UNWIND_HINT sp_reg=ORC_REG_UNDEFINED + UNWIND_HINT sp_reg=ORC_REG_UNDEFINED end=1 .endm .macro UNWIND_HINT_REGS base=%rsp offset=0 indirect=0 extra=1 iret=0 @@ -86,19 +88,21 @@ #else /* !__ASSEMBLY__ */ -#define UNWIND_HINT(sp_reg, sp_offset, type) \ +#define UNWIND_HINT(sp_reg, sp_offset, type, end) \ "987: \n\t" \ ".pushsection .discard.unwind_hints\n\t"\ /* struct unwind_hint */\ ".long 987b - .\n\t"\ - ".short " __stringify(sp_offset) "\n\t" \ + ".short " __stringify(sp_offset) "\n\t" \ ".byte " __stringify(sp_reg) "\n\t" \ ".byte " __stringify(type) "\n\t" \ + ".byte " __stringify(end) "\n\t"\ + ".balign 4 \n\t"\ ".popsection\n\t" -#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE) +#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE, 0) -#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE) +#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE, 0) #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index feb28fee6cea..26038eacf74a 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -198,7 +198,7 @@ static int orc_sort_cmp(const void *_a, const void *_b) * whitelisted .o files which didn't get objtool generation. */ orc_a = cur_orc_table + (a - cur_orc_ip_table); - return orc_a->sp_reg == ORC_REG_UNDEFINED ? -1 : 1; + return orc_a->sp_reg == ORC_REG_UNDEFINED && !orc_a->end ? -1 : 1; } #ifdef CONFIG_MODULES @@ -352,7 +352,7 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr bool unwind_next_frame(struct unwind_state *state) { -
[PATCH v3 2/6] x86/stacktrace: remove STACKTRACE_DUMP_ONCE
The stack unwinding can sometimes fail yet. Especially with the generated debug info. So do not yell at users -- live patching (the only user of this interface) will inform the user about the failure gracefully. And given this was the only user of the macro, remove the macro proper too. [v3] remove also the macro Signed-off-by: Jiri Slaby Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: x...@kernel.org Cc: Josh Poimboeuf --- arch/x86/kernel/stacktrace.c | 18 ++ 1 file changed, 2 insertions(+), 16 deletions(-) diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c index 8948b7d9c064..f9dacf6d4667 100644 --- a/arch/x86/kernel/stacktrace.c +++ b/arch/x86/kernel/stacktrace.c @@ -81,16 +81,6 @@ EXPORT_SYMBOL_GPL(save_stack_trace_tsk); #ifdef CONFIG_HAVE_RELIABLE_STACKTRACE -#define STACKTRACE_DUMP_ONCE(task) ({ \ - static bool __section(.data.unlikely) __dumped; \ - \ - if (!__dumped) {\ - __dumped = true;\ - WARN_ON(1); \ - show_stack(task, NULL); \ - } \ -}) - static int __always_inline __save_stack_trace_reliable(struct stack_trace *trace, struct task_struct *task) @@ -123,20 +113,16 @@ __save_stack_trace_reliable(struct stack_trace *trace, * generated code which __kernel_text_address() doesn't know * about. */ - if (!addr) { - STACKTRACE_DUMP_ONCE(task); + if (!addr) return -EINVAL; - } if (save_stack_address(trace, addr, false)) return -EINVAL; } /* Check for stack corruption */ - if (unwind_error(&state)) { - STACKTRACE_DUMP_ONCE(task); + if (unwind_error(&state)) return -EINVAL; - } if (trace->nr_entries < trace->max_entries) trace->entries[trace->nr_entries++] = ULONG_MAX; -- 2.16.3
[PATCH v3 6/6] x86/stacktrace: enable HAVE_RELIABLE_STACKTRACE for the ORC unwinder
In SUSE, we need a reliable stack unwinder for kernel live patching, but we do not want to enable frame pointers for performance reasons. So after the previous patches to make the ORC reliable, mark ORC as a reliable stack unwinder on x86. Signed-off-by: Jiri Slaby Cc: Josh Poimboeuf Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: x...@kernel.org --- arch/x86/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 47e7f582f86a..e4199fbcc7f2 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -181,7 +181,7 @@ config X86 select HAVE_PERF_USER_STACK_DUMP select HAVE_RCU_TABLE_FREE select HAVE_REGS_AND_STACK_ACCESS_API - select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION + select HAVE_RELIABLE_STACKTRACE if X86_64 && (UNWINDER_FRAME_POINTER || UNWINDER_ORC) && STACK_VALIDATION select HAVE_STACK_VALIDATIONif X86_64 select HAVE_SYSCALL_TRACEPOINTS select HAVE_UNSTABLE_SCHED_CLOCK -- 2.16.3
Re: [RFC PATCH 00/09] Implement direct user I/O interfaces for RDMA
On Fri, May 18, 2018 at 06:03:09AM +, Long Li wrote: > I also want to point out that, I choose to implement .read_iter and > .write_iter from file_operations to implement direct I/O (CIFS is already > doing this for O_DIRECT, so following this code path will avoid a big mess > up). The ideal choice is to implement .direct_IO from > address_space_operations that I think eventually we want to move to. No, the direct_IO address space operation is the mess. We're moving away from it.
Re: [RFC PATCH 00/09] Implement direct user I/O interfaces for RDMA
On Thu, May 17, 2018 at 07:10:04PM -0400, Tom Talpey wrote: > What's the security risk? This type of direct i/o behavior is not > uncommon, and can certainly be made safe, using the appropriate > memory registration and protection domains. Any risk needs to be > stated explicitly, and mitigation provided, or at least described. And in fact it is the same behavior you'll see on NFS over RDMA, or a block device or any local fs over SRP/iSER/NVMe over Fabrics..
[PATCH net-next 1/2] net: mscc: ocelot: add bonding support
Add link aggregation hardware offload support for Ocelot. ocelot_get_link_ksettings() is not great but it does work until the driver is reworked to switch to phylink. Signed-off-by: Alexandre Belloni --- drivers/net/ethernet/mscc/ocelot.c | 177 + drivers/net/ethernet/mscc/ocelot.h | 2 +- 2 files changed, 178 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index c8c74aa548d9..f22f5467001e 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -775,10 +775,43 @@ static int ocelot_get_sset_count(struct net_device *dev, int sset) return ocelot->num_stats; } +static int ocelot_get_link_ksettings(struct net_device *dev, +struct ethtool_link_ksettings *ks) +{ + ethtool_link_ksettings_zero_link_mode(ks, supported); + ethtool_link_ksettings_zero_link_mode(ks, advertising); + + ethtool_link_ksettings_add_link_mode(ks, supported, 10baseT_Half); + ethtool_link_ksettings_add_link_mode(ks, supported, 10baseT_Full); + ethtool_link_ksettings_add_link_mode(ks, supported, 100baseT_Half); + ethtool_link_ksettings_add_link_mode(ks, supported, 100baseT_Full); + ethtool_link_ksettings_add_link_mode(ks, supported, 1000baseT_Half); + ethtool_link_ksettings_add_link_mode(ks, supported, 1000baseT_Full); + ethtool_link_ksettings_add_link_mode(ks, supported, 2500baseT_Full); + ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg); + + if (!dev->phydev) + return 0; + + ethtool_link_ksettings_add_link_mode(ks, advertising, 10baseT_Full); + ethtool_link_ksettings_add_link_mode(ks, advertising, 100baseT_Full); + ethtool_link_ksettings_add_link_mode(ks, advertising, 1000baseT_Full); + ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg); + + if (!dev->phydev->link) + return 0; + + ks->base.speed = dev->phydev->speed; + ks->base.duplex = DUPLEX_FULL; + + return 0; +} + static const struct ethtool_ops ocelot_ethtool_ops = { .get_strings= ocelot_get_strings, .get_ethtool_stats = ocelot_get_ethtool_stats, .get_sset_count = ocelot_get_sset_count, + .get_link_ksettings = ocelot_get_link_ksettings, }; static int ocelot_port_attr_get(struct net_device *dev, @@ -1087,6 +1120,137 @@ static void ocelot_port_bridge_leave(struct ocelot_port *ocelot_port, ocelot->hw_bridge_dev = NULL; } +static void ocelot_set_aggr_pgids(struct ocelot *ocelot) +{ + int i, port, lag; + + /* Reset destination and aggregation PGIDS */ + for (port = 0; port < ocelot->num_phys_ports; port++) + ocelot_write_rix(ocelot, BIT(port), ANA_PGID_PGID, port); + + for (i = PGID_AGGR; i < PGID_SRC; i++) + ocelot_write_rix(ocelot, GENMASK(ocelot->num_phys_ports - 1, 0), +ANA_PGID_PGID, i); + + /* Now, set PGIDs for each LAG */ + for (lag = 0; lag < ocelot->num_phys_ports; lag++) { + unsigned long bond_mask; + int aggr_count = 0; + u8 aggr_idx[16]; + + bond_mask = ocelot->lags[lag]; + if (!bond_mask) + continue; + + for_each_set_bit(port, &bond_mask, ocelot->num_phys_ports) { + // Destination mask + ocelot_write_rix(ocelot, bond_mask, +ANA_PGID_PGID, port); + aggr_idx[aggr_count] = port; + aggr_count++; + } + + for (i = PGID_AGGR; i < PGID_SRC; i++) { + u32 ac; + + ac = ocelot_read_rix(ocelot, ANA_PGID_PGID, i); + ac &= ~bond_mask; + ac |= BIT(aggr_idx[i % aggr_count]); + ocelot_write_rix(ocelot, ac, ANA_PGID_PGID, i); + } + } +} + +static void ocelot_setup_lag(struct ocelot *ocelot, int lag) +{ + unsigned long bond_mask = ocelot->lags[lag]; + unsigned int p; + + for_each_set_bit(p, &bond_mask, ocelot->num_phys_ports) { + u32 port_cfg = ocelot_read_gix(ocelot, ANA_PORT_PORT_CFG, p); + + port_cfg &= ~ANA_PORT_PORT_CFG_PORTID_VAL_M; + + /* Use lag port as logical port for port i */ + ocelot_write_gix(ocelot, port_cfg | +ANA_PORT_PORT_CFG_PORTID_VAL(lag), +ANA_PORT_PORT_CFG, p); + } +} + +static int ocelot_port_lag_join(struct ocelot_port *ocelot_port, + struct net_device *bond) +{ + struct ocelot *ocelot = ocelot_port->ocelot; + int p = ocelot_port->chip_port; + int lag, lp; + struct net_devi
[PATCH net-next 0/2] net: mscc; ocelot: add more features
Hi, This series adds link aggregation and VLAN filtering hardware offload support to the ocelot driver. PTP is also on the list of features but it will probably not be submitted this cycle. Alexandre Belloni (1): net: mscc: ocelot: add bonding support Antoine Tenart (1): net: mscc: ocelot: add VLAN filtering drivers/net/ethernet/mscc/ocelot.c | 462 - drivers/net/ethernet/mscc/ocelot.h | 2 +- 2 files changed, 461 insertions(+), 3 deletions(-) -- 2.17.0
[PATCH net-next 2/2] net: mscc: ocelot: add VLAN filtering
From: Antoine Tenart Add hardware VLAN filtering offloading on ocelot. Signed-off-by: Antoine Tenart Signed-off-by: Alexandre Belloni --- drivers/net/ethernet/mscc/ocelot.c | 285 - 1 file changed, 283 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index f22f5467001e..84a7687455a5 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -148,12 +148,191 @@ static inline int ocelot_vlant_wait_for_completion(struct ocelot *ocelot) return 0; } +static int ocelot_vlant_set_mask(struct ocelot *ocelot, u16 vid, u32 mask) +{ + /* Select the VID to configure */ + ocelot_write(ocelot, ANA_TABLES_VLANTIDX_V_INDEX(vid), +ANA_TABLES_VLANTIDX); + /* Set the vlan port members mask and issue a write command */ + ocelot_write(ocelot, ANA_TABLES_VLANACCESS_VLAN_PORT_MASK(mask) | +ANA_TABLES_VLANACCESS_CMD_WRITE, +ANA_TABLES_VLANACCESS); + + return ocelot_vlant_wait_for_completion(ocelot); +} + +static void ocelot_vlan_mode(struct ocelot_port *port, +netdev_features_t features) +{ + struct ocelot *ocelot = port->ocelot; + u8 p = port->chip_port; + u32 val; + + /* Filtering */ + val = ocelot_read(ocelot, ANA_VLANMASK); + if (features & NETIF_F_HW_VLAN_CTAG_FILTER) + val |= BIT(p); + else + val &= ~BIT(p); + ocelot_write(ocelot, val, ANA_VLANMASK); +} + +static void ocelot_vlan_port_apply(struct ocelot *ocelot, + struct ocelot_port *port) +{ + u32 val; + + /* Ingress clasification (ANA_PORT_VLAN_CFG) */ + /* Default vlan to clasify for untagged frames (may be zero) */ + val = ANA_PORT_VLAN_CFG_VLAN_VID(port->pvid); + if (port->vlan_aware) + val |= ANA_PORT_VLAN_CFG_VLAN_AWARE_ENA | + ANA_PORT_VLAN_CFG_VLAN_POP_CNT(1); + + ocelot_rmw_gix(ocelot, val, + ANA_PORT_VLAN_CFG_VLAN_VID_M | + ANA_PORT_VLAN_CFG_VLAN_AWARE_ENA | + ANA_PORT_VLAN_CFG_VLAN_POP_CNT_M, + ANA_PORT_VLAN_CFG, port->chip_port); + + /* Drop frames with multicast source address */ + val = ANA_PORT_DROP_CFG_DROP_MC_SMAC_ENA; + if (port->vlan_aware && !port->vid) + /* If port is vlan-aware and tagged, drop untagged and priority +* tagged frames. +*/ + val |= ANA_PORT_DROP_CFG_DROP_UNTAGGED_ENA | + ANA_PORT_DROP_CFG_DROP_PRIO_S_TAGGED_ENA | + ANA_PORT_DROP_CFG_DROP_PRIO_C_TAGGED_ENA; + ocelot_write_gix(ocelot, val, ANA_PORT_DROP_CFG, port->chip_port); + + /* Egress configuration (REW_TAG_CFG): VLAN tag type to 8021Q. */ + val = REW_TAG_CFG_TAG_TPID_CFG(0); + + if (port->vlan_aware) { + if (port->vid) + /* Tag all frames except when VID == DEFAULT_VLAN */ + val |= REW_TAG_CFG_TAG_CFG(1); + else + /* Tag all frames */ + val |= REW_TAG_CFG_TAG_CFG(3); + } + ocelot_rmw_gix(ocelot, val, + REW_TAG_CFG_TAG_TPID_CFG_M | + REW_TAG_CFG_TAG_CFG_M, + REW_TAG_CFG, port->chip_port); + + /* Set default VLAN and tag type to 8021Q. */ + val = REW_PORT_VLAN_CFG_PORT_TPID(ETH_P_8021Q) | + REW_PORT_VLAN_CFG_PORT_VID(port->vid); + ocelot_rmw_gix(ocelot, val, + REW_PORT_VLAN_CFG_PORT_TPID_M | + REW_PORT_VLAN_CFG_PORT_VID_M, + REW_PORT_VLAN_CFG, port->chip_port); +} + +static int ocelot_vlan_vid_add(struct net_device *dev, u16 vid, bool pvid, + bool untagged) +{ + struct ocelot_port *port = netdev_priv(dev); + struct ocelot *ocelot = port->ocelot; + int ret; + + /* Add the port MAC address to with the right VLAN information */ + ocelot_mact_learn(ocelot, PGID_CPU, dev->dev_addr, vid, + ENTRYTYPE_LOCKED); + + /* Make the port a member of the VLAN */ + ocelot->vlan_mask[vid] |= BIT(port->chip_port); + ret = ocelot_vlant_set_mask(ocelot, vid, ocelot->vlan_mask[vid]); + if (ret) + return ret; + + /* Default ingress vlan classification */ + if (pvid) + port->pvid = vid; + + /* Untagged egress vlan clasification */ + if (untagged) + port->vid = vid; + + ocelot_vlan_port_apply(ocelot, port); + + return 0; +} + +static int ocelot_vlan_vid_del(struct net_device *dev, u16 vid) +{ + struct ocelot_port *port = netdev_priv(dev); + struct ocelot *ocelot = port
Re: [PATCH] printk/nmi: Prevent deadlock when serializing NMI backtraces
On (05/18/18 11:07), Sergey Senozhatsky wrote: > > if (this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK) || > raw_spin_is_locked(&logbuf_lock) > > just to check per-CPU `printk_context' first and only afterwards > access the global `logbuf_lock'. printk_nmi_enter() happens on > every CPU, so maybe we can avoid some overhead by checking the > local per-CPU data first. Nah, may be it won't. This, probably, would have been the case if we had continue to call console drivers from printk_safe section [at least]. CPUs don't spend that much time in printk_safe sections. -ss
Re: [RFC PATCH 01/09] Introduce offset for the 1st page in data transfer structures
merged into cifs-2.6.git for-next On Thu, May 17, 2018 at 7:22 PM, Long Li wrote: > From: Long Li > > Currently CIFS allocates its own pages for data transfer, they don't need > offset > since it's always 0 in the 1st page. > > Direct data transfer needs to define an offset because user-data may not start > on the page boundary > > Signed-off-by: Long Li > --- > fs/cifs/cifsglob.h | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h > index cb950a5..a51855c 100644 > --- a/fs/cifs/cifsglob.h > +++ b/fs/cifs/cifsglob.h > @@ -176,6 +176,7 @@ struct smb_rqst { > struct kvec *rq_iov;/* array of kvecs */ > unsigned intrq_nvec;/* number of kvecs in array */ > struct page **rq_pages; /* pointer to array of page ptrs */ > + unsigned intrq_offset; /* the offset to the 1st page */ > unsigned intrq_npages; /* number pages in array */ > unsigned intrq_pagesz; /* page size to use */ > unsigned intrq_tailsz; /* length of last page */ > @@ -1167,8 +1168,10 @@ struct cifs_readdata { > struct kvec iov[2]; > #ifdef CONFIG_CIFS_SMB_DIRECT > struct smbd_mr *mr; > + struct page **direct_pages; > #endif > unsigned intpagesz; > + unsigned intpage_offset; > unsigned inttailsz; > unsigned intcredits; > unsigned intnr_pages; > @@ -1192,8 +1195,10 @@ struct cifs_writedata { > int result; > #ifdef CONFIG_CIFS_SMB_DIRECT > struct smbd_mr *mr; > + struct page **direct_pages; > #endif > unsigned intpagesz; > + unsigned intpage_offset; > unsigned inttailsz; > unsigned intcredits; > unsigned intnr_pages; > -- > 2.7.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Thanks, Steve
Re: [PATCH v4 3/4] tpm: migrate tpm2_get_tpm_pt() to use struct tpm_buf
On 03/26/2018 05:44 PM, Jarkko Sakkinen wrote: In order to make struct tpm_buf the first class object for constructing TPM commands, migrate tpm2_get_tpm_pt() to use it. Signed-off-by: Jarkko Sakkinen Reviewed-by: Nayna Jain Tested-by: Nayna Jain Thanks & Regards, - Nayna --- drivers/char/tpm/tpm2-cmd.c | 63 + 1 file changed, 23 insertions(+), 40 deletions(-) diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c index 7bffd0fd1dca..b3b52f9eb65f 100644 --- a/drivers/char/tpm/tpm2-cmd.c +++ b/drivers/char/tpm/tpm2-cmd.c @@ -27,20 +27,6 @@ enum tpm2_session_attributes { TPM2_SA_CONTINUE_SESSION= BIT(0), }; -struct tpm2_get_tpm_pt_in { - __be32 cap_id; - __be32 property_id; - __be32 property_cnt; -} __packed; - -struct tpm2_get_tpm_pt_out { - u8 more_data; - __be32 subcap_id; - __be32 property_cnt; - __be32 property_id; - __be32 value; -} __packed; - struct tpm2_get_random_in { __be16 size; } __packed; @@ -51,8 +37,6 @@ struct tpm2_get_random_out { } __packed; union tpm2_cmd_params { - struct tpm2_get_tpm_pt_in get_tpm_pt_in; - struct tpm2_get_tpm_pt_out get_tpm_pt_out; struct tpm2_get_random_in getrandom_in; struct tpm2_get_random_out getrandom_out; }; @@ -379,19 +363,6 @@ int tpm2_get_random(struct tpm_chip *chip, u8 *out, size_t max) return total ? total : -EIO; } -#define TPM2_GET_TPM_PT_IN_SIZE \ - (sizeof(struct tpm_input_header) + \ -sizeof(struct tpm2_get_tpm_pt_in)) - -#define TPM2_GET_TPM_PT_OUT_BODY_SIZE \ -sizeof(struct tpm2_get_tpm_pt_out) - -static const struct tpm_input_header tpm2_get_tpm_pt_header = { - .tag = cpu_to_be16(TPM2_ST_NO_SESSIONS), - .length = cpu_to_be32(TPM2_GET_TPM_PT_IN_SIZE), - .ordinal = cpu_to_be32(TPM2_CC_GET_CAPABILITY) -}; - /** * tpm2_flush_context_cmd() - execute a TPM2_FlushContext command * @chip: TPM chip to use @@ -725,6 +696,14 @@ int tpm2_unseal_trusted(struct tpm_chip *chip, return rc; } +struct tpm2_get_cap_out { + u8 more_data; + __be32 subcap_id; + __be32 property_cnt; + __be32 property_id; + __be32 value; +} __packed; + /** * tpm2_get_tpm_pt() - get value of a TPM_CAP_TPM_PROPERTIES type property * @chip: TPM chip to use. @@ -737,19 +716,23 @@ int tpm2_unseal_trusted(struct tpm_chip *chip, ssize_t tpm2_get_tpm_pt(struct tpm_chip *chip, u32 property_id, u32 *value, const char *desc) { - struct tpm2_cmd cmd; + struct tpm2_get_cap_out *out; + struct tpm_buf buf; int rc; - cmd.header.in = tpm2_get_tpm_pt_header; - cmd.params.get_tpm_pt_in.cap_id = cpu_to_be32(TPM2_CAP_TPM_PROPERTIES); - cmd.params.get_tpm_pt_in.property_id = cpu_to_be32(property_id); - cmd.params.get_tpm_pt_in.property_cnt = cpu_to_be32(1); - - rc = tpm_transmit_cmd(chip, NULL, &cmd, sizeof(cmd), - TPM2_GET_TPM_PT_OUT_BODY_SIZE, 0, desc); - if (!rc) - *value = be32_to_cpu(cmd.params.get_tpm_pt_out.value); - + rc = tpm_buf_init(&buf, TPM2_ST_NO_SESSIONS, TPM2_CC_GET_CAPABILITY); + if (rc) + return rc; + tpm_buf_append_u32(&buf, TPM2_CAP_TPM_PROPERTIES); + tpm_buf_append_u32(&buf, property_id); + tpm_buf_append_u32(&buf, 1); + rc = tpm_transmit_cmd(chip, NULL, buf.data, PAGE_SIZE, 0, 0, NULL); + if (!rc) { + out = (struct tpm2_get_cap_out *) + &buf.data[TPM_HEADER_SIZE]; + *value = be32_to_cpu(out->value); + } + tpm_buf_destroy(&buf); return rc; } EXPORT_SYMBOL_GPL(tpm2_get_tpm_pt);
Re: [PATCH v4 2/4] tpm: migrate tpm2_probe() to use struct tpm_buf
On 03/26/2018 05:44 PM, Jarkko Sakkinen wrote: In order to make struct tpm_buf the first class object for constructing TPM commands, migrate tpm2_probe() to use it. Signed-off-by: Jarkko Sakkinen Acked-by: Jay Freyensee Reviewed-by: Nayna Jain Tested-by: Nayna Jain Thanks & Regards, - Nayna --- drivers/char/tpm/tpm2-cmd.c | 37 + 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c index 7665661d9230..7bffd0fd1dca 100644 --- a/drivers/char/tpm/tpm2-cmd.c +++ b/drivers/char/tpm/tpm2-cmd.c @@ -844,30 +844,35 @@ static int tpm2_do_selftest(struct tpm_chip *chip) /** * tpm2_probe() - probe TPM 2.0 - * @chip: TPM chip to use + * @chip: a TPM chip to probe * - * Return: < 0 error and 0 on success. + * Return: 0 on success, + * -errno otherwise * - * Send idempotent TPM 2.0 command and see whether TPM 2.0 chip replied based on - * the reply tag. + * Send an idempotent TPM 2.0 command and see whether there is TPM2 chip in the + * other end based on the response tag. The flag TPM_CHIP_FLAG_TPM2 is set if + * this is the case. */ int tpm2_probe(struct tpm_chip *chip) { - struct tpm2_cmd cmd; + struct tpm_output_header *out; + struct tpm_buf buf; int rc; - cmd.header.in = tpm2_get_tpm_pt_header; - cmd.params.get_tpm_pt_in.cap_id = cpu_to_be32(TPM2_CAP_TPM_PROPERTIES); - cmd.params.get_tpm_pt_in.property_id = cpu_to_be32(0x100); - cmd.params.get_tpm_pt_in.property_cnt = cpu_to_be32(1); - - rc = tpm_transmit_cmd(chip, NULL, &cmd, sizeof(cmd), 0, 0, NULL); - if (rc < 0) + rc = tpm_buf_init(&buf, TPM2_ST_NO_SESSIONS, TPM2_CC_GET_CAPABILITY); + if (rc) return rc; - - if (be16_to_cpu(cmd.header.out.tag) == TPM2_ST_NO_SESSIONS) - chip->flags |= TPM_CHIP_FLAG_TPM2; - + tpm_buf_append_u32(&buf, TPM2_CAP_TPM_PROPERTIES); + tpm_buf_append_u32(&buf, TPM_PT_TOTAL_COMMANDS); + tpm_buf_append_u32(&buf, 1); + rc = tpm_transmit_cmd(chip, NULL, buf.data, PAGE_SIZE, 0, 0, NULL); + /* We ignore TPM return codes on purpose. */ + if (rc >= 0) { + out = (struct tpm_output_header *)buf.data; + if (be16_to_cpu(out->tag) == TPM2_ST_NO_SESSIONS) + chip->flags |= TPM_CHIP_FLAG_TPM2; + } + tpm_buf_destroy(&buf); return 0; } EXPORT_SYMBOL_GPL(tpm2_probe);
Re: [PATCH v4 1/4] tpm: migrate tpm2_shutdown() to use struct tpm_buf
On 03/26/2018 05:44 PM, Jarkko Sakkinen wrote: In order to make struct tpm_buf the first class object for constructing TPM commands, migrate tpm2_shutdown() to use it. In addition, removed the klog entry when tpm_transmit_cmd() fails because tpm_tansmit_cmd() already prints an error message. Signed-off-by: Jarkko Sakkinen Reviewed-by: Nayna Jain Tested-by: Nayna Jain --- drivers/char/tpm/tpm2-cmd.c | 44 1 file changed, 12 insertions(+), 32 deletions(-) diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c index 96c77c8e7f40..7665661d9230 100644 --- a/drivers/char/tpm/tpm2-cmd.c +++ b/drivers/char/tpm/tpm2-cmd.c @@ -27,10 +27,6 @@ enum tpm2_session_attributes { TPM2_SA_CONTINUE_SESSION= BIT(0), }; -struct tpm2_startup_in { - __be16 startup_type; -} __packed; - struct tpm2_get_tpm_pt_in { __be32 cap_id; __be32 property_id; @@ -55,7 +51,6 @@ struct tpm2_get_random_out { } __packed; union tpm2_cmd_params { - struct tpm2_startup_in startup_in; struct tpm2_get_tpm_pt_in get_tpm_pt_in; struct tpm2_get_tpm_pt_out get_tpm_pt_out; struct tpm2_get_random_in getrandom_in; @@ -412,11 +407,8 @@ void tpm2_flush_context_cmd(struct tpm_chip *chip, u32 handle, int rc; rc = tpm_buf_init(&buf, TPM2_ST_NO_SESSIONS, TPM2_CC_FLUSH_CONTEXT); - if (rc) { - dev_warn(&chip->dev, "0x%08x was not flushed, out of memory\n", -handle); + if (rc) return; - } tpm_buf_append_u32(&buf, handle); @@ -762,40 +754,28 @@ ssize_t tpm2_get_tpm_pt(struct tpm_chip *chip, u32 property_id, u32 *value, } EXPORT_SYMBOL_GPL(tpm2_get_tpm_pt); -#define TPM2_SHUTDOWN_IN_SIZE \ - (sizeof(struct tpm_input_header) + \ -sizeof(struct tpm2_startup_in)) - -static const struct tpm_input_header tpm2_shutdown_header = { - .tag = cpu_to_be16(TPM2_ST_NO_SESSIONS), - .length = cpu_to_be32(TPM2_SHUTDOWN_IN_SIZE), - .ordinal = cpu_to_be32(TPM2_CC_SHUTDOWN) -}; - /** * tpm2_shutdown() - send shutdown command to the TPM chip * + * In places where shutdown command is sent there's no much we can do except + * print the error code on a system failure. + * * @chip: TPM chip to use. * @shutdown_type:shutdown type. The value is either *TPM_SU_CLEAR or TPM_SU_STATE. */ void tpm2_shutdown(struct tpm_chip *chip, u16 shutdown_type) { - struct tpm2_cmd cmd; + struct tpm_buf buf; int rc; - cmd.header.in = tpm2_shutdown_header; - cmd.params.startup_in.startup_type = cpu_to_be16(shutdown_type); - - rc = tpm_transmit_cmd(chip, NULL, &cmd, sizeof(cmd), 0, 0, - "stopping the TPM"); - - /* In places where shutdown command is sent there's no much we can do -* except print the error code on a system failure. -*/ - if (rc < 0 && rc != -EPIPE) - dev_warn(&chip->dev, "transmit returned %d while stopping the TPM", -rc); + rc = tpm_buf_init(&buf, TPM2_ST_NO_SESSIONS, TPM2_CC_SHUTDOWN); + if (rc) + return; + tpm_buf_append_u16(&buf, shutdown_type); + tpm_transmit_cmd(chip, NULL, buf.data, PAGE_SIZE, 0, 0, +"stopping the TPM"); + tpm_buf_destroy(&buf); } /*
Re: [PATCH 1/2] dt-bindings: power: Add ZynqMP power domain bindings
Hi Jolly, On 2018-05-17 23:10, Jolly Shah wrote: >> +Example: >> + zynqmp-genpd { >> + compatible = "xlnx,zynqmp-genpd"; > What's the control interface for controlling the domains? >> + >> + pd_usb0: pd-usb0 { >> + pd-id = <22>; >> + #power-domain-cells = <0>; > There's no need for all these sub nodes. Make #power-domain-cells 1 > and put the id in the cell value. That was my first reaction, too... >> + }; >> + >> + pd_sata: pd-sata { >> + pd-id = <28>; >> + #power-domain-cells = <0>; >> + }; >> + >> + pd_gpu: pd-gpu { >> + pd-id = <58 20 21>; ... until I saw the above. Controlling the GPU power area requires controlling 3 physical areas? However, doing it this way may bite you in the future, if a need arises to control a subset. And what about power up/down order? >>> What about defining 3 separate domains and arranging them in >>> parent-child relationship? generic power domains already supports that >>> and this allows to nicely define the power on/off order. >>> >> + #power-domain-cells = <0x0>; >> + }; >> + }; >> I agree it should be arranged in as parent child order to control subset or >> control >> order. Will incorporate those changes in next version. > > As suggested, I tried out parent, child approach. However what I found is > Genpd core takes care of parent child dependencies for power on off routines > only. In our case, We need them in attach-detach routines too. In that case, > we need to handle dependencies manually for those routines. Please suggest > better approach, if any. What do you mean to handle attach-detach? Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
Re: Revert "dmaengine: pl330: add DMA_PAUSE feature"
Hi Vinod, On 2018-05-18 06:03, Vinod wrote: > On 17-05-18, 12:20, Frank Mori Hess wrote: >> Sorry to keep coming back to this, but I'm experiencing a bit of >> incredulity that you are saying what you seem to be saying. You seem >> to be saying dmaengine provides no way to permanently stop a transfer >> safely other than transferring the full number of bytes initially >> requested. So the proper resolution is the 8250 serial driver needs >> to remove rx dma support, because they are just trying to do something >> that is not supported. ... >> I see two ways of interpreting what you are saying. First, from the >> point of view of the user of the dmaengine api. From this point of >> view it is impossible for data loss to occur during pause or reading >> the residue, so saying they cause no data loss during >> pause/residue/terminate is meaningless. This is because the user >> can't confirm any data loss until after they have read the residue and >> the transfer is terminated, since optimistically the data may still be >> available if only the user would resume and allow the transfer to >> continue. >> >> Second there is the interpretation I want to believe. This is "no >> data loss on pause" means that after the pause, no data has been >> discarded by the dma controller hardware, in fact all the data it has >> read before being paused has been fully transferred to its >> destination. Reading the residue while paused gives you an accurate, >> up-to-date state of the paused transfer. Then finally, although in >> general dma terminate causes data loss, it does not in this case since >> we terminated while we were paused and read the up-to-date residue. >> This is the interpretation implicit in the 8250 serial driver. > You are simply mixing things up! On Pause we don't expect data loss, as user > can > resume the transfer. This means as you rightly guessed, the DMA HW should not > drop > any data, nor should SW. > > Now if you want to read residue at this point it is perfectly valid. But if > you > decide to terminate the channel (yes it is terminate_all API), we abort and > don't > have context to report back! > > As Lars rightly pointed out, residue calculation are very tricky, DMA fifo may > have data, some data may be in device FIFO, so residue is always from DMA > point > of view and may differ from device view (more or less depending upon > direction) > > Now if you require to add more features for your usecase, please do feel free > to > send a patch. The framework can always be improved, we haven't solved world > hunger yet! Okay, I see that in theory, there are some tricky bits in implementing DMA support in UART drivers. On the other hand there are already drivers with seems to be working fine. This discussion is about revert of the feature present in pl330 driver, which is required by other in-kernel driver without proper fix to them. Can we drop it for now and discuss what and how should be implemented to make everyone happy, without any regressions? Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
[PATCH v3 1/3] scsi: ufs: Extract devfreq registration
Failing to register with devfreq leaves hba->devfreq assigned, which causes the error path to dereference the ERR_PTR(). Rather than bolting on more conditionals, move the call of devm_devfreq_add_device() into it's own function and only update hba->devfreq once it's successfully registered. The subsequent patch builds upon this to make UFS actually work again, as it's been broken since f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency") Also switch to use DEVFREQ_GOV_SIMPLE_ONDEMAND constant. Reviewed-by: Chanwoo Choi Signed-off-by: Bjorn Andersson --- Changes since v2: - Use DEVFREQ_GOV_SIMPLE_ONDEMAND, per Chanwoo's recommendation - Picked up Chanwoo's R-b Changes since v1: - None drivers/scsi/ufs/ufshcd.c | 31 ++- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 00e79057f870..f04902a066cb 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -1287,6 +1287,26 @@ static struct devfreq_dev_profile ufs_devfreq_profile = { .get_dev_status = ufshcd_devfreq_get_dev_status, }; +static int ufshcd_devfreq_init(struct ufs_hba *hba) +{ + struct devfreq *devfreq; + int ret; + + devfreq = devm_devfreq_add_device(hba->dev, + &ufs_devfreq_profile, + DEVFREQ_GOV_SIMPLE_ONDEMAND, + NULL); + if (IS_ERR(devfreq)) { + ret = PTR_ERR(devfreq); + dev_err(hba->dev, "Unable to register with devfreq %d\n", ret); + return ret; + } + + hba->devfreq = devfreq; + + return 0; +} + static void __ufshcd_suspend_clkscaling(struct ufs_hba *hba) { unsigned long flags; @@ -6439,16 +6459,9 @@ static int ufshcd_probe_hba(struct ufs_hba *hba) sizeof(struct ufs_pa_layer_attr)); hba->clk_scaling.saved_pwr_info.is_valid = true; if (!hba->devfreq) { - hba->devfreq = devm_devfreq_add_device(hba->dev, - &ufs_devfreq_profile, - "simple_ondemand", - NULL); - if (IS_ERR(hba->devfreq)) { - ret = PTR_ERR(hba->devfreq); - dev_err(hba->dev, "Unable to register with devfreq %d\n", - ret); + ret = ufshcd_devfreq_init(hba); + if (ret) goto out; - } } hba->clk_scaling.is_allowed = true; } -- 2.17.0
Re: [PATCH -tip v3 5/7] x86: kprobes: Ignore break_handler
* Masami Hiramatsu wrote: > Remove break_handler related code since that was used > only for jprobe and jprobe is removed now. > > Signed-off-by: Masami Hiramatsu > --- > arch/x86/kernel/kprobes/common.h | 10 -- > arch/x86/kernel/kprobes/core.c | 13 ++--- > arch/x86/kernel/kprobes/ftrace.c | 16 ++-- > 3 files changed, 4 insertions(+), 35 deletions(-) Even after this change there's a stale reference to ->break_handler(): arch/x86/include/asm/kprobes.h: * a post_handler or break_handler). Please use 'git grep'! Also, there's no reason to send an incomplete series: please remove _all_ ->break_handler() references so that it's 100% gone! The pain of having and not having ->break_handler() is completely unnecessary. Thanks, Ingo
[PATCH v3 2/3] scsi: ufs: Use freq table with devfreq
devfreq requires that the client operates on actual frequencies, not only 0 and UMAX_INT and as such UFS brok with the introduction of f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency"). This patch registers the frequencies of the first clock with devfreq and use these to determine if we're trying to step up or down. Reviewed-by: Chanwoo Choi [for devfreq & OPP part] Reviewed-by: Subhash Jadavani Signed-off-by: Bjorn Andersson --- Changes since v2: - Picked up R-b from Chanwoo and Subhash Chances since v1: - Register min_freq and max_freq as opp levels. - Unregister opp levels on removal, to make e.g. probe defer working drivers/scsi/ufs/ufshcd.c | 47 +-- 1 file changed, 40 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index f04902a066cb..3d46a70ed41d 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -1200,16 +1200,13 @@ static int ufshcd_devfreq_target(struct device *dev, struct ufs_hba *hba = dev_get_drvdata(dev); ktime_t start; bool scale_up, sched_clk_scaling_suspend_work = false; + struct list_head *clk_list = &hba->clk_list_head; + struct ufs_clk_info *clki; unsigned long irq_flags; if (!ufshcd_is_clkscaling_supported(hba)) return -EINVAL; - if ((*freq > 0) && (*freq < UINT_MAX)) { - dev_err(hba->dev, "%s: invalid freq = %lu\n", __func__, *freq); - return -EINVAL; - } - spin_lock_irqsave(hba->host->host_lock, irq_flags); if (ufshcd_eh_in_progress(hba)) { spin_unlock_irqrestore(hba->host->host_lock, irq_flags); @@ -1219,7 +1216,13 @@ static int ufshcd_devfreq_target(struct device *dev, if (!hba->clk_scaling.active_reqs) sched_clk_scaling_suspend_work = true; - scale_up = (*freq == UINT_MAX) ? true : false; + if (list_empty(clk_list)) { + spin_unlock_irqrestore(hba->host->host_lock, irq_flags); + goto out; + } + + clki = list_first_entry(&hba->clk_list_head, struct ufs_clk_info, list); + scale_up = (*freq == clki->max_freq) ? true : false; if (!ufshcd_is_devfreq_scaling_required(hba, scale_up)) { spin_unlock_irqrestore(hba->host->host_lock, irq_flags); ret = 0; @@ -1289,16 +1292,29 @@ static struct devfreq_dev_profile ufs_devfreq_profile = { static int ufshcd_devfreq_init(struct ufs_hba *hba) { + struct list_head *clk_list = &hba->clk_list_head; + struct ufs_clk_info *clki; struct devfreq *devfreq; int ret; - devfreq = devm_devfreq_add_device(hba->dev, + /* Skip devfreq if we don't have any clocks in the list */ + if (list_empty(clk_list)) + return 0; + + clki = list_first_entry(clk_list, struct ufs_clk_info, list); + dev_pm_opp_add(hba->dev, clki->min_freq, 0); + dev_pm_opp_add(hba->dev, clki->max_freq, 0); + + devfreq = devfreq_add_device(hba->dev, &ufs_devfreq_profile, DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL); if (IS_ERR(devfreq)) { ret = PTR_ERR(devfreq); dev_err(hba->dev, "Unable to register with devfreq %d\n", ret); + + dev_pm_opp_remove(hba->dev, clki->min_freq); + dev_pm_opp_remove(hba->dev, clki->max_freq); return ret; } @@ -1307,6 +1323,22 @@ static int ufshcd_devfreq_init(struct ufs_hba *hba) return 0; } +static void ufshcd_devfreq_remove(struct ufs_hba *hba) +{ + struct list_head *clk_list = &hba->clk_list_head; + struct ufs_clk_info *clki; + + if (!hba->devfreq) + return; + + devfreq_remove_device(hba->devfreq); + hba->devfreq = NULL; + + clki = list_first_entry(clk_list, struct ufs_clk_info, list); + dev_pm_opp_remove(hba->dev, clki->min_freq); + dev_pm_opp_remove(hba->dev, clki->max_freq); +} + static void __ufshcd_suspend_clkscaling(struct ufs_hba *hba) { unsigned long flags; @@ -7005,6 +7037,7 @@ static void ufshcd_hba_exit(struct ufs_hba *hba) if (hba->devfreq) ufshcd_suspend_clkscaling(hba); destroy_workqueue(hba->clk_scaling.workq); + ufshcd_devfreq_remove(hba); } ufshcd_setup_clocks(hba, false); ufshcd_setup_hba_vreg(hba, false); -- 2.17.0
[PATCH v3 3/3] arm64: dts: qcom: msm8996: Add ufs related nodes
Add the UFS QMP phy node and the UFS host controller node, now that we have working UFS and the necessary clocks in place. Signed-off-by: Bjorn Andersson --- arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi | 8 ++ arch/arm64/boot/dts/qcom/msm8996.dtsi| 85 2 files changed, 93 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi b/arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi index 8be666ea92bd..00e3ecd1180a 100644 --- a/arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi +++ b/arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi @@ -122,6 +122,14 @@ status = "okay"; }; + phy@627000 { + status = "okay"; + }; + + ufshc@624000 { + status = "okay"; + }; + phy@34000 { status = "okay"; }; diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi index 37b7152cb064..221bb3d383c5 100644 --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi @@ -633,6 +633,91 @@ #interrupt-cells = <4>; }; + ufsphy: phy@627000 { + compatible = "qcom,msm8996-ufs-phy-qmp-14nm"; + reg = <0x627000 0xda8>; + reg-names = "phy_mem"; + #phy-cells = <0>; + + vdda-phy-supply = <&pm8994_l28>; + vdda-pll-supply = <&pm8994_l12>; + + vdda-phy-max-microamp = <18380>; + vdda-pll-max-microamp = <9440>; + + vddp-ref-clk-supply = <&pm8994_l25>; + vddp-ref-clk-max-microamp = <100>; + vddp-ref-clk-always-on; + + clock-names = "ref_clk_src", "ref_clk"; + clocks = <&rpmcc RPM_SMD_LN_BB_CLK>, +<&gcc GCC_UFS_CLKREF_CLK>; + status = "disabled"; + + power-domains = <&gcc UFS_GDSC>; + }; + + ufshc@624000 { + compatible = "qcom,ufshc"; + reg = <0x624000 0x2500>; + interrupts = ; + + phys = <&ufsphy>; + phy-names = "ufsphy"; + + vcc-supply = <&pm8994_l20>; + vccq-supply = <&pm8994_l25>; + vccq2-supply = <&pm8994_s4>; + + vcc-max-microamp = <60>; + vccq-max-microamp = <45>; + vccq2-max-microamp = <45>; + + clock-names = + "core_clk_src", + "core_clk", + "bus_clk", + "bus_aggr_clk", + "iface_clk", + "core_clk_unipro_src", + "core_clk_unipro", + "core_clk_ice", + "ref_clk", + "tx_lane0_sync_clk", + "rx_lane0_sync_clk"; + clocks = + <&gcc UFS_AXI_CLK_SRC>, + <&gcc GCC_UFS_AXI_CLK>, + <&gcc GCC_SYS_NOC_UFS_AXI_CLK>, + <&gcc GCC_AGGRE2_UFS_AXI_CLK>, + <&gcc GCC_UFS_AHB_CLK>, + <&gcc UFS_ICE_CORE_CLK_SRC>, + <&gcc GCC_UFS_UNIPRO_CORE_CLK>, + <&gcc GCC_UFS_ICE_CORE_CLK>, + <&rpmcc RPM_SMD_LN_BB_CLK>, + <&gcc GCC_UFS_TX_SYMBOL_0_CLK>, + <&gcc GCC_UFS_RX_SYMBOL_0_CLK>; + freq-table-hz = + <1 2>, + <0 0>, + <0 0>, + <0 0>, + <0 0>, + <15000 3>, + <0 0>, + <0 0>, + <0 0>, + <0 0>, + <0 0>; + + lanes-per-direction = <1>; + status = "disabled"; + + ufs_variant { + compatible = "qcom,ufs_variant"; + }; + }; + mmcc: clock-controller@8c { compatible = "qcom,mmcc-msm8996"; #clock-cells = <1>; -- 2.17.0
[PATCH v3 0/3] Fix UFS and devfreq interaction
With the introduction of f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency") the UFS host controller driver (UFSHCD) stopped probing for platforms that supports frequency scaling, e.g. all modern Qualcomm platforms. The cause of this was UFSHCD's reliance of not registering any frequencies and then being called by devfreq to switch between the frequencies 0 and UINT_MAX. The devfreq code implies that the client is able to pass the frequency table, instead of relying on opp tables, but as concluded after v1 this is not compliant with devfreq cooling, which will enable and disable opp entries in order to limit the valid frequencies. So instead the UFSHCD driver is modified to read the freq-table and register the first clock's two rates as the two available opp levels. This follows the first patch which facilitates the implementation of this in a clean fashion, and removes the kernel panic which previously happened when devfreq initialization failed. With this UFS is once again functional on the db820c, and is needed to get UFS working on SDM845 (both tested). Added in v3 is the dts patch for Andy to introduce UFS in msm8996 and db820c, now that it finally works again. Bjorn Andersson (3): scsi: ufs: Extract devfreq registration scsi: ufs: Use freq table with devfreq arm64: dts: qcom: msm8996: Add ufs related nodes arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi | 8 ++ arch/arm64/boot/dts/qcom/msm8996.dtsi| 85 drivers/scsi/ufs/ufshcd.c| 76 + 3 files changed, 154 insertions(+), 15 deletions(-) -- 2.17.0
[PATCH v3] scripts/tags.sh: don't parse `ls` for $ALLSOURCE_ARCHS generation
Parsing `ls` is fragile at best and _will_ fail when $tree contains spaces. Replace this with a glob-generated string and directly assign it to $ALLSOURCE_ARCHS (Kconfig is removed as it isn't an architecture); a subshell is implied by $(), so `cd` doesn't affect the current working directory. Signed-off-by: Joey Pabalinas 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/scripts/tags.sh b/scripts/tags.sh index 78e546ff689c2d5f40..e4aba2983f6272fc44 100755 --- a/scripts/tags.sh +++ b/scripts/tags.sh @@ -29,14 +29,12 @@ fi ignore="$ignore ( -path ${tree}tools ) -prune -o" # Find all available archs find_all_archs() { - ALLSOURCE_ARCHS="" - for arch in `ls ${tree}arch`; do - ALLSOURCE_ARCHS="${ALLSOURCE_ARCHS} "${arch##\/} - done + ALLSOURCE_ARCHS="$(cd "${tree}arch/" && echo *)" + ALLSOURCE_ARCHS="${ALLSOURCE_ARCHS/Kconfig}" } # Detect if ALLSOURCE_ARCHS is set. If not, we assume SRCARCH if [ "${ALLSOURCE_ARCHS}" = "" ]; then ALLSOURCE_ARCHS=${SRCARCH} -- 2.17.0.rc1.35.g90bbd502d54fe92035.dirty signature.asc Description: PGP signature
Re: [PATCH -mm] mm, huge page: Copy to access sub-page last when copy huge page
On Fri 18-05-18 11:03:16, Huang, Ying wrote: [...] > The patch is a generic optimization which should benefit quite some > workloads, not for a specific use case. To demonstrate the performance > benefit of the patch, we tested it with vm-scalability run on > transparent huge page. It is also adds quite some non-intuitive code. So is this worth? Does any _real_ workload benefits from the change? > include/linux/mm.h | 3 ++- > mm/huge_memory.c | 3 ++- > mm/memory.c| 43 +++ > 3 files changed, 43 insertions(+), 6 deletions(-) -- Michal Hocko SUSE Labs
Re: [PATCH -tip v3 0/7] kprobes: x86: Cleanup jprobe implementation on x86
* Masami Hiramatsu wrote: > x86: kprobes: Remove jprobe implementation > x86: kprobes: Ignore break_handler > x86: kprobes: Do not disable preempt on int3 path So this is another annoyance. Do you ever compare the changelogs and titles you write with the ones I actually commit? They are almost never the same, for example the titles to x86 level kprobes code are always trying to use this kind of prefix: b664d57f39d0: kprobes/x86: Remove IRQ disabling from jprobe handlers ee213fc72fd6: kprobes/x86: Set up frame pointer in kprobe trampoline a19b2e3d7839: kprobes/x86: Remove IRQ disabling from ftrace-based/optimized kprobes 5bb4fc2d8641: kprobes/x86: Disable preemption in ftrace-based jprobes 9a09f261a4fa: kprobes/x86: Disable preemption in optprobe cd52edad55fb: kprobes/x86: Move the get_kprobe_ctlblk() into irq-disabled block a8976fc84b64: kprobes/x86: Remove addressof() operators Not "x86: kprobes:" which is the wrong order anyway... Could you please be more careful about all this in the future? Thanks, Ingo
Re: [PATCH v9 03/11] arm64: kexec_file: invoke the kernel without purgatory
James, On Tue, May 15, 2018 at 05:15:52PM +0100, James Morse wrote: > Hi Akashi, > > On 15/05/18 05:45, AKASHI Takahiro wrote: > > On Fri, May 11, 2018 at 06:03:49PM +0100, James Morse wrote: > >> On 07/05/18 06:22, AKASHI Takahiro wrote: > >>> On Tue, May 01, 2018 at 06:46:06PM +0100, James Morse wrote: > On 25/04/18 07:26, AKASHI Takahiro wrote: > > diff --git a/arch/arm64/kernel/machine_kexec.c > > b/arch/arm64/kernel/machine_kexec.c > > index f76ea92dff91..f7dbba00be10 100644 > > --- a/arch/arm64/kernel/machine_kexec.c > > +++ b/arch/arm64/kernel/machine_kexec.c > > @@ -205,10 +205,17 @@ void machine_kexec(struct kimage *kimage) > >> > > cpu_soft_restart(kimage != kexec_crash_image, > > - reboot_code_buffer_phys, kimage->head, kimage->start, > > 0); > > + reboot_code_buffer_phys, kimage->head, kimage->start, > > +#ifdef CONFIG_KEXEC_FILE > > + kimage->purgatory_info.purgatory_buf ? > > + 0 : > > kimage->arch.dtb_mem); > > +#else > > + 0); > > +#endif > >> > >> > purgatory_buf seems to only be set in kexec_purgatory_setup_kbuf(), > called from > kexec_load_purgatory(), which we don't use. How does this get a value? > > Would it be better to always use kimage->arch.dtb_mem, and ensure that > is 0 for > regular kexec (as we can't know where the dtb is)? (image_arg may then > be a > better name). > >>> > >>> The problem is arch.dtb_mem is currently defined only if > >>> CONFIG_KEXEC_FILE. > >> > >> I thought it was ARCH_HAS_KIMAGE_ARCH, which we can define all the time if > >> that's what we want. > >> > >> > >>> So I would like to > >>> - merge this patch with patch#8 > >>> - change the condition > >>> #ifdef CONFIG_KEXEC_FILE > >>> kimage->file_mode ? > >>> kimage->arch.dtb_mem : 0); We don't need "kimage->file_mode ?" since arch.dtb_mem is 0 if !file_mode. > >>> #else > >>> 0); > >>> #endif > >> > >> If we can avoid even this #ifdef by always having kimage->arch, I'd prefer > >> that. > >> If we do that 'dtb_mem' would need some thing that indicates its for > >> kexec_file, > >> as kexec has a DTB too, we just don't know where it is... > > > > OK, but I want to have a minimum of kexec.arch always exist. > > I'm curious, why? Its 32bytes that is allocated a maximum of twice. I believe that I'm a stingy minimalist :) > (my questions on what needs to go in there were because it looked like a third > user was missing...) > > > > How about this? > > > > | struct kimage_arch { > > | phys_addr_t dtb_mem; > > | #ifdef CONFIG_KEXEC_FILE > > #ifdef in structs just breeds more #ifdefs, as the code that accesses those > members has to be behind the same set of conditions. > > Given this, I prefer the #ifdefs around cpu_soft_restart() as it doesn't force > us to add more #ifdefs later. OK > For either option without purgatory_info: > Reviewed-by: James Morse Thanks, -Takahiro AKASHI > > Thanks, > > James
[PATCH v3] f2fs: Fix deadlock in shutdown ioctl
f2fs_ioc_shutdown() ioctl gets stuck in the below path when issued with F2FS_GOING_DOWN_FULLSYNC option. __switch_to+0x90/0xc4 percpu_down_write+0x8c/0xc0 freeze_super+0xec/0x1e4 freeze_bdev+0xc4/0xcc f2fs_ioctl+0xc0c/0x1ce0 f2fs_compat_ioctl+0x98/0x1f0 Signed-off-by: Sahitya Tummala --- fs/f2fs/file.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 6b94f19..5d99fd1 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1851,9 +1851,11 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg) if (get_user(in, (__u32 __user *)arg)) return -EFAULT; - ret = mnt_want_write_file(filp); - if (ret) - return ret; + if (in != F2FS_GOING_DOWN_FULLSYNC) { + ret = mnt_want_write_file(filp); + if (ret) + return ret; + } switch (in) { case F2FS_GOING_DOWN_FULLSYNC: @@ -1894,7 +1896,8 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg) f2fs_update_time(sbi, REQ_TIME); out: - mnt_drop_write_file(filp); + if (in != F2FS_GOING_DOWN_FULLSYNC) + mnt_drop_write_file(filp); return ret; } -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v2] Print the memcg's name when system-wide OOM happened
On Fri 18-05-18 04:07:14, ufo19890607 wrote: > From: yuzhoujian > > The dump_header does not print the memcg's name when the system > oom happened. So users cannot locate the certain container which > contains the task that has been killed by the oom killer. System > oom report will contain the memcg's name after this patch. It would be great to mention what you can the name for. > Changes since v1: > - replace adding mem_cgroup_print_oom_info with printing the memcg's > name only. > > Signed-off-by: yuzhoujian > --- > mm/oom_kill.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 8ba6cb88cf58..b0abb5930232 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -433,6 +433,9 @@ static void dump_header(struct oom_control *oc, struct > task_struct *p) > if (is_memcg_oom(oc)) > mem_cgroup_print_oom_info(oc->memcg, p); > else { > + pr_info("Task in "); > + pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id)); > + pr_cont(" killed as a result of limit of "); > show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask); > if (is_dump_unreclaim_slabs()) > dump_unreclaimable_slab(); I bet this doesn't compile with CONFIG_MEMCG=n. You either need to put these pr_info lines inside ifdef CONFIG_MEMCG or add helper. The later would reduce code duplication. -- Michal Hocko SUSE Labs
Re: [PATCH -tip v3 4/7] kprobes: Ignore break_handler
* Masami Hiramatsu wrote: > Ignore break_handler related code because it was only > used by jprobe and jprobe is removed. I changed this description to: Subject: kprobes: Don't call the ->break_handler() in generic kprobes code Don't call the ->break_handler() from the core kprobes code, because it was only used by jprobes which got removed. ( In a followup patch we'll remove the remaining calls in low level arch handlers as well and remove the callback altogether. ) Please try to be a lot less vague in changelogs when it's possible and relevant! I.e. saying "Ignore break_handler related code" is annoyingly vague, it doesn't explain things well at all. Saying "Don't call the ->break_handler()" is just as compact, yet it also makes it very clear what's done in the patch... Thanks, Ingo
Re: [PATCHv2][SMB3] Add kernel trace support
On Thu, May 17, 2018 at 10:28 PM, Ronnie Sahlberg wrote: > Very nice. > > Acked-by: Ronnie Sahlberg > > Possibly change the output from > pid=6633 tid=0x0 sid=0x0 cmd=0 mid=0 > to > cmd=0 mid=0 pid=6633 tid=0x0 sid=0x0 > > just to make it easier for human-searching. I think the cmd will be useful > much more often than pid/tid/sid > and this would make it easier to look for as all cmd= entries will be aligned > to the same column. My instinct is to preserve the consistency by beginning with the the fields that will be in 90% of the commands: tree id and session id (tid and sid), which would cause pid to move after sid or after cmd, but I would prefer to wait on reordering fields and fixing alignment till we add another set of tracepoints (e.g. in FreeXid, and in SMB2_open and in the caller of negprot/sessionsetup) - we should then have a better idea what formatting would make it slightly more consistent and readable.
Re: [PATCH -tip v3 4/7] kprobes: Ignore break_handler
* Masami Hiramatsu wrote: > Ignore break_handler related code because it was only > used by jprobe and jprobe is removed. > > Signed-off-by: Masami Hiramatsu > --- > Documentation/kprobes.txt |2 +- > kernel/kprobes.c | 39 +-- > 2 files changed, 6 insertions(+), 35 deletions(-) Even after all these changes there's quite a bit of ->break_handler() use: triton:~/tip> git grep -w break_handler arch/arc/kernel/kprobes.c: if (p->break_handler && p->break_handler(p, regs)) { arch/arm/probes/kprobes/core.c: * for calling the break_handler below on re-entry, arch/arm/probes/kprobes/core.c: if (cur->break_handler && cur->break_handler(cur, regs)) { arch/arm64/kernel/probes/kprobes.c: * for calling the break_handler below on re-entry, arch/arm64/kernel/probes/kprobes.c: if (cur_kprobe->break_handler && arch/arm64/kernel/probes/kprobes.c: cur_kprobe->break_handler(cur_kprobe, regs)) { arch/ia64/kernel/kprobes.c: if (p->break_handler && p->break_handler(p, regs)) { arch/mips/kernel/kprobes.c: if (p->break_handler && p->break_handler(p, regs)) arch/powerpc/kernel/kprobes.c: if (p->break_handler && p->break_handler(p, regs)) { arch/s390/kernel/kprobes.c: * for calling the break_handler below on re-entry arch/s390/kernel/kprobes.c: if (p->break_handler && p->break_handler(p, regs)) { arch/s390/kernel/kprobes.c: * break_handler "returns" to the original arch/sh/kernel/kprobes.c: if (p->break_handler && p->break_handler(p, regs)) { arch/sh/kernel/kprobes.c: if (p->break_handler && arch/sh/kernel/kprobes.c: p->break_handler(p, args->regs)) arch/sparc/kernel/kprobes.c:if (p->break_handler && p->break_handler(p, regs)) arch/x86/include/asm/kprobes.h: * a post_handler or break_handler). include/linux/kprobes.h:kprobe_break_handler_t break_handler; Nothing appears to be actually _setting_ the break_handler, so all of this is stale code as well which should be removed? Thanks, Ingo
Re: [PATCH 2/2] ACPI: EC: Dispatch the EC GPE directly on s2idle wake
Hi Rafael, I love your patch! Yet something to improve: [auto build test ERROR on pm/linux-next] [also build test ERROR on v4.17-rc5] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Rafael-J-Wysocki/ACPI-PM-Dispatch-EC-GPE-early-on-s2idle-resume-if-LPS0-_DSM-is-used/20180518-070817 base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next config: arm64-defconfig (attached as .config) compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm64 All errors (new ones prefixed by >>): drivers/acpi/ec.o: In function `acpi_ec_dispatch_gpe': >> ec.c:(.text+0x239c): undefined reference to `acpi_dispatch_gpe' ec.c:(.text+0x239c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `acpi_dispatch_gpe' --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH v4 4/4] tpm: migrate tpm2_get_random() to use struct tpm_buf
On 03/26/2018 05:44 PM, Jarkko Sakkinen wrote: In order to make struct tpm_buf the first class object for constructing TPM commands, migrate tpm2_get_random() to use it. In addition, removed remaining references to struct tpm2_cmd. All of them use it to acquire the length of the response, which can be achieved by using tpm_buf_length(). Signed-off-by: Jarkko Sakkinen --- drivers/char/tpm/tpm.h | 19 - drivers/char/tpm/tpm2-cmd.c | 98 ++--- 2 files changed, 49 insertions(+), 68 deletions(-) diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h index 7f2d0f489e9c..aa849a1b2641 100644 --- a/drivers/char/tpm/tpm.h +++ b/drivers/char/tpm/tpm.h @@ -421,23 +421,24 @@ struct tpm_buf { u8 *data; }; -static inline int tpm_buf_init(struct tpm_buf *buf, u16 tag, u32 ordinal) +static inline void tpm_buf_reset(struct tpm_buf *buf, u16 tag, u32 ordinal) { struct tpm_input_header *head; + head = (struct tpm_input_header *)buf->data; + head->tag = cpu_to_be16(tag); + head->length = cpu_to_be32(sizeof(*head)); + head->ordinal = cpu_to_be32(ordinal); +} +static inline int tpm_buf_init(struct tpm_buf *buf, u16 tag, u32 ordinal) +{ buf->data_page = alloc_page(GFP_HIGHUSER); if (!buf->data_page) return -ENOMEM; buf->flags = 0; buf->data = kmap(buf->data_page); - - head = (struct tpm_input_header *) buf->data; - - head->tag = cpu_to_be16(tag); - head->length = cpu_to_be32(sizeof(*head)); - head->ordinal = cpu_to_be32(ordinal); - + tpm_buf_reset(buf, tag, ordinal); return 0; } @@ -566,7 +567,7 @@ static inline u32 tpm2_rc_value(u32 rc) int tpm2_pcr_read(struct tpm_chip *chip, int pcr_idx, u8 *res_buf); int tpm2_pcr_extend(struct tpm_chip *chip, int pcr_idx, u32 count, struct tpm2_digest *digests); -int tpm2_get_random(struct tpm_chip *chip, u8 *out, size_t max); +int tpm2_get_random(struct tpm_chip *chip, u8 *dest, size_t max); void tpm2_flush_context_cmd(struct tpm_chip *chip, u32 handle, unsigned int flags); int tpm2_seal_trusted(struct tpm_chip *chip, diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c index b3b52f9eb65f..d5c222f98515 100644 --- a/drivers/char/tpm/tpm2-cmd.c +++ b/drivers/char/tpm/tpm2-cmd.c @@ -27,25 +27,6 @@ enum tpm2_session_attributes { TPM2_SA_CONTINUE_SESSION= BIT(0), }; -struct tpm2_get_random_in { - __be16 size; -} __packed; - -struct tpm2_get_random_out { - __be16 size; - u8 buffer[TPM_MAX_RNG_DATA]; -} __packed; - -union tpm2_cmd_params { - struct tpm2_get_random_in getrandom_in; - struct tpm2_get_random_out getrandom_out; -}; - -struct tpm2_cmd { - tpm_cmd_header header; - union tpm2_cmd_params params; -} __packed; - struct tpm2_hash { unsigned int crypto_id; unsigned int tpm_id; @@ -300,67 +281,70 @@ int tpm2_pcr_extend(struct tpm_chip *chip, int pcr_idx, u32 count, } -#define TPM2_GETRANDOM_IN_SIZE \ - (sizeof(struct tpm_input_header) + \ -sizeof(struct tpm2_get_random_in)) - -static const struct tpm_input_header tpm2_getrandom_header = { - .tag = cpu_to_be16(TPM2_ST_NO_SESSIONS), - .length = cpu_to_be32(TPM2_GETRANDOM_IN_SIZE), - .ordinal = cpu_to_be32(TPM2_CC_GET_RANDOM) -}; +struct tpm2_get_random_out { + __be16 size; + u8 buffer[TPM_MAX_RNG_DATA]; +} __packed; /** * tpm2_get_random() - get random bytes from the TPM RNG * * @chip: TPM chip to use - * @out: destination buffer for the random bytes + * @dest: destination buffer for the random bytes * @max: the max number of bytes to write to @out * * Return: - *Size of the output buffer, or -EIO on error. + * size of the output buffer when the operation is successful. + * A negative number for system errors (errno). */ -int tpm2_get_random(struct tpm_chip *chip, u8 *out, size_t max) +int tpm2_get_random(struct tpm_chip *chip, u8 *dest, size_t max) { - struct tpm2_cmd cmd; - u32 recd, rlength; - u32 num_bytes; + struct tpm2_get_random_out *out; + struct tpm_buf buf; + u32 recd; + u32 num_bytes = max; int err; int total = 0; int retries = 5; - u8 *dest = out; - - num_bytes = min_t(u32, max, sizeof(cmd.params.getrandom_out.buffer)); + u8 *dest_ptr = dest; - if (!out || !num_bytes || - max > sizeof(cmd.params.getrandom_out.buffer)) + if (!num_bytes || max > TPM_MAX_RNG_DATA) return -EINVAL; - do { - cmd.header.in = tpm2_getrandom_header; - cmd.params.getrandom_in.size = cpu_to_be16(num_bytes); + err = tpm_buf_init(&buf, 0, 0); + if (err) + return err; - err = tpm_transmit_cmd(chip, NULL, &cmd, s
Re: mmotm 2018-05-17-16-26 uploaded (autofs)
On 18/05/18 12:38, Ian Kent wrote: > On 18/05/18 12:23, Randy Dunlap wrote: >> On 05/17/2018 08:50 PM, Ian Kent wrote: >>> On 18/05/18 08:21, Randy Dunlap wrote: On 05/17/2018 04:26 PM, a...@linux-foundation.org wrote: > The mm-of-the-moment snapshot 2018-05-17-16-26 has been uploaded to > >http://www.ozlabs.org/~akpm/mmotm/ > > mmotm-readme.txt says > > README for mm-of-the-moment: > > http://www.ozlabs.org/~akpm/mmotm/ > > This is a snapshot of my -mm patch queue. Uploaded at random hopefully > more than once a week. > > You will need quilt to apply these patches to the latest Linus release > (4.x > or 4.x-rcY). The series file is in broken-out.tar.gz and is duplicated in > http://ozlabs.org/~akpm/mmotm/series > > The file broken-out.tar.gz contains two datestamp files: .DATE and > .DATE--mm-dd-hh-mm-ss. Both contain the string -mm-dd-hh-mm-ss, > followed by the base kernel version against which this patch series is to > be applied. > > This tree is partially included in linux-next. To see which patches are > included in linux-next, consult the `series' file. Only the patches > within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in > linux-next. > > A git tree which contains the memory management portion of this tree is > maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git > by Michal Hocko. It contains the patches which are between the > "#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series > file, http://www.ozlabs.org/~akpm/mmotm/series. > > > A full copy of the full kernel tree with the linux-next and mmotm patches > already applied is available through git within an hour of the mmotm > release. Individual mmotm releases are tagged. The master branch always > points to the latest release, so it's constantly rebasing. on x86_64: with (randconfig): CONFIG_AUTOFS_FS=y CONFIG_AUTOFS4_FS=y >>> >>> Oh right, I need to make these exclusive. >>> >>> I seem to remember trying to do that along the way, can't remember why >>> I didn't do it in the end. >>> >>> Any suggestions about potential problems when doing it? >> >> I think that just using "depends on" for each of them will cause kconfig to >> complain about circular dependencies, so probably using "choice" will be >> needed. Or (since this is just temporary?) just say "don't do that." >> > > No doubt that was what happened, unfortunately I forgot to return to it. > > Right, a conditional with a message should work thanks. It looks like adding: depends on AUTOFS_FS = n && AUTOFS_FS != m to autofs4/Kconfig results in autofs4 appearing under the autofs entry if AUTOFS_FS is not set which should call attention to it. It also results in AUTOFS4_FS=n for any setting of AUTOFS_FS except n. Together with some words about it in the AUTOFS4_FS help it should be enough to raise awareness of the change. Ian
Re: [PATCH v10 25/27] ARM: davinci: add device tree support to timer
On Thursday 17 May 2018 08:39 PM, David Lechner wrote: > On 05/17/2018 09:35 AM, Sekhar Nori wrote: >> Hi David, >> >> On Wednesday 09 May 2018 10:56 PM, David Lechner wrote: >>> This adds device tree support to the davinci timer so that when clocks >>> are moved to device tree, the timer will still work. >>> >>> Signed-off-by: David Lechner >>> --- >> >>> +static int __init of_davinci_timer_init(struct device_node *np) >>> +{ >>> + struct clk *clk; >>> + >>> + clk = of_clk_get(np, 0); >>> + if (IS_ERR(clk)) { >>> + struct of_phandle_args clkspec; >>> + >>> + /* >>> + * Fall back to using ref_clk if the actual clock is not >>> + * available. There will be problems later if the real clock >>> + * source is disabled. >>> + */ >>> + >>> + pr_warn("%s: falling back to ref_clk\n", __func__); >>> + >>> + clkspec.np = of_find_node_by_name(NULL, "ref_clk"); >>> + if (IS_ERR(clkspec.np)) { >>> + pr_err("%s: No clock available for timer!\n", __func__); >>> + return PTR_ERR(clkspec.np); >>> + } >>> + clk = of_clk_get_from_provider(&clkspec); >>> + of_node_put(clkspec.np); >>> + } >> >> Do we need this error path now? >> >> Thanks, >> Sekhar >> > > No, not really. Then lets just print an error and return the error number. Thanks, Sekhar
RE: [RFC PATCH 00/09] Implement direct user I/O interfaces for RDMA
> Subject: Re: [RFC PATCH 00/09] Implement direct user I/O interfaces for > RDMA > > On 5/17/2018 8:22 PM, Long Li wrote: > > From: Long Li > > > > This patchset implements direct user I/O through RDMA. > > > > In normal code path (even with cache=none), CIFS copies I/O data from > > user-space to kernel-space for security reasons. > > > > With this patchset, a new mounting option is introduced to have CIFS > > pin the user-space buffer into memory and performs I/O through RDMA. > > This avoids memory copy, at the cost of added security risk. > > What's the security risk? This type of direct i/o behavior is not uncommon, > and can certainly be made safe, using the appropriate memory registration > and protection domains. Any risk needs to be stated explicitly, and mitigation > provided, or at least described. I think it's an assumption that user-mode buffer can't be trusted, so CIFS always copies them into internal buffers, and calculate signature and encryption based on protocol used. With the direct buffer, the user can potentially modify the buffer when signature or encryption is in progress or after they are done. I also want to point out that, I choose to implement .read_iter and .write_iter from file_operations to implement direct I/O (CIFS is already doing this for O_DIRECT, so following this code path will avoid a big mess up). The ideal choice is to implement .direct_IO from address_space_operations that I think eventually we want to move to. > > Tom. > > > > > This patchset is RFC. The work is in progress, do not merge. > > > > > > Long Li (9): > >Introduce offset for the 1st page in data transfer structures > >Change wdata alloc to support direct pages > >Change rdata alloc to support direct pages > >Change function to support offset when reading pages > >Change RDMA send to regonize page offset in the 1st page > >Change RDMA recv to support offset in the 1st page > >Support page offset in memory regsitrations > >Implement no-copy file I/O interfaces > >Introduce cache=rdma moutning option > > > > > > fs/cifs/cifs_fs_sb.h | 2 + > > fs/cifs/cifsfs.c | 19 +++ > > fs/cifs/cifsfs.h | 3 + > > fs/cifs/cifsglob.h| 6 + > > fs/cifs/cifsproto.h | 4 +- > > fs/cifs/cifssmb.c | 10 +- > > fs/cifs/connect.c | 13 +- > > fs/cifs/dir.c | 5 + > > fs/cifs/file.c| 351 > ++ > > fs/cifs/inode.c | 4 +- > > fs/cifs/smb2ops.c | 2 +- > > fs/cifs/smb2pdu.c | 22 ++- > > fs/cifs/smbdirect.c | 132 ++--- > > fs/cifs/smbdirect.h | 2 +- > > fs/read_write.c | 7 + > > include/linux/ratelimit.h | 2 +- > > 16 files changed, 489 insertions(+), 95 deletions(-) > >
Re: [PATCH ghak81 V3a] fixup! audit: collect audit task parameters
* Richard Guy Briggs wrote: > Enable fork.c compilation with audit disabled. > > Signed-off-by: Richard Guy Briggs > --- > Hi Paul, this one got caught by the 0-day kbuildbot. Can you squash it > down if you haven't merged it yet? > --- > kernel/fork.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/kernel/fork.c b/kernel/fork.c > index 92ab849..ff82928 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -1713,7 +1713,9 @@ static __latent_entropy struct task_struct > *copy_process( > p->start_time = ktime_get_ns(); > p->real_start_time = ktime_get_boot_ns(); > p->io_context = NULL; > +#ifdef CONFIG_AUDITSYSCALL > p->audit = NULL; > +#endif /* CONFIG_AUDITSYSCALL */ Please, simply use: #endif the comment is used for (much) larger blocks, to make it clear which block ends there if the top is not visible. Thanks, Ingo
Re: [PATCH] scripts/tags.sh: don't rely on parsing `ls` for $ALLSOURCE_ARCHS generation
On Fri, May 18, 2018 at 02:46:32PM +0900, Masahiro Yamada wrote: > Andrew picked it up, but this patch is *bad* > > You missed arch/Kconfig. > > $(cd "${tree}arch/" && echo *) > contains Kconfig, but it is not arch. That was also something that I found a bit weird myself, but I had assumed there was a good reason for keeping that. The original command also returns a string containing Kconfig: > tree="$PWD/" > echo "$tree" > ALLSOURCE_ARCHS="" > for arch in `ls ${tree}arch`; do > ALLSOURCE_ARCHS="${ALLSOURCE_ARCHS} "${arch##\/} > done > echo "$ALLSOURCE_ARCHS"' gives the same output as my command (albeit with an extra leading space that shouldn't be important): > /store/code/projects/kernel/linux/ > Kconfig alpha arc arm arm64 c6x h8300 hexagon ia64 m68k microblaze mips > nds32 nios2 openrisc parisc powerpc riscv s390 sh sparc um unicore32 x86 > xtensa However, if there really is no reason for that being there, I have no complaints against fixing it. I'll send a v3 in a bit. -- Cheers, Joey Pabalinas signature.asc Description: PGP signature
Re: Some questions about the spi mem framework
Hi Boris, On Thu, 2018-05-17 at 09:42 +0200, Boris Brezillon wrote: > On Thu, 17 May 2018 15:35:04 +0800 > Xiangsheng Hou wrote: > > > On Thu, 2018-05-17 at 09:13 +0200, Boris Brezillon wrote: > > > On Thu, 17 May 2018 14:58:24 +0800 > > > Xiangsheng Hou wrote: > > > > > > > Hi Boris, > > > > > > > > On Wed, 2018-05-16 at 14:42 +0200, Boris Brezillon wrote: > > > > > On Wed, 16 May 2018 20:11:39 +0800 > > > > > Xiangsheng Hou wrote: > > > > > > > > > > > Hi Boris, > > > > > > > > > > > > On Tue, 2018-05-15 at 17:25 +0200, Boris Brezillon wrote: > > > > > > > Hi, > > > > > > > > > > > > > > On Tue, 15 May 2018 11:43:20 +0800 > > > > > > > Xiangsheng Hou wrote: > > > > > > > > > > > > > > > Hello Boris, > > > > > > > > > > > > > > > > I have seen you are working on extend the framework to > > > > > > > > generically > > > > > > > > support spi memory devices. > > > > > > > > And, I am working on upstream SPI Nand driver of Mediatek SPI > > > > > > > > NAND > > > > > > > > controller based on your branch[1]. > > > > > > > > > > > > > > Great! > > > > > > > > > > > > > > > I have some questions need your comment. > > > > > > > > > > > > > > > > 1) There is a difference between different SPI NAND Flash when > > > > > > > > using the > > > > > > > > Quad SPI command,for example Macronix,Etron and GigaDevice, > > > > > > > > Quad SPI commands require the Quad Enable bit in Status > > > > > > > > Register(B0H) to > > > > > > > > be set. > > > > > > > > However, current spi-mem framework does not have this operation, > > > > > > > > do you have a plan to support it? > > > > > > > > > > > > > > I added support for the QE bit in the v7 I sent just a few > > > > > > > minutes ago > > > > > > > [1]. > > > > > > > > > > > > Ok,I have studied v7. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2) I see that current spi-mem framework doesn't support ECC, > > > > > > > > But we need ECC, and we use Mediatek controller's HW ECC > > > > > > > > instead of spi nand on-chip ECC, > > > > > > > > maybe other companies also have this behavior, > > > > > > > > So the ECC part must be implemented in controller's driver. > > > > > > > > Will you abstract ECC interface in future? > > > > > > > > > > > > > > Well, I added support for on-die ECC in my v7 since all chips > > > > > > > seem to > > > > > > > provide this feature. I was initially planning on abstracting ECC > > > > > > > engines, but I decided to go for a simpler approach and only > > > > > > > support > > > > > > > on-die ECC. That does not mean we shouldn't work on this "ECC > > > > > > > engine > > > > > > > abstraction", just that I wanted to get something out and didn't > > > > > > > have > > > > > > > time to spend on this topic. > > > > > > > > > > > > > > I'd be happy if someone else could work on that aspect though. > > > > > > > BTW, do > > > > > > > you plan to use this engine [2], or is this yet another ECC > > > > > > > engine? > > > > > > > > > > > > Yes,I plan to use this ecc engine[2]. > > > > > > > > > > Cool. That probably means we'll have to move the driver one level up > > > > > (in drivers/mtd/nand) and work on this ECC engine interface I was > > > > > talking about. > > > > > > > > > > > > > 3) You know, some nand controller need configure their > > > > > > > > registers when > > > > > > > > getting some information(page size, spare size) of nand flash, > > > > > > > > But the spi-mem framework doesn't has an interface for scanning > > > > > > > > NAND > > > > > > > > flash, when controller driver initialization. > > > > > > > > > > > > > > You seem to mix 2 different things: > > > > > > > - spi-mem: this is generic interface provided by the SPI > > > > > > > framework to > > > > > > > send spi_mem_op. There's nothing NOR or NAND specific in there, > > > > > > > and > > > > > > > I'd like it to stay like that as much as possible > > > > > > > - spinand: this the spi-mem driver that is dealing with SPI NAND > > > > > > > devices, and this is where all the code related to SPI NAND > > > > > > > support > > > > > > > should end up. > > > > > > > > > > > > > > Can you tell me exactly why your SPI controller needs such a > > > > > > > detailed > > > > > > > description? Is it able to program/read pages or erase blocks on > > > > > > > its > > > > > > > own? Do you have a spec of this controller publicly available? > > > > > > > > > > > > > > > > > > > For Mediatek SPI Nand controller,I have to configure registers for > > > > > > ECC > > > > > > engine,page format and spare format according to nand information > > > > > > just > > > > > > like[3] in mtk_nfc_hw_runtime_config() function. > > > > > > > > > > So it's all related to the NAND controller, nothing specific to the > > > > > SPI > > > > > controller, right? > > > > > > > > Yes,we use NAND controller rather than SPI controller. > > >
Re: [PATCH] scripts/tags.sh: don't rely on parsing `ls` for $ALLSOURCE_ARCHS generation
2018-05-16 9:13 GMT+09:00 Joey Pabalinas : > Parsing `ls` is fragile at best and _will_ fail when $tree > contains spaces. Replace this with a glob-generated string > and directly assign it to $ALLSOURCE_ARCHS; use a subshell > so `cd` doesn't affect the current working directory. > > Signed-off-by: Joey Pabalinas > > 1 file changed, 1 insertion(+), 4 deletions(-) Andrew picked it up, but this patch is *bad* You missed arch/Kconfig. $(cd "${tree}arch/" && echo *) contains Kconfig, but it is not arch. > diff --git a/scripts/tags.sh b/scripts/tags.sh > index 78e546ff689c2d5f40..b84acf8889fe836c60 100755 > --- a/scripts/tags.sh > +++ b/scripts/tags.sh > @@ -29,14 +29,11 @@ fi > ignore="$ignore ( -path ${tree}tools ) -prune -o" > > # Find all available archs > find_all_archs() > { > - ALLSOURCE_ARCHS="" > - for arch in `ls ${tree}arch`; do > - ALLSOURCE_ARCHS="${ALLSOURCE_ARCHS} "${arch##\/} > - done > + ALLSOURCE_ARCHS="$( (cd "${tree}arch/" && echo *) )" > } > > # Detect if ALLSOURCE_ARCHS is set. If not, we assume SRCARCH > if [ "${ALLSOURCE_ARCHS}" = "" ]; then > ALLSOURCE_ARCHS=${SRCARCH} > -- > 2.17.0.rc1.35.g90bbd502d54fe92035.dirty > -- Best Regards Masahiro Yamada
Re: [RFC PATCH linux-next] USB: dwc3: dwc3_get_extcon() can be static
On 17.05.2018 18:06, kbuild test robot wrote: > Fixes: 5f0b74e54890 ("USB: dwc3: get extcon device by OF graph bindings") > Signed-off-by: kbuild test robot It should be static of course, my bad. Reviewed-by: Andrzej Hajda -- Regards Andrzej > --- > drd.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c > index 2706824..218371f 100644 > --- a/drivers/usb/dwc3/drd.c > +++ b/drivers/usb/dwc3/drd.c > @@ -440,7 +440,7 @@ static int dwc3_drd_notifier(struct notifier_block *nb, > return NOTIFY_DONE; > } > > -struct extcon_dev *dwc3_get_extcon(struct dwc3 *dwc) > +static struct extcon_dev *dwc3_get_extcon(struct dwc3 *dwc) > { > struct device *dev = dwc->dev; > struct device_node *np_phy, *np_conn; > > >
Re: [PATCH v3 1/2] media: rc: introduce BPF_PROG_RAWIR_EVENT
On Thu, May 17, 2018 at 2:45 PM, Sean Young wrote: > Hi, > > Again thanks for a thoughtful review. This will definitely will improve > the code. > > On Thu, May 17, 2018 at 10:02:52AM -0700, Y Song wrote: >> On Wed, May 16, 2018 at 2:04 PM, Sean Young wrote: >> > Add support for BPF_PROG_RAWIR_EVENT. This type of BPF program can call >> > rc_keydown() to reported decoded IR scancodes, or rc_repeat() to report >> > that the last key should be repeated. >> > >> > The bpf program can be attached to using the bpf(BPF_PROG_ATTACH) syscall; >> > the target_fd must be the /dev/lircN device. >> > >> > Signed-off-by: Sean Young >> > --- >> > drivers/media/rc/Kconfig | 13 ++ >> > drivers/media/rc/Makefile | 1 + >> > drivers/media/rc/bpf-rawir-event.c | 363 + >> > drivers/media/rc/lirc_dev.c| 24 ++ >> > drivers/media/rc/rc-core-priv.h| 24 ++ >> > drivers/media/rc/rc-ir-raw.c | 14 +- >> > include/linux/bpf_rcdev.h | 30 +++ >> > include/linux/bpf_types.h | 3 + >> > include/uapi/linux/bpf.h | 55 - >> > kernel/bpf/syscall.c | 7 + >> > 10 files changed, 531 insertions(+), 3 deletions(-) >> > create mode 100644 drivers/media/rc/bpf-rawir-event.c >> > create mode 100644 include/linux/bpf_rcdev.h >> > >> > diff --git a/drivers/media/rc/Kconfig b/drivers/media/rc/Kconfig >> > index eb2c3b6eca7f..2172d65b0213 100644 >> > --- a/drivers/media/rc/Kconfig >> > +++ b/drivers/media/rc/Kconfig >> > @@ -25,6 +25,19 @@ config LIRC >> >passes raw IR to and from userspace, which is needed for >> >IR transmitting (aka "blasting") and for the lirc daemon. >> > >> > +config BPF_RAWIR_EVENT >> > + bool "Support for eBPF programs attached to lirc devices" >> > + depends on BPF_SYSCALL >> > + depends on RC_CORE=y >> > + depends on LIRC >> > + help >> > + Allow attaching eBPF programs to a lirc device using the bpf(2) >> > + syscall command BPF_PROG_ATTACH. This is supported for raw IR >> > + receivers. >> > + >> > + These eBPF programs can be used to decode IR into scancodes, for >> > + IR protocols not supported by the kernel decoders. >> > + >> > menuconfig RC_DECODERS >> > bool "Remote controller decoders" >> > depends on RC_CORE >> > diff --git a/drivers/media/rc/Makefile b/drivers/media/rc/Makefile >> > index 2e1c87066f6c..74907823bef8 100644 >> > --- a/drivers/media/rc/Makefile >> > +++ b/drivers/media/rc/Makefile >> > @@ -5,6 +5,7 @@ obj-y += keymaps/ >> > obj-$(CONFIG_RC_CORE) += rc-core.o >> > rc-core-y := rc-main.o rc-ir-raw.o >> > rc-core-$(CONFIG_LIRC) += lirc_dev.o >> > +rc-core-$(CONFIG_BPF_RAWIR_EVENT) += bpf-rawir-event.o >> > obj-$(CONFIG_IR_NEC_DECODER) += ir-nec-decoder.o >> > obj-$(CONFIG_IR_RC5_DECODER) += ir-rc5-decoder.o >> > obj-$(CONFIG_IR_RC6_DECODER) += ir-rc6-decoder.o >> > diff --git a/drivers/media/rc/bpf-rawir-event.c >> > b/drivers/media/rc/bpf-rawir-event.c >> > new file mode 100644 >> > index ..7cb48b8d87b5 >> > --- /dev/null >> > +++ b/drivers/media/rc/bpf-rawir-event.c >> > @@ -0,0 +1,363 @@ >> > +// SPDX-License-Identifier: GPL-2.0 >> > +// bpf-rawir-event.c - handles bpf >> > +// >> > +// Copyright (C) 2018 Sean Young >> > + >> > +#include >> > +#include >> > +#include >> > +#include "rc-core-priv.h" >> > + >> > +/* >> > + * BPF interface for raw IR >> > + */ >> > +const struct bpf_prog_ops rawir_event_prog_ops = { >> > +}; >> > + >> > +BPF_CALL_1(bpf_rc_repeat, struct bpf_rawir_event*, event) >> > +{ >> > + struct ir_raw_event_ctrl *ctrl; >> > + >> > + ctrl = container_of(event, struct ir_raw_event_ctrl, >> > bpf_rawir_event); >> > + >> > + rc_repeat(ctrl->dev); >> > + >> > + return 0; >> > +} >> > + >> > +static const struct bpf_func_proto rc_repeat_proto = { >> > + .func = bpf_rc_repeat, >> > + .gpl_only = true, /* rc_repeat is EXPORT_SYMBOL_GPL */ >> > + .ret_type = RET_INTEGER, >> > + .arg1_type = ARG_PTR_TO_CTX, >> > +}; >> > + >> > +BPF_CALL_4(bpf_rc_keydown, struct bpf_rawir_event*, event, u32, protocol, >> > + u32, scancode, u32, toggle) >> > +{ >> > + struct ir_raw_event_ctrl *ctrl; >> > + >> > + ctrl = container_of(event, struct ir_raw_event_ctrl, >> > bpf_rawir_event); >> > + >> > + rc_keydown(ctrl->dev, protocol, scancode, toggle != 0); >> > + >> > + return 0; >> > +} >> > + >> > +static const struct bpf_func_proto rc_keydown_proto = { >> > + .func = bpf_rc_keydown, >> > + .gpl_only = true, /* rc_keydown is EXPORT_SYMBOL_GPL */ >> > + .ret_type = RET_INTEGER, >> > + .arg1_type = ARG_PTR_TO_CTX, >> > + .arg2_type = ARG_ANYTHING, >> > + .arg3_type = ARG_ANYTHING, >> > + .arg4_type = ARG_ANYTHING, >> > +}; >> > + >> > +static const struct bpf_func_proto * >> > +rawir_event_func_proto(enu
Re: [PATCH 1/4] soc: qcom: mdt_loader: Add check to make scm calls
On Thu 17 May 04:32 PDT 2018, Vikash Garodia wrote: > In order to invoke scm calls, ensure that the platform > has the required support to invoke the scm calls in > secure world. > > Signed-off-by: Vikash Garodia > --- > drivers/soc/qcom/mdt_loader.c | 21 + > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c > index 17b314d..db55d53 100644 > --- a/drivers/soc/qcom/mdt_loader.c > +++ b/drivers/soc/qcom/mdt_loader.c > @@ -121,10 +121,12 @@ int qcom_mdt_load(struct device *dev, const struct > firmware *fw, > if (!fw_name) > return -ENOMEM; > > - ret = qcom_scm_pas_init_image(pas_id, fw->data, fw->size); > - if (ret) { > - dev_err(dev, "invalid firmware metadata\n"); > - goto out; > + if (qcom_scm_is_available()) { qcom_scm_is_available() tells you if the qcom_scm driver has been probed, not if your platform implements PAS. Please add a DT property to tell the driver if it should require PAS or not (the absence of such property should indicate PAS is required, for backwards compatibility purposes). For the MDT loader we need to merge the following patch to make this work: https://patchwork.kernel.org/patch/10397889/ Regards, Bjorn
Re: [PATCH v2] ipc: Adding new return type vm_fault_t
On Wed, 16 May 2018, Souptick Joarder wrote: On Thu, May 10, 2018 at 7:34 PM, Souptick Joarder wrote: On Wed, Apr 25, 2018 at 10:04 AM, Souptick Joarder wrote: Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Commit 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder Reviewed-by: Matthew Wilcox Acked-by: Davidlohr Bueso
[PATCH] perf/x86/intel/uncore: allocate pmu index for pci device dynamically
Some boxes/devices of uncore are exported as pcie devices. However, the box number is different on different micro-architecture. For example, the max memory channels for Broadwell is up to 8. However, there are only 2 channels for Broadwell-DE, 4 channels for Broadwell-EP, and 8 channels for Broadwell-EX. The current code allocates pmu index statically so that on Broadwell-EP machine "perf list|grep uncore" shows discontinuous iMC number, which doesn't look nice: Test on Broadwell-EP using "ls /sys/devices | grep -i imc": Without this patch, uncore_imc_0 uncore_imc_1 uncore_imc_4 uncore_imc_5 To maintain pmu index dynamically, move index allocation logic to uncore_pci_probe(). As a result, we can get continuous index of iMC devices under /sys/devices directory: Applied this patch:, uncore_imc_0 uncore_imc_1 uncore_imc_2 uncore_imc_3 Signed-off-by: Shanpei Chen Signed-off-by: Eric Ren --- arch/x86/events/intel/uncore.c | 7 ++- arch/x86/events/intel/uncore.h | 1 + 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c index a7956fc..88d390e 100644 --- a/arch/x86/events/intel/uncore.c +++ b/arch/x86/events/intel/uncore.c @@ -818,7 +818,9 @@ static int __init uncore_type_init(struct intel_uncore_type *type, bool setid) for (i = 0; i < type->num_boxes; i++) { pmus[i].func_id = setid ? i : -1; - pmus[i].pmu_idx = i; + /* The pmu idx will be decided at probe for pci device. */ + if (setid) + pmus[i].pmu_idx = i; pmus[i].type= type; pmus[i].boxes = kzalloc(size, GFP_KERNEL); if (!pmus[i].boxes) @@ -957,6 +959,9 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id if (atomic_inc_return(&pmu->activeboxes) > 1) return 0; + /* Count the real number of pmus for pci uncore device */ + pmu->pmu_idx = type->num_pmus++; + /* First active box registers the pmu */ ret = uncore_pmu_register(pmu); if (ret) { diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h index 414dc7e..c4f54fb 100644 --- a/arch/x86/events/intel/uncore.h +++ b/arch/x86/events/intel/uncore.h @@ -40,6 +40,7 @@ struct intel_uncore_type { const char *name; int num_counters; int num_boxes; + int num_pmus; /* for pci uncore device */ int perf_ctr_bits; int fixed_ctr_bits; unsigned perf_ctr; -- 1.8.3.1
[RFC PATCH net-next] tcp: tcp_rack_reo_wnd() can be static
Fixes: 20b654dfe1be ("tcp: support DUPACK threshold in RACK") Signed-off-by: kbuild test robot --- tcp_recovery.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c index 30cbfb6..71593e4 100644 --- a/net/ipv4/tcp_recovery.c +++ b/net/ipv4/tcp_recovery.c @@ -21,7 +21,7 @@ static bool tcp_rack_sent_after(u64 t1, u64 t2, u32 seq1, u32 seq2) return t1 > t2 || (t1 == t2 && after(seq1, seq2)); } -u32 tcp_rack_reo_wnd(const struct sock *sk) +static u32 tcp_rack_reo_wnd(const struct sock *sk) { struct tcp_sock *tp = tcp_sk(sk);
[net-next:master 1200/1233] net/ipv4/tcp_recovery.c:24:5: sparse: symbol 'tcp_rack_reo_wnd' was not declared. Should it be static?
tree: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master head: 538e2de104cfb4ef1acb35af42427bff42adbe4d commit: 20b654dfe1beaca60ab51894ff405a049248433d [1200/1233] tcp: support DUPACK threshold in RACK reproduce: # apt-get install sparse git checkout 20b654dfe1beaca60ab51894ff405a049248433d make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ sparse warnings: (new ones prefixed by >>) net/ipv4/tcp_recovery.c:46:16: sparse: expression using sizeof(void) net/ipv4/tcp_recovery.c:46:16: sparse: expression using sizeof(void) >> net/ipv4/tcp_recovery.c:24:5: sparse: symbol 'tcp_rack_reo_wnd' was not >> declared. Should it be static? include/net/tcp.h:738:16: sparse: expression using sizeof(void) net/ipv4/tcp_recovery.c:102:40: sparse: expression using sizeof(void) net/ipv4/tcp_recovery.c:102:40: sparse: expression using sizeof(void) include/net/tcp.h:738:16: sparse: expression using sizeof(void) net/ipv4/tcp_recovery.c:210:42: sparse: expression using sizeof(void) Please review and possibly fold the followup patch. --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [PATCH v2] kbuild: check for pkg-config on make {menu,n,g,x}config
Hi Randy, 2018-04-07 6:37 GMT+09:00 Randy Dunlap : > On 03/14/2018 10:50 PM, Masahiro Yamada wrote: >> 2018-03-13 11:30 GMT+09:00 Randy Dunlap : >>> From: Randy Dunlap >>> >>> Each of 'make {menu,n,g,x}config' uses (needs) pkg-config to make sure >>> that other required files are present, but none of these check that >>> pkg-config itself is present. Add a check for all 4 of these targets. >>> >>> Fixes kernel bugzilla #77511: >>> https://bugzilla.kernel.org/show_bug.cgi?id=77511 >>> >>> Signed-off-by: Randy Dunlap >>> --- >>> v2: use 'command -v' instead of 'which' >>> >>> I'm also OK with just documenting the pkg-config requirement in >>> Documentation/Changes (= Documentation/process/changes.rst). >>> >>> scripts/kconfig/Makefile | 15 ++- >>> scripts/kconfig/check-pkgconfig.sh | 12 >>> 2 files changed, 26 insertions(+), 1 deletion(-) >>> >>> --- lnx-416-rc3.orig/scripts/kconfig/Makefile >>> +++ lnx-416-rc3/scripts/kconfig/Makefile >>> @@ -160,6 +160,9 @@ help: >>> @echo ' xenconfig - Enable additional options for xen dom0 >>> and guest kernel support' >>> @echo ' tinyconfig - Configure the tiniest possible kernel' >>> >>> +# pkg-config check >>> +check-pkgconfig := $(srctree)/$(src)/check-pkgconfig.sh >>> + >>> # lxdialog stuff >>> check-lxdialog := $(srctree)/$(src)/lxdialog/check-lxdialog.sh >>> >>> @@ -205,7 +208,17 @@ $(addprefix $(obj)/, mconf.o $(lxdialog) >>> $(obj)/dochecklxdialog: >>> $(Q)$(CONFIG_SHELL) $(check-lxdialog) -check $(HOSTCC) >>> $(HOST_EXTRACFLAGS) $(HOSTLOADLIBES_mconf) >>> >>> -always := dochecklxdialog >>> +# Check that we have pkg-config (used by each of menu/n/g/xconfig) >>> +PHONY += $(obj)/docheckpkgconfig >>> +$(addprefix $(obj)/, mconf.o): $(obj)/docheckpkgconfig >>> +$(addprefix $(obj)/, nconf.o): $(obj)/docheckpkgconfig >>> +$(addprefix $(obj)/, gconf.o): $(obj)/docheckpkgconfig >>> +$(addprefix $(obj)/, qconf.o): $(obj)/docheckpkgconfig >>> + >>> +$(obj)/docheckpkgconfig: >>> + $(Q)$(CONFIG_SHELL) $(check-pkgconfig) >>> + >>> +always := docheckpkgconfig dochecklxdialog >> >> >> I did not test this patch, but does this check work as expected? >> >> Probably we want to run 'docheckpkgconfig' >> before 'dochecklxdiag', '.tmp_gtkcheck', '.tmp_qtcheck', etc. >> But, I do not see such dependencies. >> >> >> Also, if we make 'pkg-config' mandatory, >> should we entirely drop fall-back logics like follows? >> >> https://github.com/torvalds/linux/blob/v4.16-rc5/scripts/kconfig/lxdialog/check-lxdialog.sh#L10 >> https://github.com/torvalds/linux/blob/v4.16-rc5/scripts/kconfig/lxdialog/check-lxdialog.sh#L29 >> https://github.com/torvalds/linux/blob/v4.16-rc5/scripts/kconfig/Makefile#L230 >> >> >> What do you think? >> > > Hi, > > I'm willing to keep patching/testing on this, but both pkg-config and depmod > (for depmod, see: https://bugzilla.kernel.org/show_bug.cgi?id=198965) > are basic requirements IMO, so just documenting their requirements is good > enough to me, but might not be good enough for some users. > > Comments? Sorry for late comments. OK, I am fine with making pkg-config a requirement. (and it should be documented) But, I'd like to clean-up scripts/kconfig/Makefile first. It is already cluttered, and hesitate to add new code based on that. I posted the patches. Could you send v3 after the cleaning work is done? -- Best Regards Masahiro Yamada
Re: [PATCH v9 04/11] arm64: kexec_file: allocate memory walking through memblock list
Baoquan, On Fri, May 18, 2018 at 09:37:35AM +0800, Baoquan He wrote: > On 05/17/18 at 07:04pm, James Morse wrote: > > Hi Baoquan, > > > > On 17/05/18 03:15, Baoquan He wrote: > > > On 05/17/18 at 10:10am, Baoquan He wrote: > > >> On 05/07/18 at 02:59pm, AKASHI Takahiro wrote: > > >>> On Tue, May 01, 2018 at 06:46:09PM +0100, James Morse wrote: > > On 25/04/18 07:26, AKASHI Takahiro wrote: > > > We need to prevent firmware-reserved memory regions, particularly EFI > > > memory map as well as ACPI tables, from being corrupted by loading > > > kernel/initrd (or other kexec buffers). We also want to support memory > > > allocation in top-down manner in addition to default bottom-up. > > > So let's have arm64 specific arch_kexec_walk_mem() which will search > > > for available memory ranges in usable memblock list, > > > i.e. !NOMAP & !reserved, > > > > > instead of system resource tree. > > > > Didn't we try to fix the system-resource-tree in order to fix > > regular-kexec to > > be safe in the EFI-memory-map/ACPI-tables case? > > > > It would be good to avoid having two ways of doing this, and I would > > like to > > avoid having extra arch code... > > >>> > > >>> I know what you mean. > > >>> /proc/iomem or system resource is, in my opinion, not the best place to > > >>> describe memory usage of kernel but rather to describe *physical* > > >>> hardware > > >>> layout. As we are still discussing about "reserved" memory, I don't want > > >>> to depend on it. > > >>> Along with memblock list, we will have more accurate control over memory > > >>> usage. > > >> > > >> In kexec-tools, we see any usable memory as candidate which can be used > > > > > > Here I said 'any', it's not accurate. Those memory which need be passed > > > to 2nd kernel for use need be excluded, just as we have done in > > > kexec-tools. > > > > > >> to load kexec kernel image/initrd etc. However kexec loading is a > > >> preparation work, it just books those position for later kexec kernel > > >> jumping after "kexec -e", that is why we need kexec_buf to remember > > >> them and do the real content copy of kernel/initrd. > > > > The problem we have on arm64 is /proc/iomem is being used for two things. > > 1) Kexec's this is memory I can book for the new kernel. > > 2) Kdump's this is memory I must describe for vmcore. > > > > We get the memory map from UEFI via the EFI stub, and leave it in > > memblock_reserved() memory. A new kexec kernel needs this to boot: it > > mustn't > > overwrite it. The same goes for the ACPI tables, they could be reclaimed and > > used as memory, but the new kexec kernel needs them to boot, they are > > memblock_reserved() too. > > Thanks for these details. Seems arm64 is different. In x86 64 memblock Thanks to James from me, too. > is used as bootmem allocator and will be released when buddy takes over. > Mainly, using memblock may bring concern that kexec kernel > will jump to a unfixed position. This creates an unexpected effect as > KASLR is doing, namely kernel could be put at a random position. As we I don't think that this would be a problem on arm64. > know, kexec was invented for fast kernel dev testing by bypassing > firmware reset, and has been taken to reboot those huge server with > thousands of devices and large memory for business currently. This extra > unpected KASLR effect may cause annoyance even though people have > disabled KASLR explicitly for a specific testing purpose. > > Besides, discarding the /proc/iomem scanning but taking memblock instead > in kernel space works for kexec loading for the time being, the flaw of > /proc/iomem still exists and cause problem for user space kexec-tools, > as pointed out. Do we have a plan for that? This was the difference between my and James' standpoint (at leas initially). James didn't want to require userspace changes to fix the issue, but the reality is that, without modifying it, we can't support kexec and kdump perfectly as James explained in his email. > > > > If we knock all memblock_reserved() regions out of /proc/iomem then kdump > > doesn't work, because /proc/iomem is only generated once. Its a snapshot. > > The > > initcode/data is an example of memory we release from memblock_reserve() > > after > > this, then gets used for data we need in the vmcore. > > Hmm, I'm a little confused here. We have defined different iores type > for different memory region. If acpi need be reused by kdump/kexec, we > can change to not reclaim it, and add them into /proc/iomem in order to > notify components which rely on them to process. > > > enum { > IORES_DESC_NONE = 0, > IORES_DESC_CRASH_KERNEL = 1, > IORES_DESC_ACPI_TABLES = 2, > IORES_DESC_ACPI_NV_STORAGE = 3, > IORES_DESC_PERSISTENT_MEMORY= 4, > IORES_DESC_PERSISTENT_M
[PATCH v2 3/3] perf annotate: Support '--group' option
With the '--group' option, even for non-explicit group, perf annotate will enable the group output. For example, perf record -e cycles,branches ./div perf annotate main --stdio --group :Disassembly of section .text: : :004004b0 : :main(): : :return i; :} : :int main(void) :{ 0.000.00 : 4004b0: push %rbx :int i; :int flag; :volatile double x = 1212121212, y = 121212; : :s_randseed = time(0); 0.000.00 : 4004b1: xor%edi,%edi :srand(s_randseed); 0.000.00 : 4004b3: mov$0x77359400,%ebx : :return i; :} : But if without --group, there is only one event reported. perf annotate main --stdio :Disassembly of section .text: : :004004b0 : :main(): : :return i; :} : :int main(void) :{ 0.00 : 4004b0: push %rbx :int i; :int flag; :volatile double x = 1212121212, y = 121212; : :s_randseed = time(0); 0.00 : 4004b1: xor%edi,%edi :srand(s_randseed); 0.00 : 4004b3: mov$0x77359400,%ebx : :return i; :} Signed-off-by: Jin Yao --- tools/perf/builtin-annotate.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c index 6e5d9f7..5272d48 100644 --- a/tools/perf/builtin-annotate.c +++ b/tools/perf/builtin-annotate.c @@ -45,6 +45,7 @@ struct perf_annotate { bool print_line; bool skip_missing; bool has_br_stack; + bool group_set; const char *sym_hist_filter; const char *cpu_list; DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS); @@ -508,6 +509,9 @@ int cmd_annotate(int argc, const char **argv) "Don't shorten the displayed pathnames"), OPT_BOOLEAN(0, "skip-missing", &annotate.skip_missing, "Skip symbols that cannot be annotated"), + OPT_BOOLEAN_SET(0, "group", &symbol_conf.event_group, + &annotate.group_set, + "Show event group information together"), OPT_STRING('C', "cpu", &annotate.cpu_list, "cpu", "list of cpus to profile"), OPT_CALLBACK(0, "symfs", NULL, "directory", "Look for files with symbols relative to this directory", @@ -570,6 +574,9 @@ int cmd_annotate(int argc, const char **argv) annotate.has_br_stack = perf_header__has_feat(&annotate.session->header, HEADER_BRANCH_STACK); + if (annotate.group_set) + perf_evlist_forced_leader(annotate.session->evlist); + ret = symbol__annotation_init(); if (ret < 0) goto out_delete; -- 2.7.4
[PATCH v2 2/3] perf report: Use perf_evlist_forced_leader to support '--group'
Since we have created a new function perf_evlist_forced_leader, so now remove the old code and use perf_evlist_forced_leader instead. Signed-off-by: Jin Yao --- tools/perf/builtin-report.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index 4c931af..63fe776 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -202,12 +202,8 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter, static void setup_forced_leader(struct report *report, struct perf_evlist *evlist) { - if (report->group_set && !evlist->nr_groups) { - struct perf_evsel *leader = perf_evlist__first(evlist); - - perf_evlist__set_leader(evlist); - leader->forced_leader = true; - } + if (report->group_set) + perf_evlist_forced_leader(evlist); } static int process_feature_event(struct perf_tool *tool, -- 2.7.4
[PATCH v2 1/3] perf evlist: Create a new function perf_evlist_forced_leader
For non-explicit group, perf report supports a option '--group' which can enable group output. We also need to support perf annotate with the same '--group'. Create a new function perf_evlist_forced_leader which contains common code to force setting the group leader. Signed-off-by: Jin Yao --- tools/perf/util/evlist.c | 10 ++ tools/perf/util/evlist.h | 3 +++ 2 files changed, 13 insertions(+) diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index a59281d..ed8a9d5 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -1795,3 +1795,13 @@ bool perf_evlist__exclude_kernel(struct perf_evlist *evlist) return true; } + +void perf_evlist_forced_leader(struct perf_evlist *evlist) +{ + if (!evlist->nr_groups) { + struct perf_evsel *leader = perf_evlist__first(evlist); + + perf_evlist__set_leader(evlist); + leader->forced_leader = true; + } +} diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h index 6c41b2f..d77d514 100644 --- a/tools/perf/util/evlist.h +++ b/tools/perf/util/evlist.h @@ -309,4 +309,7 @@ struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist, union perf_event *event); bool perf_evlist__exclude_kernel(struct perf_evlist *evlist); + +void perf_evlist_forced_leader(struct perf_evlist *evlist); + #endif /* __PERF_EVLIST_H */ -- 2.7.4
[PATCH v2 0/3] perf annotate: Support '--group' option
For non-explicit group, perf report has already supported a option '--group' which can enable group output. This patch-set will support perf annotate with the same '--group'. For example, perf record -e cycles,branches ./div perf annotate main --stdio --group :Disassembly of section .text: : :004004b0 : :main(): : :return i; :} : :int main(void) :{ 0.000.00 : 4004b0: push %rbx :int i; :int flag; :volatile double x = 1212121212, y = 121212; : :s_randseed = time(0); 0.000.00 : 4004b1: xor%edi,%edi :srand(s_randseed); 0.000.00 : 4004b3: mov$0x77359400,%ebx : :return i; :} : v2: - Arnaldo points out that it should be done the way it is for perf report --group. v2 refers to this way and the patch is totally rewritten. Init post: -- Post the patch 'perf annotate: Support multiple events without group' Jin Yao (3): perf evlist: Create a new function perf_evlist_forced_leader perf report: Use perf_evlist_forced_leader to support '--group' perf annotate: Support '--group' option tools/perf/builtin-annotate.c | 7 +++ tools/perf/builtin-report.c | 8 ++-- tools/perf/util/evlist.c | 10 ++ tools/perf/util/evlist.h | 3 +++ 4 files changed, 22 insertions(+), 6 deletions(-) -- 2.7.4
[git pull] drm fixes for v4.17-rc6
Hi Linus, Pretty quiet week again, one vmwgfx regression fix, one core buffer overflow fix,one vc4 leak fix and three i915 fixes. Dave. The following changes since commit 76ef6b28ea4f81c3d511866a9b31392caa833126: drm: set FMODE_UNSIGNED_OFFSET for drm files (2018-05-15 14:46:04 +1000) are available in the Git repository at: git://people.freedesktop.org/~airlied/linux tags/drm-fixes-for-v4.17-rc6 for you to fetch changes up to 1827cad96d624ec127853a71cb931c74024e57d6: Merge tag 'drm-intel-fixes-2018-05-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes (2018-05-18 12:01:49 +1000) i915, vc4, vmwgfx and core fixes Chris Wilson (1): drm/i915/execlists: Use rmb() to order CSB reads Dan Carpenter (1): drm/dumb-buffers: Integer overflow in drm_mode_create_ioctl() Dave Airlie (3): Merge tag 'drm-misc-fixes-2018-05-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Merge branch 'vmwgfx-fixes-4.17' of git://people.freedesktop.org/~thomash/linux into drm-fixes Merge tag 'drm-intel-fixes-2018-05-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Deepak Rawat (1): drm/vmwgfx: Set dmabuf_size when vmw_dmabuf_init is successful Eric Anholt (1): drm/vc4: Fix leak of the file_priv that stored the perfmon. Haneen Mohammed (1): drm: Match sysfs name in link removal to link creation Matthew Auld (1): drm/i915/userptr: reject zero user_size Michel Thierry (1): drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk drivers/gpu/drm/drm_drv.c | 2 +- drivers/gpu/drm/drm_dumb_buffers.c | 7 --- drivers/gpu/drm/i915/i915_gem_userptr.c | 3 +++ drivers/gpu/drm/i915/i915_reg.h | 3 +++ drivers/gpu/drm/i915/intel_engine_cs.c | 4 drivers/gpu/drm/i915/intel_lrc.c| 1 + drivers/gpu/drm/vc4/vc4_drv.c | 1 + drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c| 2 ++ 8 files changed, 19 insertions(+), 4 deletions(-)
[PATCH 3/5] kconfig: refactor GTK+ package checks for building gconf
Refactor the necessary package checks for building gconf in the same way as for qconf. Signed-off-by: Masahiro Yamada --- scripts/kconfig/Makefile | 43 +-- scripts/kconfig/gconf-cfg.sh | 23 +++ 2 files changed, 32 insertions(+), 34 deletions(-) create mode 100755 scripts/kconfig/gconf-cfg.sh diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index e9a87bf..c222745 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -188,8 +188,6 @@ HOST_EXTRACFLAGS += $(shell $(CONFIG_SHELL) $(check-lxdialog) -ccflags) \ # Utilizes ncurses # mconf: Used for the menuconfig target # Utilizes the lxdialog package -# gconf: Used for the gconfig target -# Based on GTK+ which needs to be installed to compile it # object files used by all kconfig flavours lxdialog := lxdialog/checklist.o lxdialog/util.o lxdialog/inputbox.o @@ -199,12 +197,10 @@ conf-objs := conf.o zconf.tab.o mconf-objs := mconf.o zconf.tab.o $(lxdialog) nconf-objs := nconf.o zconf.tab.o nconf.gui.o kxgettext-objs := kxgettext.o zconf.tab.o -gconf-objs := gconf.o zconf.tab.o -hostprogs-y := conf nconf mconf kxgettext gconf +hostprogs-y := conf nconf mconf kxgettext targets+= zconf.lex.c -clean-files:= .tmp_gtkcheck clean-files+= gconf.glade.h clean-files += config.pot linux.pot @@ -224,10 +220,6 @@ HOST_EXTRACXXFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh $(HOSTC HOSTCFLAGS_zconf.lex.o := -I$(src) HOSTCFLAGS_zconf.tab.o := -I$(src) -HOSTLOADLIBES_gconf= `pkg-config --libs gtk+-2.0 gmodule-2.0 libglade-2.0` -HOSTCFLAGS_gconf.o = `pkg-config --cflags gtk+-2.0 gmodule-2.0 libglade-2.0` \ - -Wno-missing-prototypes - HOSTLOADLIBES_mconf = $(shell $(CONFIG_SHELL) $(check-lxdialog) -ldflags $(HOSTCC)) HOSTLOADLIBES_nconf= $(shell \ @@ -251,31 +243,14 @@ quiet_cmd_moc = MOC $@ $(obj)/%.moc: $(src)/%.h $(obj)/.qconf-cfg $(call cmd,moc) -$(obj)/gconf.o: $(obj)/.tmp_gtkcheck - -ifeq ($(MAKECMDGOALS),gconfig) --include $(obj)/.tmp_gtkcheck - -# GTK+ needs some extra effort, too... -$(obj)/.tmp_gtkcheck: - @if `pkg-config --exists gtk+-2.0 gmodule-2.0 libglade-2.0`; then \ - if `pkg-config --atleast-version=2.0.0 gtk+-2.0`; then \ - touch $@; \ - else \ - echo >&2 "*"; \ - echo >&2 "* GTK+ is present but version >= 2.0.0 is required."; \ - echo >&2 "*"; \ - false; \ - fi \ - else \ - echo >&2 "*"; \ - echo >&2 "* Unable to find the GTK+ installation. Please make sure that"; \ - echo >&2 "* the GTK+ 2.0 development package is correctly installed...";\ - echo >&2 "* You need gtk+-2.0, glib-2.0 and libglade-2.0."; \ - echo >&2 "*"; \ - false; \ - fi -endif +# gconf: Used for the gconfig target based on GTK+ +hostprogs-y+= gconf +gconf-objs := gconf.o zconf.tab.o + +HOSTLOADLIBES_gconf = $(shell . $(obj)/.gconf-cfg && echo $$libs) +HOSTCFLAGS_gconf.o = $(shell . $(obj)/.gconf-cfg && echo $$cflags) + +$(obj)/gconf.o: $(obj)/.gconf-cfg $(obj)/zconf.tab.o: $(obj)/zconf.lex.c diff --git a/scripts/kconfig/gconf-cfg.sh b/scripts/kconfig/gconf-cfg.sh new file mode 100755 index 000..533b3d8 --- /dev/null +++ b/scripts/kconfig/gconf-cfg.sh @@ -0,0 +1,23 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 + +PKG="gtk+-2.0 gmodule-2.0 libglade-2.0" + +if ! pkg-config --exists $PKG; then + echo >&2 "*" + echo >&2 "* Unable to find the GTK+ installation. Please make sure that" + echo >&2 "* the GTK+ 2.0 development package is correctly installed." + echo >&2 "* You need $PKG" + echo >&2 "*" + exit 1 +fi + +if ! pkg-config --atleast-version=2.0.0 gtk+-2.0; then + echo >&2 "*" + echo >&2 "* GTK+ is present but version >= 2.0.0 is required." + echo >&2 "*" + exit 1 +fi + +echo cflags=\"$(pkg-config --cflags $PKG)\" +echo libs=\"$(pkg-config --libs $PKG)\" -- 2.7.4
[PATCH 5/5] kconfig: refactor ncurses package checks for building nconf
Building nconf requires ncurses, but its presence is not checked. Check and configure necessary packages as in the other GUI frontends. Signed-off-by: Masahiro Yamada --- scripts/kconfig/Makefile | 16 scripts/kconfig/nconf-cfg.sh | 22 ++ 2 files changed, 30 insertions(+), 8 deletions(-) create mode 100644 scripts/kconfig/nconf-cfg.sh diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index 25a3d25..b90e801 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -176,15 +176,12 @@ help: # === # Shared Makefile for the various kconfig executables: # conf: Used for defconfig, oldconfig and related targets -# nconf: Used for the nconfig target. -# Utilizes ncurses # object files used by all kconfig flavours conf-objs := conf.o zconf.tab.o -nconf-objs := nconf.o zconf.tab.o nconf.gui.o kxgettext-objs := kxgettext.o zconf.tab.o -hostprogs-y := conf nconf kxgettext +hostprogs-y := conf kxgettext targets+= zconf.lex.c clean-files+= gconf.glade.h @@ -199,10 +196,13 @@ HOST_EXTRACXXFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh $(HOSTC HOSTCFLAGS_zconf.lex.o := -I$(src) HOSTCFLAGS_zconf.tab.o := -I$(src) -HOSTLOADLIBES_nconf= $(shell \ - pkg-config --libs menuw panelw ncursesw 2>/dev/null \ - || pkg-config --libs menu panel ncurses 2>/dev/null \ - || echo "-lmenu -lpanel -lncurses" ) +# nconf: Used for the nconfig target based on ncurses +hostprogs-y+= nconf +nconf-objs := nconf.o zconf.tab.o nconf.gui.o + +HOSTLOADLIBES_nconf= $(shell . $(obj)/.nconf-cfg && echo $$libs) + +$(obj)/nconf.o: $(obj)/.nconf-cfg # mconf: Used for the menuconfig target based on lxdialog hostprogs-y+= mconf diff --git a/scripts/kconfig/nconf-cfg.sh b/scripts/kconfig/nconf-cfg.sh new file mode 100644 index 000..9def36f --- /dev/null +++ b/scripts/kconfig/nconf-cfg.sh @@ -0,0 +1,22 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 + +PKG="menuw panelw ncursesw" +PKG2="menu panel ncurses" + +if pkg-config --exists $PKG; then + echo libs=\"$(pkg-config --libs $PKG)\" + exit 0 +fi + +if pkg-config --exists $PKG2; then + echo libs=\"$(pkg-config --libs $PKG2)\" + exit 0 +fi + +echo >&2 "*" +echo >&2 "* Unable to find the ncurses." +echo >&2 "* Install ncurses (ncurses-devel or libncurses-dev" +echo >&2 "* depending on your distribution)" +echo >&2 "*" +exit 1 -- 2.7.4
[PATCH 1/5] kbuild: do not display CHK for filechk
filechk displays two short logs; CHK for creating a temporary file, and UPD for really updating the target. IMHO, the build system can be quiet when the target file has not been updated. Signed-off-by: Masahiro Yamada --- scripts/Kbuild.include | 1 - 1 file changed, 1 deletion(-) diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include index 50cee53..c7fedc5 100644 --- a/scripts/Kbuild.include +++ b/scripts/Kbuild.include @@ -57,7 +57,6 @@ kecho := $($(quiet)kecho) # to specify a valid file as first prerequisite (often the kbuild file) define filechk $(Q)set -e; \ - $(kecho) ' CHK $@';\ mkdir -p $(dir $@); \ $(filechk_$(1)) < $< > $@.tmp; \ if [ -r $@ ] && cmp -s $@ $@.tmp; then \ -- 2.7.4
[PATCH 4/5] kconfig: refactor ncurses package checks for building mconf
The mconf (or its infrastructure, lxdiaglog) depends on ncurses. check-lxdialog.sh has additional checks in case pkg-config is not available. However, qconf and gconf already rely on pkg-config to check necessary packages. For simplification, drop the fallback code from check-lxdialog.sh and move/rename to mconf-cfg.sh to make it work in the same way as the other GUI frontends. Signed-off-by: Masahiro Yamada --- scripts/kconfig/Makefile | 44 +- scripts/kconfig/lxdialog/check-lxdialog.sh | 93 -- scripts/kconfig/lxdialog/dialog.h | 2 +- scripts/kconfig/mconf-cfg.sh | 24 4 files changed, 41 insertions(+), 122 deletions(-) delete mode 100755 scripts/kconfig/lxdialog/check-lxdialog.sh create mode 100755 scripts/kconfig/mconf-cfg.sh diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index c222745..25a3d25 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -173,60 +173,48 @@ help: @echo ' xenconfig - Enable additional options for xen dom0 and guest kernel support' @echo ' tinyconfig - Configure the tiniest possible kernel' -# lxdialog stuff -check-lxdialog := $(srctree)/$(src)/lxdialog/check-lxdialog.sh - -# Use recursively expanded variables so we do not call gcc unless -# we really need to do so. (Do not call gcc as part of make mrproper) -HOST_EXTRACFLAGS += $(shell $(CONFIG_SHELL) $(check-lxdialog) -ccflags) \ --DLOCALE - # === # Shared Makefile for the various kconfig executables: # conf: Used for defconfig, oldconfig and related targets # nconf: Used for the nconfig target. # Utilizes ncurses -# mconf: Used for the menuconfig target -# Utilizes the lxdialog package # object files used by all kconfig flavours -lxdialog := lxdialog/checklist.o lxdialog/util.o lxdialog/inputbox.o -lxdialog += lxdialog/textbox.o lxdialog/yesno.o lxdialog/menubox.o - conf-objs := conf.o zconf.tab.o -mconf-objs := mconf.o zconf.tab.o $(lxdialog) nconf-objs := nconf.o zconf.tab.o nconf.gui.o kxgettext-objs := kxgettext.o zconf.tab.o -hostprogs-y := conf nconf mconf kxgettext +hostprogs-y := conf nconf kxgettext targets+= zconf.lex.c clean-files+= gconf.glade.h clean-files += config.pot linux.pot -# Check that we have the required ncurses stuff installed for lxdialog (menuconfig) -PHONY += $(obj)/dochecklxdialog -$(addprefix $(obj)/, mconf.o $(lxdialog)): $(obj)/dochecklxdialog -$(obj)/dochecklxdialog: - $(Q)$(CONFIG_SHELL) $(check-lxdialog) -check $(HOSTCC) $(HOST_EXTRACFLAGS) $(HOSTLOADLIBES_mconf) - -always := dochecklxdialog - # Add environment specific flags -HOST_EXTRACFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh $(HOSTCC) $(HOSTCFLAGS)) -HOST_EXTRACXXFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh $(HOSTCXX) $(HOSTCXXFLAGS)) - +HOST_EXTRACFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh $(HOSTCC) $(HOSTCFLAGS)) \ + -DLOCALE +HOST_EXTRACXXFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh $(HOSTCXX) $(HOSTCXXFLAGS)) \ + -DLOCALE # generated files seem to need this to find local include files HOSTCFLAGS_zconf.lex.o := -I$(src) HOSTCFLAGS_zconf.tab.o := -I$(src) -HOSTLOADLIBES_mconf = $(shell $(CONFIG_SHELL) $(check-lxdialog) -ldflags $(HOSTCC)) - HOSTLOADLIBES_nconf= $(shell \ pkg-config --libs menuw panelw ncursesw 2>/dev/null \ || pkg-config --libs menu panel ncurses 2>/dev/null \ || echo "-lmenu -lpanel -lncurses" ) +# mconf: Used for the menuconfig target based on lxdialog +hostprogs-y+= mconf +lxdialog := checklist.o inputbox.o menubox.o textbox.o util.o yesno.o +mconf-objs := mconf.o zconf.tab.o $(addprefix lxdialog/, $(lxdialog)) + +HOSTLOADLIBES_mconf = $(shell . $(obj)/.mconf-cfg && echo $$libs) +$(foreach f, mconf.o $(lxdialog), \ + $(eval HOSTCFLAGS_$f = $$(shell . $(obj)/.mconf-cfg && echo cflags))) + +$(addprefix $(obj)/, mconf.o $(lxdialog)): $(obj)/.mconf-cfg + # qconf: Used for the xconfig target based on Qt hostprogs-y+= qconf qconf-cxxobjs := qconf.o diff --git a/scripts/kconfig/lxdialog/check-lxdialog.sh b/scripts/kconfig/lxdialog/check-lxdialog.sh deleted file mode 100755 index 6c0bcd9..000 --- a/scripts/kconfig/lxdialog/check-lxdialog.sh +++ /dev/null @@ -1,93 +0,0 @@ -#!/bin/sh -# SPDX-License-Identifier: GPL-2.0 -# Check ncurses compatibility - -# What library to link -ldflags() -{ - pkg-config --libs ncursesw 2>/dev/null && exit - pkg-config --libs ncurses 2>/dev/null && exit - for ext in so a dll.a dylib ; do - for lib in ncursesw ncurses curses ; do - $cc -print-file
[PATCH 0/5] kconfig: refactor package checks for GUI frontends
Kconfig supports 4 GUI frontends. Each of them needs some support packages, but checks them differently: qconf, gconf: check packages in Makefile (pkg-config is required) mconf: lxdialog/check-lxdialog.sh nconf: needs ncurses, but its presence is not checked This series refactor the package checks so that all of them work in the same way. The package check scripts have been moved to scripts/kconfig/*conf-cfg.sh The motivation of this clean-up is Randy's following patch: https://patchwork.kernel.org/patch/10277723/ I want to clean up existing code before adding more checks. Masahiro Yamada (5): kbuild: do not display CHK for filechk kconfig: refactor Qt package checks for building qconf kconfig: refactor GTK+ package checks for building gconf kconfig: refactor ncurses package checks for building mconf kconfig: refactor ncurses package checks for building nconf scripts/Kbuild.include | 1 - scripts/kconfig/Makefile | 160 ++--- scripts/kconfig/gconf-cfg.sh | 23 + scripts/kconfig/lxdialog/check-lxdialog.sh | 93 - scripts/kconfig/lxdialog/dialog.h | 2 +- scripts/kconfig/mconf-cfg.sh | 24 + scripts/kconfig/nconf-cfg.sh | 22 scripts/kconfig/qconf-cfg.sh | 25 + 8 files changed, 148 insertions(+), 202 deletions(-) create mode 100755 scripts/kconfig/gconf-cfg.sh delete mode 100755 scripts/kconfig/lxdialog/check-lxdialog.sh create mode 100755 scripts/kconfig/mconf-cfg.sh create mode 100644 scripts/kconfig/nconf-cfg.sh create mode 100755 scripts/kconfig/qconf-cfg.sh -- 2.7.4
[PATCH 2/5] kconfig: refactor Qt package checks for building qconf
Currently, the necessary package checks for building qconf is surrounded by ifeq ($(MAKECMDGOALS),xconfig) ... endif. Then, Make will restart when .tmp_qtcheck is generated. To simplify the Makefile, move the scripting to a separate file, and use filechk. The shell script is executed everytime xconfig is run, but it is not a costly script. Signed-off-by: Masahiro Yamada --- scripts/kconfig/Makefile | 73 +--- scripts/kconfig/qconf-cfg.sh | 25 +++ 2 files changed, 53 insertions(+), 45 deletions(-) create mode 100755 scripts/kconfig/qconf-cfg.sh diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index 5def877..e9a87bf 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -188,8 +188,6 @@ HOST_EXTRACFLAGS += $(shell $(CONFIG_SHELL) $(check-lxdialog) -ccflags) \ # Utilizes ncurses # mconf: Used for the menuconfig target # Utilizes the lxdialog package -# qconf: Used for the xconfig target -# Based on Qt which needs to be installed to compile it # gconf: Used for the gconfig target # Based on GTK+ which needs to be installed to compile it # object files used by all kconfig flavours @@ -201,14 +199,12 @@ conf-objs := conf.o zconf.tab.o mconf-objs := mconf.o zconf.tab.o $(lxdialog) nconf-objs := nconf.o zconf.tab.o nconf.gui.o kxgettext-objs := kxgettext.o zconf.tab.o -qconf-cxxobjs := qconf.o -qconf-objs := zconf.tab.o gconf-objs := gconf.o zconf.tab.o -hostprogs-y := conf nconf mconf kxgettext qconf gconf +hostprogs-y := conf nconf mconf kxgettext gconf targets+= zconf.lex.c -clean-files:= qconf.moc .tmp_qtcheck .tmp_gtkcheck +clean-files:= .tmp_gtkcheck clean-files+= gconf.glade.h clean-files += config.pot linux.pot @@ -228,9 +224,6 @@ HOST_EXTRACXXFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh $(HOSTC HOSTCFLAGS_zconf.lex.o := -I$(src) HOSTCFLAGS_zconf.tab.o := -I$(src) -HOSTLOADLIBES_qconf= $(KC_QT_LIBS) -HOSTCXXFLAGS_qconf.o = $(KC_QT_CFLAGS) - HOSTLOADLIBES_gconf= `pkg-config --libs gtk+-2.0 gmodule-2.0 libglade-2.0` HOSTCFLAGS_gconf.o = `pkg-config --cflags gtk+-2.0 gmodule-2.0 libglade-2.0` \ -Wno-missing-prototypes @@ -241,34 +234,22 @@ HOSTLOADLIBES_nconf = $(shell \ pkg-config --libs menuw panelw ncursesw 2>/dev/null \ || pkg-config --libs menu panel ncurses 2>/dev/null \ || echo "-lmenu -lpanel -lncurses" ) -$(obj)/qconf.o: $(obj)/.tmp_qtcheck - -ifeq ($(MAKECMDGOALS),xconfig) -$(obj)/.tmp_qtcheck: $(src)/Makefile --include $(obj)/.tmp_qtcheck - -# Qt needs some extra effort... -$(obj)/.tmp_qtcheck: - @set -e; $(kecho) " CHECK qt"; \ - if pkg-config --exists Qt5Core; then \ - cflags="-std=c++11 -fPIC `pkg-config --cflags Qt5Core Qt5Gui Qt5Widgets`"; \ - libs=`pkg-config --libs Qt5Core Qt5Gui Qt5Widgets`; \ - moc=`pkg-config --variable=host_bins Qt5Core`/moc; \ - elif pkg-config --exists QtCore; then \ - cflags=`pkg-config --cflags QtCore QtGui`; \ - libs=`pkg-config --libs QtCore QtGui`; \ - moc=`pkg-config --variable=moc_location QtCore`; \ - else \ - echo >&2 "*"; \ - echo >&2 "* Could not find Qt via pkg-config."; \ - echo >&2 "* Please install either Qt 4.8 or 5.x. and make sure it's in PKG_CONFIG_PATH"; \ - echo >&2 "*"; \ - exit 1; \ - fi; \ - echo "KC_QT_CFLAGS=$$cflags" > $@; \ - echo "KC_QT_LIBS=$$libs" >> $@; \ - echo "KC_QT_MOC=$$moc" >> $@ -endif + +# qconf: Used for the xconfig target based on Qt +hostprogs-y+= qconf +qconf-cxxobjs := qconf.o +qconf-objs := zconf.tab.o + +HOSTLOADLIBES_qconf= $(shell . $(obj)/.qconf-cfg && echo $$libs) +HOSTCXXFLAGS_qconf.o = $(shell . $(obj)/.qconf-cfg && echo $$cflags) + +$(obj)/qconf.o: $(obj)/.qconf-cfg $(obj)/qconf.moc + +quiet_cmd_moc = MOC $@ + cmd_moc = $(shell . $(obj)/.qconf-cfg && echo $$moc) -i $< -o $@ + +$(obj)/%.moc: $(src)/%.h $(obj)/.qconf-cfg + $(call cmd,moc) $(obj)/gconf.o: $(obj)/.tmp_gtkcheck @@ -298,15 +279,17 @@ endif $(obj)/zconf.tab.o: $(obj)/zconf.lex.c -$(obj)/qconf.o: $(obj)/qconf.moc - -quiet_cmd_moc = MOC $@ - cmd_moc = $(KC_QT_MOC) -i $< -o $@ - -$(obj)/%.moc: $(src)/%.h $(obj)/.tmp_qtcheck - $(call cmd,moc) - # Extract gconf menu items for i18n support $(obj)/gconf.glade.h: $(obj)/gconf.glade $(Q)intltool-extract --type=gettext/glade --srcdir=$(srctree) \ $(obj)/gconf.glade + +# check if necessary packages are available, and configure build flags +define filechk_conf_cfg + $(CONFIG_SHELL) $< +endef + +$(obj)/.%conf-cfg: $(src)/%conf-cfg.sh FORCE + $(call filechk,conf_cfg) + +clean-files += .*conf-cfg diff --git a/scr
[PATCH v2 7/7] memcg: supports movement of surplus hugepages statistics
When the task that charged surplus hugepages moves memory cgroup, it updates the statistical information correctly. Signed-off-by: TSUKADA Koutaro --- memcontrol.c | 99 +++ 1 file changed, 99 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index a8f1ff8..63f0922 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4698,12 +4698,110 @@ static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd, return 0; } +#ifdef CONFIG_HUGETLB_PAGE +static enum mc_target_type get_mctgt_type_hugetlb(struct vm_area_struct *vma, + unsigned long addr, pte_t *pte, union mc_target *target) +{ + struct page *page = NULL; + pte_t entry; + enum mc_target_type ret = MC_TARGET_NONE; + + if (!(mc.flags & MOVE_ANON)) + return ret; + + entry = huge_ptep_get(pte); + if (!pte_present(entry)) + return ret; + + page = pte_page(entry); + VM_BUG_ON_PAGE(!page || !PageHead(page), page); + if (likely(!PageSurplusCharge(page))) + return ret; + if (page->mem_cgroup == mc.from) { + ret = MC_TARGET_PAGE; + if (target) { + get_page(page); + target->page = page; + } + } + + return ret; +} + +static int hugetlb_count_precharge_pte_range(pte_t *pte, unsigned long hmask, + unsigned long addr, unsigned long end, + struct mm_walk *walk) +{ + struct vm_area_struct *vma = walk->vma; + struct mm_struct *mm = walk->mm; + spinlock_t *ptl; + union mc_target target; + + ptl = huge_pte_lock(hstate_vma(vma), mm, pte); + if (get_mctgt_type_hugetlb(vma, addr, pte, &target) == MC_TARGET_PAGE) { + mc.precharge += (1 << compound_order(target.page)); + put_page(target.page); + } + spin_unlock(ptl); + + return 0; +} + +static int hugetlb_move_charge_pte_range(pte_t *pte, unsigned long hmask, + unsigned long addr, unsigned long end, + struct mm_walk *walk) +{ + struct vm_area_struct *vma = walk->vma; + struct mm_struct *mm = walk->mm; + spinlock_t *ptl; + enum mc_target_type target_type; + union mc_target target; + struct page *page; + unsigned long nr_pages; + + ptl = huge_pte_lock(hstate_vma(vma), mm, pte); + target_type = get_mctgt_type_hugetlb(vma, addr, pte, &target); + if (target_type == MC_TARGET_PAGE) { + page = target.page; + nr_pages = (1 << compound_order(page)); + if (mc.precharge < nr_pages) { + put_page(page); + goto unlock; + } + if (!mem_cgroup_move_account(page, true, mc.from, mc.to)) { + mc.precharge -= nr_pages; + mc.moved_charge += nr_pages; + } + put_page(page); + } +unlock: + spin_unlock(ptl); + + return 0; +} +#else +static int hugetlb_count_precharge_pte_range(pte_t *pte, unsigned long hmask, + unsigned long addr, unsigned long end, + struct mm_walk *walk) +{ + return 0; +} + +static int hugetlb_move_charge_pte_range(pte_t *pte, unsigned long hmask, + unsigned long addr, unsigned long end, + struct mm_walk *walk) +{ + return 0; +} +#endif + static unsigned long mem_cgroup_count_precharge(struct mm_struct *mm) { unsigned long precharge; struct mm_walk mem_cgroup_count_precharge_walk = { .pmd_entry = mem_cgroup_count_precharge_pte_range, + .hugetlb_entry = hugetlb_count_precharge_pte_range, .mm = mm, }; down_read(&mm->mmap_sem); @@ -4981,6 +5079,7 @@ static void mem_cgroup_move_charge(void) { struct mm_walk mem_cgroup_move_charge_walk = { .pmd_entry = mem_cgroup_move_charge_pte_range, + .hugetlb_entry = hugetlb_move_charge_pte_range, .mm = mc.mm, }; -- Tsukada
[PATCH v2 6/7] Documentation, hugetlb: describe about charge_surplus_hugepages,
Add a description about charge_surplus_hugepages. Signed-off-by: TSUKADA Koutaro --- hugetlbpage.txt |6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt index faf077d..af8d112 100644 --- a/Documentation/vm/hugetlbpage.txt +++ b/Documentation/vm/hugetlbpage.txt @@ -129,6 +129,11 @@ number of "surplus" huge pages from the kernel's normal page pool, when the persistent huge page pool is exhausted. As these surplus huge pages become unused, they are freed back to the kernel's normal page pool. +/proc/sys/vm/charge_surplus_hugepages indicates to charge "surplus" huge pages +obteined from the normal page pool to memory cgroup. If true, the amount to be +overcommitted is limited within memory usage allowed by the memory cgroup to +which the task belongs. The default value is false. + When increasing the huge page pool size via nr_hugepages, any existing surplus pages will first be promoted to persistent huge pages. Then, additional huge pages will be allocated, if necessary and if possible, to fulfill @@ -169,6 +174,7 @@ Inside each of these directories, the same set of files will exist: free_hugepages resv_hugepages surplus_hugepages + charge_surplus_hugepages which function as described above for the default huge page-sized case. -- Tsukada
Re: mmotm 2018-05-17-16-26 uploaded (autofs)
On 18/05/18 12:23, Randy Dunlap wrote: > On 05/17/2018 08:50 PM, Ian Kent wrote: >> On 18/05/18 08:21, Randy Dunlap wrote: >>> On 05/17/2018 04:26 PM, a...@linux-foundation.org wrote: The mm-of-the-moment snapshot 2018-05-17-16-26 has been uploaded to http://www.ozlabs.org/~akpm/mmotm/ mmotm-readme.txt says README for mm-of-the-moment: http://www.ozlabs.org/~akpm/mmotm/ This is a snapshot of my -mm patch queue. Uploaded at random hopefully more than once a week. You will need quilt to apply these patches to the latest Linus release (4.x or 4.x-rcY). The series file is in broken-out.tar.gz and is duplicated in http://ozlabs.org/~akpm/mmotm/series The file broken-out.tar.gz contains two datestamp files: .DATE and .DATE--mm-dd-hh-mm-ss. Both contain the string -mm-dd-hh-mm-ss, followed by the base kernel version against which this patch series is to be applied. This tree is partially included in linux-next. To see which patches are included in linux-next, consult the `series' file. Only the patches within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in linux-next. A git tree which contains the memory management portion of this tree is maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git by Michal Hocko. It contains the patches which are between the "#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series file, http://www.ozlabs.org/~akpm/mmotm/series. A full copy of the full kernel tree with the linux-next and mmotm patches already applied is available through git within an hour of the mmotm release. Individual mmotm releases are tagged. The master branch always points to the latest release, so it's constantly rebasing. >>> >>> >>> on x86_64: with (randconfig): >>> CONFIG_AUTOFS_FS=y >>> CONFIG_AUTOFS4_FS=y >> >> Oh right, I need to make these exclusive. >> >> I seem to remember trying to do that along the way, can't remember why >> I didn't do it in the end. >> >> Any suggestions about potential problems when doing it? > > I think that just using "depends on" for each of them will cause kconfig to > complain about circular dependencies, so probably using "choice" will be > needed. Or (since this is just temporary?) just say "don't do that." > No doubt that was what happened, unfortunately I forgot to return to it. Right, a conditional with a message should work thanks. Ian
[PATCH v2 5/7] hugetlb: add charge_surplus_hugepages attribute
Add an entry for charge_surplus_hugepages to sysfs. Signed-off-by: TSUKADA Koutaro --- hugetlb.c | 25 + 1 file changed, 25 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 9a9549c..2f9bdbc 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2662,6 +2662,30 @@ static ssize_t surplus_hugepages_show(struct kobject *kobj, } HSTATE_ATTR_RO(surplus_hugepages); +static ssize_t charge_surplus_hugepages_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + struct hstate *h = kobj_to_hstate(kobj, NULL); + return sprintf(buf, "%d\n", h->charge_surplus_huge_pages); +} + +static ssize_t charge_surplus_hugepages_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t len) +{ + int err; + unsigned long input; + struct hstate *h = kobj_to_hstate(kobj, NULL); + + err = kstrtoul(buf, 10, &input); + if (err) + return err; + + h->charge_surplus_huge_pages = input ? true : false; + + return len; +} +HSTATE_ATTR(charge_surplus_hugepages); + static struct attribute *hstate_attrs[] = { &nr_hugepages_attr.attr, &nr_overcommit_hugepages_attr.attr, @@ -2671,6 +2695,7 @@ static ssize_t surplus_hugepages_show(struct kobject *kobj, #ifdef CONFIG_NUMA &nr_hugepages_mempolicy_attr.attr, #endif + &charge_surplus_hugepages_attr.attr, NULL, }; -- Tsukada
[PATCH v2 4/7] mm, sysctl: make charging surplus hugepages controllable
Make the default hugetlb surplus hugepage controlable by /proc/sys/vm/charge_surplus_hugepages. Signed-off-by: TSUKADA Koutaro --- include/linux/hugetlb.h |2 ++ kernel/sysctl.c |7 +++ mm/hugetlb.c| 21 + 3 files changed, 30 insertions(+) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 33fe5be..9314b07 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -80,6 +80,8 @@ struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages, void reset_vma_resv_huge_pages(struct vm_area_struct *vma); int hugetlb_sysctl_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); int hugetlb_overcommit_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); +int hugetlb_charge_surplus_handler(struct ctl_table *, int, void __user *, + size_t *, loff_t *); int hugetlb_treat_movable_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); #ifdef CONFIG_NUMA diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 6a78cf7..d562d64 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -1394,6 +1394,13 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write, .mode = 0644, .proc_handler = hugetlb_overcommit_handler, }, + { + .procname = "charge_surplus_hugepages", + .data = NULL, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = hugetlb_charge_surplus_handler, + }, #endif { .procname = "lowmem_reserve_ratio", diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2e7b543..9a9549c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3069,6 +3069,27 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write, return ret; } +int hugetlb_charge_surplus_handler(struct ctl_table *table, int write, + void __user *buffer, size_t *length, loff_t *ppos) +{ + struct hstate *h = &default_hstate; + int tmp, ret; + + if (!hugepages_supported()) + return -EOPNOTSUPP; + + tmp = h->charge_surplus_huge_pages ? 1 : 0; + table->data = &tmp; + table->maxlen = sizeof(int); + ret = proc_dointvec_minmax(table, write, buffer, length, ppos); + if (ret) + goto out; + + if (write) + h->charge_surplus_huge_pages = tmp ? true : false; +out: + return ret; +} #endif /* CONFIG_SYSCTL */ void hugetlb_report_meminfo(struct seq_file *m) -- Tsukada
[PATCH v2 3/7] memcg: use compound_order rather than hpage_nr_pages
The current memcg implementation assumes that the compound page is THP. In order to be able to charge surplus hugepage, we use compound_order. Signed-off-by: TSUKADA Koutaro --- memcontrol.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 2bd3df3..a8f1ff8 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4483,7 +4483,7 @@ static int mem_cgroup_move_account(struct page *page, struct mem_cgroup *to) { unsigned long flags; - unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1; + unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1; int ret; bool anon; @@ -5417,7 +5417,7 @@ int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm, bool compound) { struct mem_cgroup *memcg = NULL; - unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1; + unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1; int ret = 0; if (mem_cgroup_disabled()) @@ -5478,7 +5478,7 @@ int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm, void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg, bool lrucare, bool compound) { - unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1; + unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1; VM_BUG_ON_PAGE(!page->mapping, page); VM_BUG_ON_PAGE(PageLRU(page) && !lrucare, page); @@ -5522,7 +5522,7 @@ void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg, void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg, bool compound) { - unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1; + unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1; if (mem_cgroup_disabled()) return; @@ -5729,7 +5729,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage) /* Force-charge the new page. The old one will be freed soon */ compound = PageTransHuge(newpage); - nr_pages = compound ? hpage_nr_pages(newpage) : 1; + nr_pages = compound ? (1 << compound_order(newpage)) : 1; page_counter_charge(&memcg->memory, nr_pages); if (do_memsw_account()) -- Tsukada
Re: [PATCH 1/2] perf script: Show virtual addresses instead of offsets
Arnaldo, We already have a binary offset handy in perf code but there is no way to dump it with perf script. We can derive it from symname+symoff but that's a manual work. Will it be good to have a '-F binoff' option? Ravi On 05/18/2018 01:29 AM, Arnaldo Carvalho de Melo wrote: > Em Thu, May 17, 2018 at 12:03:25PM +0530, Sandipan Das escreveu: >> When perf data is recorded with the call-graph option enabled, >> the callchain shown by perf script shows the binary offsets of >> the symbols as the ip. This is incorrect for kernel symbols as >> the ip values are always off by a fixed offset depending on the >> architecture. If the offsets from the start of the symbols are >> printed, they are also incorrect for both kernel and userspace >> symbols. >> >> Without the call-graph option, the callchain shows the virtual >> addresses of the symbols rather than their binary offsets. The >> offsets printed in this case are also correct. >> >> This fixes the inconsistency in perf script's output. > > Thanks, tested and applied, > > - Arnaldo >
[PATCH v2 2/7] hugetlb: support migrate charging for surplus hugepages
Surplus hugepages allocated for migration also charge to memory cgroup. Signed-off-by: TSUKADA Koutaro --- hugetlb.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 679c151f..2e7b543 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1687,6 +1687,8 @@ static struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask, if (!page) return NULL; + surplus_hugepage_set_charge(h, page); + /* * We do not account these pages as surplus because they are only * temporary and will be released properly on the last reference -- Tsukada
[PATCH v2 1/7] hugetlb: introduce charge_surplus_huge_pages to struct hstate
The charge_surplus_huge_pages indicates to charge surplus huge pages obteined from the normal page pool to memory cgroup. The default value is false. This patch implements the core part of charging surplus hugepages. Use the private and mem_cgroup member of the second entry of compound hugepage for surplus hugepage charging. Mark when surplus hugepage is obtained from normal pool, and charge to memory cgroup at alloc_huge_page. Once the mapping of the page is decided, commit the charge. surplus hugepages will uncharge or cancel at free_huge_page. Signed-off-by: TSUKADA Koutaro --- include/linux/hugetlb.h |2 mm/hugetlb.c| 100 2 files changed, 102 insertions(+) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 36fa6a2..33fe5be 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -158,6 +158,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, unsigned long address, unsigned long end, pgprot_t newprot); bool is_hugetlb_entry_migration(pte_t pte); +bool PageSurplusCharge(struct page *page); #else /* !CONFIG_HUGETLB_PAGE */ @@ -338,6 +339,7 @@ struct hstate { unsigned int nr_huge_pages_node[MAX_NUMNODES]; unsigned int free_huge_pages_node[MAX_NUMNODES]; unsigned int surplus_huge_pages_node[MAX_NUMNODES]; + bool charge_surplus_huge_pages; /* default to off */ #ifdef CONFIG_CGROUP_HUGETLB /* cgroup control files */ struct cftype cgroup_files[5]; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2186791..679c151f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -36,6 +36,7 @@ #include #include #include +#include #include "internal.h" int hugetlb_max_hstate __read_mostly; @@ -1236,6 +1237,90 @@ static inline void ClearPageHugeTemporary(struct page *page) page[2].mapping = NULL; } +#define HUGETLB_SURPLUS_CHARGE 1UL + +bool PageSurplusCharge(struct page *page) +{ + if (!PageHuge(page)) + return false; + return page[1].private == HUGETLB_SURPLUS_CHARGE; +} + +static inline void SetPageSurplusCharge(struct page *page) +{ + page[1].private = HUGETLB_SURPLUS_CHARGE; +} + +static inline void ClearPageSurplusCharge(struct page *page) +{ + page[1].private = 0; +} + +static inline void +set_surplus_hugepage_memcg(struct page *page, struct mem_cgroup *memcg) +{ + page[1].mem_cgroup = memcg; +} + +static inline struct mem_cgroup *get_surplus_hugepage_memcg(struct page *page) +{ + return page[1].mem_cgroup; +} + +static void surplus_hugepage_set_charge(struct hstate *h, struct page *page) +{ + if (likely(!h->charge_surplus_huge_pages)) + return; + if (unlikely(!page)) + return; + SetPageSurplusCharge(page); +} + +static int surplus_hugepage_try_charge(struct page *page, struct mm_struct *mm) +{ + struct mem_cgroup *memcg; + + if (likely(!PageSurplusCharge(page))) + return 0; + + if (mem_cgroup_try_charge(page, mm, GFP_KERNEL, &memcg, true)) { + /* mem_cgroup oom invoked */ + ClearPageSurplusCharge(page); + return -ENOMEM; + } + set_surplus_hugepage_memcg(page, memcg); + + return 0; +} + +static void surplus_hugepage_commit_charge(struct page *page) +{ + struct mem_cgroup *memcg; + + if (likely(!PageSurplusCharge(page))) + return; + + memcg = get_surplus_hugepage_memcg(page); + mem_cgroup_commit_charge(page, memcg, false, true); + set_surplus_hugepage_memcg(page, NULL); +} + +static void surplus_hugepage_finalize_charge(struct page *page) +{ + struct mem_cgroup *memcg; + + if (likely(!PageSurplusCharge(page))) + return; + + memcg = get_surplus_hugepage_memcg(page); + if (memcg) + mem_cgroup_cancel_charge(page, memcg, true); + else + mem_cgroup_uncharge(page); + set_surplus_hugepage_memcg(page, NULL); + ClearPageSurplusCharge(page); +} + void free_huge_page(struct page *page) { /* @@ -1248,6 +1333,8 @@ void free_huge_page(struct page *page) (struct hugepage_subpool *)page_private(page); bool restore_reserve; + surplus_hugepage_finalize_charge(page); + set_page_private(page, 0); page->mapping = NULL; VM_BUG_ON_PAGE(page_count(page), page); @@ -1583,6 +1670,8 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask, out_unlock: spin_unlock(&hugetlb_lock); + surplus_hugepage_set_charge(h, page); + return page; } @@ -2062,6 +2151,11 @@ struct page *alloc_huge_page(struct vm_area_struct *vma, hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, page); spin_unlock(&hugetlb_lock); + if (unlikely(surplus_hugepage_try_charge(page, vma->vm_mm))) { +
[PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg
Thanks to Mike Kravetz for comment on the previous version patch. The purpose of this patch-set is to make it possible to control whether or not to charge surplus hugetlb pages obtained by overcommitting to memory cgroup. In the future, I am trying to accomplish limiting the memory usage of applications that use both normal pages and hugetlb pages by the memory cgroup(not use the hugetlb cgroup). Applications that use shared libraries like libhugetlbfs.so use both normal pages and hugetlb pages, but we do not know how much to use each. Please suppose you want to manage the memory usage of such applications by cgroup How do you set the memory cgroup and hugetlb cgroup limit when you want to limit memory usage to 10GB? If you set a limit of 10GB for each, the user can use a total of 20GB of memory and can not limit it well. Since it is difficult to estimate the ratio used by user of normal pages and hugetlb pages, setting limits of 2GB to memory cgroup and 8GB to hugetlb cgroup is not very good idea. In such a case, I thought that by using my patch-set, we could manage resources just by setting 10GB as the limit of memory cgoup(there is no limit to hugetlb cgroup). In this patch-set, introduce the charge_surplus_huge_pages(boolean) to struct hstate. If it is true, it charges to the memory cgroup to which the task that obtained surplus hugepages belongs. If it is false, do nothing as before, and the default value is false. The charge_surplus_huge_pages can be controlled procfs or sysfs interfaces. Since THP is very effective in environments with kernel page size of 4KB, such as x86, there is no reason to positively use HugeTLBfs, so I think that there is no situation to enable charge_surplus_huge_pages. However, in some distributions such as arm64, the page size of the kernel is 64KB, and the size of THP is too huge as 512MB, making it difficult to use. HugeTLBfs may support multiple huge page sizes, and in such a special environment there is a desire to use HugeTLBfs. The patch set is for 4.17.0-rc3+. I don't know whether patch-set are acceptable or not, so I just done a simple test. Thanks, Tsukada TSUKADA Koutaro (7): hugetlb: introduce charge_surplus_huge_pages to struct hstate hugetlb: supports migrate charging for surplus hugepages memcg: use compound_order rather than hpage_nr_pages mm, sysctl: make charging surplus hugepages controllable hugetlb: add charge_surplus_hugepages attribute Documentation, hugetlb: describe about charge_surplus_hugepages memcg: supports movement of surplus hugepages statistics Documentation/vm/hugetlbpage.txt |6 + include/linux/hugetlb.h |4 + kernel/sysctl.c |7 + mm/hugetlb.c | 148 +++ mm/memcontrol.c | 109 +++- 5 files changed, 269 insertions(+), 5 deletions(-) -- Tsukada
Re: mmotm 2018-05-17-16-26 uploaded (autofs)
On 05/17/2018 08:50 PM, Ian Kent wrote: > On 18/05/18 08:21, Randy Dunlap wrote: >> On 05/17/2018 04:26 PM, a...@linux-foundation.org wrote: >>> The mm-of-the-moment snapshot 2018-05-17-16-26 has been uploaded to >>> >>>http://www.ozlabs.org/~akpm/mmotm/ >>> >>> mmotm-readme.txt says >>> >>> README for mm-of-the-moment: >>> >>> http://www.ozlabs.org/~akpm/mmotm/ >>> >>> This is a snapshot of my -mm patch queue. Uploaded at random hopefully >>> more than once a week. >>> >>> You will need quilt to apply these patches to the latest Linus release (4.x >>> or 4.x-rcY). The series file is in broken-out.tar.gz and is duplicated in >>> http://ozlabs.org/~akpm/mmotm/series >>> >>> The file broken-out.tar.gz contains two datestamp files: .DATE and >>> .DATE--mm-dd-hh-mm-ss. Both contain the string -mm-dd-hh-mm-ss, >>> followed by the base kernel version against which this patch series is to >>> be applied. >>> >>> This tree is partially included in linux-next. To see which patches are >>> included in linux-next, consult the `series' file. Only the patches >>> within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in >>> linux-next. >>> >>> A git tree which contains the memory management portion of this tree is >>> maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git >>> by Michal Hocko. It contains the patches which are between the >>> "#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series >>> file, http://www.ozlabs.org/~akpm/mmotm/series. >>> >>> >>> A full copy of the full kernel tree with the linux-next and mmotm patches >>> already applied is available through git within an hour of the mmotm >>> release. Individual mmotm releases are tagged. The master branch always >>> points to the latest release, so it's constantly rebasing. >> >> >> on x86_64: with (randconfig): >> CONFIG_AUTOFS_FS=y >> CONFIG_AUTOFS4_FS=y > > Oh right, I need to make these exclusive. > > I seem to remember trying to do that along the way, can't remember why > I didn't do it in the end. > > Any suggestions about potential problems when doing it? I think that just using "depends on" for each of them will cause kconfig to complain about circular dependencies, so probably using "choice" will be needed. Or (since this is just temporary?) just say "don't do that." -- ~Randy
Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote: > > > On 2018/5/17 23:14, Jason Gunthorpe wrote: > > On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote: > >> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c > >> b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c > >> index 86ef15f..e1c44a6 100644 > >> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c > >> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev > >> *hr_dev, > >>int ret = 0; > >>int ntc; > >> > >> + if (hr_dev->is_reset) > >> + return 0; > >> + > >>spin_lock_bh(&csq->lock); > >> > >>if (num > hns_roce_cmq_space(csq)) { > >> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct > >> hnae3_handle *handle) > >>return 0; > >> > >> error_failed_get_cfg: > >> + handle->priv = NULL; > >>kfree(hr_dev->priv); > >> > >> error_failed_kzalloc: > >> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct > >> hnae3_handle *handle, > >> { > >>struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv; > >> > >> + if (!hr_dev) > >> + return; > >> + > >>hns_roce_exit(hr_dev); > >> + handle->priv = NULL; > >>kfree(hr_dev->priv); > >>ib_dealloc_device(&hr_dev->ib_dev); > >> } > > Why are these hunks here? If init fails then uninit should not be > > called, so why meddle with priv? > In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with > hr_dev, > We want clear the value in hns_roce_hw_v2_uninit_instance function. > So we can ensure no problem in RoCE driver. What problem could happen? I keep removing unnecessary sets to null and checks of null, so please don't add them if they cannot happen. Eg uninit should never be called with a null priv, that is a serious logic mis-design someplace if it happens. Jason
Re: Revert "dmaengine: pl330: add DMA_PAUSE feature"
On 17-05-18, 12:20, Frank Mori Hess wrote: > Sorry to keep coming back to this, but I'm experiencing a bit of > incredulity that you are saying what you seem to be saying. You seem > to be saying dmaengine provides no way to permanently stop a transfer > safely other than transferring the full number of bytes initially > requested. So the proper resolution is the 8250 serial driver needs > to remove rx dma support, because they are just trying to do something > that is not supported. > > On Thu, May 17, 2018 at 12:19 AM, Vinod Koul wrote: > >> > Terminate is abort, data loss may happen here. > >> > >> Wait, are you saying if you do > >> > >> dma pause > > > > no data loss > >> read residue > > > > here as well > >> dma terminate > > > > Oh yes, we aborted... > >> > > I see two ways of interpreting what you are saying. First, from the > point of view of the user of the dmaengine api. From this point of > view it is impossible for data loss to occur during pause or reading > the residue, so saying they cause no data loss during > pause/residue/terminate is meaningless. This is because the user > can't confirm any data loss until after they have read the residue and > the transfer is terminated, since optimistically the data may still be > available if only the user would resume and allow the transfer to > continue. > > Second there is the interpretation I want to believe. This is "no > data loss on pause" means that after the pause, no data has been > discarded by the dma controller hardware, in fact all the data it has > read before being paused has been fully transferred to its > destination. Reading the residue while paused gives you an accurate, > up-to-date state of the paused transfer. Then finally, although in > general dma terminate causes data loss, it does not in this case since > we terminated while we were paused and read the up-to-date residue. > This is the interpretation implicit in the 8250 serial driver. You are simply mixing things up! On Pause we don't expect data loss, as user can resume the transfer. This means as you rightly guessed, the DMA HW should not drop any data, nor should SW. Now if you want to read residue at this point it is perfectly valid. But if you decide to terminate the channel (yes it is terminate_all API), we abort and don't have context to report back! As Lars rightly pointed out, residue calculation are very tricky, DMA fifo may have data, some data may be in device FIFO, so residue is always from DMA point of view and may differ from device view (more or less depending upon direction) Now if you require to add more features for your usecase, please do feel free to send a patch. The framework can always be improved, we haven't solved world hunger yet! -- ~Vinod
Re: [PATCH 0/3] Add support to disable sensor groups in P9
On 05/17/2018 06:08 PM, Guenter Roeck wrote: > On 05/16/2018 11:10 PM, Shilpasri G Bhat wrote: >> >> >> On 05/15/2018 08:32 PM, Guenter Roeck wrote: >>> On Thu, Mar 22, 2018 at 04:24:32PM +0530, Shilpasri G Bhat wrote: This patch series adds support to enable/disable OCC based inband-sensor groups at runtime. The environmental sensor groups are managed in HWMON and the remaining platform specific sensor groups are managed in /sys/firmware/opal. The firmware changes required for this patch is posted below: https://lists.ozlabs.org/pipermail/skiboot/2018-March/010812.html >>> >>> Sorry for not getting back earlier. This is a tough one. >>> >> >> Thanks for the reply. I have tried to answer your questions according to my >> understanding below: >> >>> Key problem is that you are changing the ABI with those new attributes. >>> On top of that, the attributes _do_ make some sense (many chips support >>> enabling/disabling of individual sensors), suggesting that those or >>> similar attributes may or even should at some point be added to the ABI. >>> >>> At the same time, returning "0" as measurement values when sensors are >>> disabled does not seem like a good idea, since "0" is a perfectly valid >>> measurement, at least for most sensors. >> >> I agree. >> >>> >>> Given that, we need to have a discussion about adding _enable attributes to >>> the ABI >> >>> what is the scope, >> IIUC the scope should be RW and the attribute is defined for each supported >> sensor group >> > > That is _your_ need. I am not aware of any other chip where a per-sensor group > attribute would make sense. The discussion we need has to extend beyond the > need > of a single chip. > > Guenter > Is it okay if the ABI provides provision for both types of attribute power_enable and powerX_enable. And is it okay to decide which type of attribute to be used by the capability provided by the hwmon chip? - Shilpa >>> when should the attributes exist and when not, >> We control this currently via device-tree >> >>> do we want/need power_enable or powerX_enable or both, and so on), and >> We need power_enable right now >> >>> what to return if a sensor is disabled (such as -ENODATA). >> -ENODATA sounds good. >> >> Thanks and Regards, >> Shilpa >> >> Once we have an >>> agreement, we can continue with an implementation. >>> >>> Guenter >>> Shilpasri G Bhat (3): powernv:opal-sensor-groups: Add support to enable sensor groups hwmon: ibmpowernv: Add attributes to enable/disable sensor groups powernv: opal-sensor-groups: Add attributes to disable/enable sensors .../ABI/testing/sysfs-firmware-opal-sensor-groups | 34 ++ Documentation/hwmon/ibmpowernv | 31 - arch/powerpc/include/asm/opal-api.h| 4 +- arch/powerpc/include/asm/opal.h| 2 + .../powerpc/platforms/powernv/opal-sensor-groups.c | 104 - arch/powerpc/platforms/powernv/opal-wrappers.S | 1 + drivers/hwmon/ibmpowernv.c | 127 +++-- 7 files changed, 265 insertions(+), 38 deletions(-) create mode 100644 Documentation/ABI/testing/sysfs-firmware-opal-sensor-groups -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> >> >
Re: [PATCH] Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE"
On Thu, May 17, 2018 at 10:53:32AM -0700, Laura Abbott wrote: > On 05/17/2018 10:08 AM, Michal Hocko wrote: > >On Thu 17-05-18 18:49:47, Michal Hocko wrote: > >>On Thu 17-05-18 16:58:32, Ville Syrjälä wrote: > >>>On Thu, May 17, 2018 at 04:36:29PM +0300, Ville Syrjälä wrote: > On Thu, May 17, 2018 at 03:21:09PM +0200, Michal Hocko wrote: > >On Thu 17-05-18 15:59:59, Ville Syrjala wrote: > >>From: Ville Syrjälä > >> > >>This reverts commit bad8c6c0b1144694ecb0bc5629ede9b8b578b86e. > >> > >>Make x86 with HIGHMEM=y and CMA=y boot again. > > > >Is there any bug report with some more details? It is much more > >preferable to fix the issue rather than to revert the whole thing > >right away. > > The machine I have in front of me right now didn't give me anything. > Black screen, and netconsole was silent. No serial port on this > machine unfortunately. > >>> > >>>Booted on another machine with serial: > >> > >>Could you provide your .config please? > >> > >>[...] > >>>[0.00] cma: Reserved 4 MiB at 0x3700 > >>[...] > >>>[0.00] BUG: Bad page state in process swapper pfn:377fe > >>>[0.00] page:f53effc0 count:0 mapcount:-127 mapping: > >>>index:0x0 > >> > >>OK, so this looks the be the source of the problem. -128 would be a > >>buddy page but I do not see anything that would set the counter to -127 > >>and the real map count updates shouldn't really happen that early. > >> > >>Maybe CONFIG_DEBUG_VM and CONFIG_DEBUG_HIGHMEM will tell us more. > > > >Looking closer, I _think_ that the bug is in > >set_highmem_pages_init->is_highmem > >and zone_movable_is_highmem might force CMA pages in the zone movable to > >be initialized as highmem. And that sounds supicious to me. Joonsoo? > > > > For a point of reference, arm with this configuration doesn't hit this bug > because highmem pages are freed via the memblock interface only instead > of iterating through each zone. It looks like the x86 highmem code > assumes only a single highmem zone and/or it's disjoint? Good point! Reason of the crash is that the span of MOVABLE_ZONE is extended to whole node span for future CMA initialization, and, normal memory is wrongly freed here. Here goes the fix. Ville, Could you test below patch? I re-generated the issue on my side and this patch fixed it. Thanks. >8- >From 569899a4dbd28cebb8d350d3d1ebb590d88b2629 Mon Sep 17 00:00:00 2001 From: Joonsoo Kim Date: Fri, 18 May 2018 10:52:05 +0900 Subject: [PATCH] x86/32/highmem: check if the zone is matched when free highmem pages on init If CONFIG_CMA is enabled, it extends the span of the MOVABLE_ZONE to manage the CMA memory later. And, in this case, the span of the MOVABLE_ZONE could overlap the other zone's memory. We need to avoid freeing this overlapped memory here since it would be the memory of the other zone. Therefore, this patch adds a check whether the page is indeed on the requested zone or not. Skipped page will be freed when the memory of the matched zone is freed. Reported-by: Ville Syrjälä Signed-off-by: Joonsoo Kim --- arch/x86/include/asm/highmem.h | 4 ++-- arch/x86/mm/highmem_32.c | 5 - arch/x86/mm/init_32.c | 25 + 3 files changed, 27 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h index a805993..e383f57 100644 --- a/arch/x86/include/asm/highmem.h +++ b/arch/x86/include/asm/highmem.h @@ -72,8 +72,8 @@ void *kmap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot); #define flush_cache_kmaps()do { } while (0) -extern void add_highpages_with_active_regions(int nid, unsigned long start_pfn, - unsigned long end_pfn); +extern void add_highpages_with_active_regions(int nid, struct zone *zone, + unsigned long start_pfn, unsigned long end_pfn); #endif /* __KERNEL__ */ diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c index 6d18b70..bf9f5b8 100644 --- a/arch/x86/mm/highmem_32.c +++ b/arch/x86/mm/highmem_32.c @@ -120,6 +120,9 @@ void __init set_highmem_pages_init(void) if (!is_highmem(zone)) continue; + if (!populated_zone(zone)) + continue; + zone_start_pfn = zone->zone_start_pfn; zone_end_pfn = zone_start_pfn + zone->spanned_pages; @@ -127,7 +130,7 @@ void __init set_highmem_pages_init(void) printk(KERN_INFO "Initializing %s for node %d (%08lx:%08lx)\n", zone->name, nid, zone_start_pfn, zone_end_pfn); - add_highpages_with_active_regions(nid, zone_start_pfn, + add_highpages_with_active_regions(nid, zone, zone_start_pfn, zone_end_pfn); } } diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 8008db2..f
[PATCH v2 5/5] arm64: dts: rockchip: Add sdmmc UHS support for roc-rk3328-cc
From: Levin Du In roc-rk3328-cc board, the signal voltage of sdmmc is supplied by the vcc_sdio regulator, which is a mux between 1.8V and 3.3V, controlled by a special output only gpio pin labeled "gpiomut_pmuio_iout", corresponding bit 1 of the syscon GRF_SOC_CON10. This special pin can now be reference as <&gpio_mute 1>, thanks to the gpio-syscon driver, which makes writing regulator-gpio possible. If the signal voltage changes, the io domain needs to change correspondingly. To use this feature, the following options are required in kernel config: - CONFIG_GPIO_SYSCON=y - CONFIG_POWER_AVS=y - CONFIG_ROCKCHIP_IODOMAIN=y Signed-off-by: Levin Du --- Changes in v2: - Rename gpio_syscon10 to gpio_mute in rk3328-roc-cc.dts Changes in v1: - Split into small patches - Sort dts properties in sdmmc node arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts b/arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts index b983abd..e3162bb 100644 --- a/arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts +++ b/arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts @@ -41,6 +41,19 @@ vin-supply = <&vcc_io>; }; + vcc_sdio: sdmmcio-regulator { + compatible = "regulator-gpio"; + gpios = <&gpio_mute 1 GPIO_ACTIVE_HIGH>; + states = <180 0x1 + 330 0x0>; + regulator-name = "vcc_sdio"; + regulator-type = "voltage"; + regulator-min-microvolt = <180>; + regulator-max-microvolt = <330>; + regulator-always-on; + vin-supply = <&vcc_sys>; + }; + vcc_host1_5v: vcc_otg_5v: vcc-host1-5v-regulator { compatible = "regulator-fixed"; enable-active-high; @@ -213,7 +226,7 @@ vccio1-supply = <&vcc_io>; vccio2-supply = <&vcc18_emmc>; - vccio3-supply = <&vcc_io>; + vccio3-supply = <&vcc_sdio>; vccio4-supply = <&vcc_18>; vccio5-supply = <&vcc_io>; vccio6-supply = <&vcc_io>; @@ -242,7 +255,12 @@ max-frequency = <15000>; pinctrl-names = "default"; pinctrl-0 = <&sdmmc0_clk &sdmmc0_cmd &sdmmc0_dectn &sdmmc0_bus4>; + sd-uhs-sdr12; + sd-uhs-sdr25; + sd-uhs-sdr50; + sd-uhs-sdr104; vmmc-supply = <&vcc_sd>; + vqmmc-supply = <&vcc_sdio>; status = "okay"; }; -- 2.7.4
[PATCH v2 2/5] gpio: syscon: Add gpio-syscon for rockchip
From: Levin Du Some GPIOs sit in the GRF_SOC_CON registers of Rockchip SoCs, which do not belong to the general pinctrl. Adding gpio-syscon support makes controlling regulator or LED using these special pins very easy by reusing existing drivers, such as gpio-regulator and led-gpio. Signed-off-by: Levin Du --- Changes in v2: - Rename gpio_syscon10 to gpio_mute in doc Changes in v1: - Refactured for general gpio-syscon usage for Rockchip SoCs. - Add doc rockchip,gpio-syscon.txt .../bindings/gpio/rockchip,gpio-syscon.txt | 41 ++ drivers/gpio/gpio-syscon.c | 30 2 files changed, 71 insertions(+) create mode 100644 Documentation/devicetree/bindings/gpio/rockchip,gpio-syscon.txt diff --git a/Documentation/devicetree/bindings/gpio/rockchip,gpio-syscon.txt b/Documentation/devicetree/bindings/gpio/rockchip,gpio-syscon.txt new file mode 100644 index 000..b1b2a67 --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/rockchip,gpio-syscon.txt @@ -0,0 +1,41 @@ +* Rockchip GPIO support for GRF_SOC_CON registers + +Required properties: +- compatible: Should contain "rockchip,gpio-syscon". +- gpio-controller: Marks the device node as a gpio controller. +- #gpio-cells: Should be two. The first cell is the pin number and + the second cell is used to specify the gpio polarity: +0 = Active high, +1 = Active low. +- gpio,syscon-dev: Should contain . + If declared as child of the grf node, the grf_phandle can be 0. + +Example: + +1. As child of grf node: + + grf: syscon@ff10 { + compatible = "rockchip,rk3328-grf", "syscon", "simple-mfd"; + + gpio_mute: gpio-mute { + compatible = "rockchip,gpio-syscon"; + gpio-controller; + #gpio-cells = <2>; + gpio,syscon-dev = <0 0x0428 0>; + }; + }; + + +2. Not child of grf node: + + grf: syscon@ff10 { + compatible = "rockchip,rk3328-grf", "syscon", "simple-mfd"; + //... + }; + + gpio_mute: gpio-mute { + compatible = "rockchip,gpio-syscon"; + gpio-controller; + #gpio-cells = <2>; + gpio,syscon-dev = <&grf 0x0428 0>; + }; diff --git a/drivers/gpio/gpio-syscon.c b/drivers/gpio/gpio-syscon.c index 7325b86..e24b408 100644 --- a/drivers/gpio/gpio-syscon.c +++ b/drivers/gpio/gpio-syscon.c @@ -135,6 +135,32 @@ static const struct syscon_gpio_data clps711x_mctrl_gpio = { .dat_bit_offset = 0x40 * 8 + 8, }; +static void rockchip_gpio_set(struct gpio_chip *chip, unsigned int offset, + int val) +{ + struct syscon_gpio_priv *priv = gpiochip_get_data(chip); + unsigned int offs; + u8 bit; + u32 data; + int ret; + + offs = priv->dreg_offset + priv->data->dat_bit_offset + offset; + bit = offs % SYSCON_REG_BITS; + data = (val ? BIT(bit) : 0) | BIT(bit + 16); + ret = regmap_write(priv->syscon, + (offs / SYSCON_REG_BITS) * SYSCON_REG_SIZE, + data); + if (ret < 0) + dev_err(chip->parent, "gpio write failed ret(%d)\n", ret); +} + +static const struct syscon_gpio_data rockchip_gpio_syscon = { + /* Rockchip GRF_SOC_CON Bits 0-15 */ + .flags = GPIO_SYSCON_FEAT_OUT, + .bit_count = 16, + .set= rockchip_gpio_set, +}; + #define KEYSTONE_LOCK_BIT BIT(0) static void keystone_gpio_set(struct gpio_chip *chip, unsigned offset, int val) @@ -175,6 +201,10 @@ static const struct of_device_id syscon_gpio_ids[] = { .compatible = "ti,keystone-dsp-gpio", .data = &keystone_dsp_gpio, }, + { + .compatible = "rockchip,gpio-syscon", + .data = &rockchip_gpio_syscon, + }, { } }; MODULE_DEVICE_TABLE(of, syscon_gpio_ids); -- 2.7.4
[PATCH v2 1/5] gpio: syscon: allow fetching syscon from parent node
From: Heiko Stuebner Syscon nodes can be a simple-mfd and the syscon-users then be declared as children of this node. That way the parent-child structure can be better represented for devices that are fully embedded in the syscon. Therefore allow getting the syscon from the parent if neither a special compatible nor a gpio,syscon-dev property is defined. Signed-off-by: Heiko Stuebner Signed-off-by: Levin Du --- Changes in v2: None Changes in v1: - New: allow fetching syscon from parent node in gpio-syscon driver drivers/gpio/gpio-syscon.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpio/gpio-syscon.c b/drivers/gpio/gpio-syscon.c index 537cec7..7325b86 100644 --- a/drivers/gpio/gpio-syscon.c +++ b/drivers/gpio/gpio-syscon.c @@ -205,6 +205,8 @@ static int syscon_gpio_probe(struct platform_device *pdev) } else { priv->syscon = syscon_regmap_lookup_by_phandle(np, "gpio,syscon-dev"); + if (IS_ERR(priv->syscon) && np->parent) + priv->syscon = syscon_node_to_regmap(np->parent); if (IS_ERR(priv->syscon)) return PTR_ERR(priv->syscon); -- 2.7.4
[PATCH v2 3/5] arm64: dts: rockchip: Add gpio-mute to rk3328
From: Levin Du Adding a new gpio controller named "gpio-mute" to rk3328, providing access to the GPIO_MUTE pin defined in the syscon GRF_SOC_CON10. The GPIO_MUTE pin is referred to as <&gpio-mute 1>. Signed-off-by: Levin Du --- Changes in v2: - Rename gpio_syscon10 to gpio_mute in rk3328.dtsi Changes in v1: - Split from V0 and add to rk3328.dtsi for general use. arch/arm64/boot/dts/rockchip/rk3328.dtsi | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/boot/dts/rockchip/rk3328.dtsi b/arch/arm64/boot/dts/rockchip/rk3328.dtsi index b8e9da1..5ba29d3 100644 --- a/arch/arm64/boot/dts/rockchip/rk3328.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3328.dtsi @@ -309,6 +309,13 @@ mode-loader = ; }; + /* The GPIO_MUTE pin is referred to as <&gpio-mute 1>.*/ + gpio_mute: gpio-mute { + compatible = "rockchip,gpio-syscon"; + gpio-controller; + #gpio-cells = <2>; + gpio,syscon-dev = <0 0x0428 0>; + }; }; uart0: serial@ff11 { -- 2.7.4
[PATCH v2 4/5] arm64: dts: rockchip: Add io-domain to roc-rk3328-cc
From: Levin Du It is necessary for the io domain setting of the SoC to match the voltage supplied by the regulators. Signed-off-by: Levin Du --- Changes in v2: None Changes in v1: - Split from V0. arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts | 12 1 file changed, 12 insertions(+) diff --git a/arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts b/arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts index 246c317..b983abd 100644 --- a/arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts +++ b/arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts @@ -208,6 +208,18 @@ }; }; +&io_domains { + status = "okay"; + + vccio1-supply = <&vcc_io>; + vccio2-supply = <&vcc18_emmc>; + vccio3-supply = <&vcc_io>; + vccio4-supply = <&vcc_18>; + vccio5-supply = <&vcc_io>; + vccio6-supply = <&vcc_io>; + pmuio-supply = <&vcc_io>; +}; + &pinctrl { pmic { pmic_int_l: pmic-int-l { -- 2.7.4
Re: mmotm 2018-05-17-16-26 uploaded (autofs)
On 18/05/18 08:21, Randy Dunlap wrote: > On 05/17/2018 04:26 PM, a...@linux-foundation.org wrote: >> The mm-of-the-moment snapshot 2018-05-17-16-26 has been uploaded to >> >>http://www.ozlabs.org/~akpm/mmotm/ >> >> mmotm-readme.txt says >> >> README for mm-of-the-moment: >> >> http://www.ozlabs.org/~akpm/mmotm/ >> >> This is a snapshot of my -mm patch queue. Uploaded at random hopefully >> more than once a week. >> >> You will need quilt to apply these patches to the latest Linus release (4.x >> or 4.x-rcY). The series file is in broken-out.tar.gz and is duplicated in >> http://ozlabs.org/~akpm/mmotm/series >> >> The file broken-out.tar.gz contains two datestamp files: .DATE and >> .DATE--mm-dd-hh-mm-ss. Both contain the string -mm-dd-hh-mm-ss, >> followed by the base kernel version against which this patch series is to >> be applied. >> >> This tree is partially included in linux-next. To see which patches are >> included in linux-next, consult the `series' file. Only the patches >> within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in >> linux-next. >> >> A git tree which contains the memory management portion of this tree is >> maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git >> by Michal Hocko. It contains the patches which are between the >> "#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series >> file, http://www.ozlabs.org/~akpm/mmotm/series. >> >> >> A full copy of the full kernel tree with the linux-next and mmotm patches >> already applied is available through git within an hour of the mmotm >> release. Individual mmotm releases are tagged. The master branch always >> points to the latest release, so it's constantly rebasing. > > > on x86_64: with (randconfig): > CONFIG_AUTOFS_FS=y > CONFIG_AUTOFS4_FS=y Oh right, I need to make these exclusive. I seem to remember trying to do that along the way, can't remember why I didn't do it in the end. Any suggestions about potential problems when doing it? Thanks, Ian
Re: [PATCH v2 3/9] security: define security_kernel_read_blob() wrapper
Casey Schaufler writes: > On 5/17/2018 7:48 AM, Mimi Zohar wrote: >> In order for LSMs and IMA-appraisal to differentiate between the original >> and new syscalls (eg. kexec, kernel modules, firmware), both the original >> and new syscalls must call an LSM hook. >> >> Commit 2e72d51b4ac3 ("security: introduce kernel_module_from_file hook") >> introduced calling security_kernel_module_from_file() in both the original >> and new syscalls. Commit a1db74209483 ("module: replace >> copy_module_from_fd with kernel version") replaced these LSM calls with >> security_kernel_read_file(). >> >> Commit e40ba6d56b41 ("firmware: replace call to fw_read_file_contents() >> with kernel version") and commit b804defe4297 ("kexec: replace call to >> copy_file_from_fd() with kernel version") replaced their own version of >> reading a file from the kernel with the generic >> kernel_read_file_from_path/fd() versions, which call the pre and post >> security_kernel_read_file LSM hooks. >> >> Missing are LSM calls in the original kexec syscall and firmware sysfs >> fallback method. From a technical perspective there is no justification >> for defining a new LSM hook, as the existing security_kernel_read_file() >> works just fine. The original syscalls, however, do not read a file, so >> the security hook name is inappropriate. Instead of defining a new LSM >> hook, this patch defines security_kernel_read_blob() as a wrapper for >> the existing LSM security_kernel_file_read() hook. > > What a marvelous opportunity to bikeshed! > > I really dislike adding another security_ interface just because > the name isn't quite right. Especially a wrapper, which is just > code and execution overhead. Why not change security_kernel_read_file() > to security_kernel_read_blob() everywhere and be done? Nacked-by: "Eric W. Biederman" Nack on this sharing nonsense. These two interfaces do not share any code in their implementations other than the if statement to distinguish between the two cases. Casey you are wrong. We need something different here. Mimi a wrapper does not cut it. The code is not shared. Despite using a single function call today. If we want comprehensible and maintainable code in the security modules we need to split these two pieces of functionality apart. Eric
[PATCH v2 0/5] Add sdmmc UHS support to ROC-RK3328-CC board.
From: Levin Du Hi all, this is an attemp to add sdmmc UHS support to the ROC-RK3328-CC board. This patch series adds a new compatible `rockchip,gpio-syscon` to the gpio-syscon driver for general Rockchip SoC usage. A new gpio controller named `gpio_mute` is defined in rk3328.dtsi so that all rk3328 boards has access to it. The ROC-RK3328-CC board use the new gpio <&gpio_mute 1> in gpio-regulator to control the signal voltage of the sdmmc. It is essential for UHS support which requires 1.8V signal voltage. Many thanks to Heiko's great advice! Changes in v2: - Rename gpio_syscon10 to gpio_mute in doc - Rename gpio_syscon10 to gpio_mute in rk3328.dtsi - Rename gpio_syscon10 to gpio_mute in rk3328-roc-cc.dts Changes in v1: - New: allow fetching syscon from parent node in gpio-syscon driver - Refactured for general gpio-syscon usage for Rockchip SoCs. - Add doc rockchip,gpio-syscon.txt - Split from V0 into small patches - Sort dts properties in sdmmc node Heiko Stuebner (1): gpio: syscon: allow fetching syscon from parent node Levin Du (4): gpio: syscon: Add gpio-syscon for rockchip arm64: dts: rockchip: Add gpio-mute to rk3328 arm64: dts: rockchip: Add io-domain to roc-rk3328-cc arm64: dts: rockchip: Add sdmmc UHS support for roc-rk3328-cc .../bindings/gpio/rockchip,gpio-syscon.txt | 41 ++ arch/arm64/boot/dts/rockchip/rk3328-roc-cc.dts | 30 arch/arm64/boot/dts/rockchip/rk3328.dtsi | 7 drivers/gpio/gpio-syscon.c | 32 + 4 files changed, 110 insertions(+) create mode 100644 Documentation/devicetree/bindings/gpio/rockchip,gpio-syscon.txt -- 2.7.4
[PATCH v2] net: qcom/emac: Allocate buffers from local node
Currently we use non-NUMA aware allocation for TPD and RRD buffers, this patch modifies to use NUMA friendly allocation. Signed-off-by: Hemanth Puranik --- Change since v1: - Addressed comments related to ordering drivers/net/ethernet/qualcomm/emac/emac-mac.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c index 092718a..031f6e6 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c @@ -683,10 +683,11 @@ static int emac_tx_q_desc_alloc(struct emac_adapter *adpt, struct emac_tx_queue *tx_q) { struct emac_ring_header *ring_header = &adpt->ring_header; + int node = dev_to_node(adpt->netdev->dev.parent); size_t size; size = sizeof(struct emac_buffer) * tx_q->tpd.count; - tx_q->tpd.tpbuff = kzalloc(size, GFP_KERNEL); + tx_q->tpd.tpbuff = kzalloc_node(size, GFP_KERNEL, node); if (!tx_q->tpd.tpbuff) return -ENOMEM; @@ -723,11 +724,12 @@ static void emac_rx_q_bufs_free(struct emac_adapter *adpt) static int emac_rx_descs_alloc(struct emac_adapter *adpt) { struct emac_ring_header *ring_header = &adpt->ring_header; + int node = dev_to_node(adpt->netdev->dev.parent); struct emac_rx_queue *rx_q = &adpt->rx_q; size_t size; size = sizeof(struct emac_buffer) * rx_q->rfd.count; - rx_q->rfd.rfbuff = kzalloc(size, GFP_KERNEL); + rx_q->rfd.rfbuff = kzalloc_node(size, GFP_KERNEL, node); if (!rx_q->rfd.rfbuff) return -ENOMEM; -- Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
On 2018/5/17 23:14, Jason Gunthorpe wrote: > On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote: >> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >> b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >> index 86ef15f..e1c44a6 100644 >> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev, >> int ret = 0; >> int ntc; >> >> +if (hr_dev->is_reset) >> +return 0; >> + >> spin_lock_bh(&csq->lock); >> >> if (num > hns_roce_cmq_space(csq)) { >> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct >> hnae3_handle *handle) >> return 0; >> >> error_failed_get_cfg: >> +handle->priv = NULL; >> kfree(hr_dev->priv); >> >> error_failed_kzalloc: >> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct >> hnae3_handle *handle, >> { >> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv; >> >> +if (!hr_dev) >> +return; >> + >> hns_roce_exit(hr_dev); >> +handle->priv = NULL; >> kfree(hr_dev->priv); >> ib_dealloc_device(&hr_dev->ib_dev); >> } > Why are these hunks here? If init fails then uninit should not be > called, so why meddle with priv? In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with hr_dev, We want clear the value in hns_roce_hw_v2_uninit_instance function. So we can ensure no problem in RoCE driver. static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle) { struct hns_roce_dev *hr_dev; int ret; hr_dev = (struct hns_roce_dev *)ib_alloc_device(sizeof(*hr_dev)); if (!hr_dev) return -ENOMEM; ...// other code handle->priv = hr_dev; // other code return 0; error_xxx: handle->priv = NULL; ...// other code error_: ib_dealloc_device(&hr_dev->ib_dev); return ret; } static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle, bool reset) { struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv; if (!hr_dev) return; hns_roce_exit(hr_dev); handle->priv = NULL; kfree(hr_dev->priv); ib_dealloc_device(&hr_dev->ib_dev); } > > Jason > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > >
Re: [PATCHv2][SMB3] Add kernel trace support
Very nice. Acked-by: Ronnie Sahlberg Possibly change the output from pid=6633 tid=0x0 sid=0x0 cmd=0 mid=0 to cmd=0 mid=0 pid=6633 tid=0x0 sid=0x0 just to make it easier for human-searching. I think the cmd will be useful much more often than pid/tid/sid and this would make it easier to look for as all cmd= entries will be aligned to the same column. - Original Message - From: "Steve French" To: "CIFS" , "LKML" , "samba-technical" , "linux-fsdevel" Sent: Friday, 18 May, 2018 12:36:36 PM Subject: [PATCHv2][SMB3] Add kernel trace support Patch updated with additional tracepoint locations and some formatting improvements. There are some obvious additional tracepoints that could be added, but this should be a reasonable group to start with. >From edc02d6f9dc24963d510c7ef59067428d3b082d3 Mon Sep 17 00:00:00 2001 From: Steve French Date: Thu, 17 May 2018 21:16:55 -0500 Subject: [PATCH] smb3: Add ftrace tracepoints for improved SMB3 debugging Although dmesg logs and wireshark network traces can be helpful, being able to dynamically enable/disable tracepoints (in this case via the kernel ftrace mechanism) can also be helpful in more quickly debugging problems, and more selectively tracing the events related to the bug report. This patch adds 12 ftrace tracepoints to cifs.ko for SMB3 events in some obvious locations. Subsequent patches will add more as needed. Example use: trace-cmd record -e cifs trace-cmd show Various trace events can be filtered. See: trace-cmd list | grep cifs for the current list of cifs tracepoints. Sample output (from mount and writing to a file): root@smf:/sys/kernel/debug/tracing/events/cifs# trace-cmd show mount.cifs-6633 [006] 7246.936461: smb3_cmd_done: pid=6633 tid=0x0 sid=0x0 cmd=0 mid=0 mount.cifs-6633 [006] 7246.936701: smb3_cmd_err: pid=6633 tid=0x0 sid=0x3d9cf8e5 cmd=1 mid=1 status=0xc016 rc=-5 mount.cifs-6633 [006] 7246.943055: smb3_cmd_done: pid=6633 tid=0x0 sid=0x3d9cf8e5 cmd=1 mid=2 mount.cifs-6633 [006] 7246.943298: smb3_cmd_done: pid=6633 tid=0xf9447636 sid=0x3d9cf8e5 cmd=3 mid=3 mount.cifs-6633 [006] 7246.943446: smb3_cmd_done: pid=6633 tid=0xf9447636 sid=0x3d9cf8e5 cmd=11 mid=4 mount.cifs-6633 [006] 7246.943659: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=3 mid=5 mount.cifs-6633 [006] 7246.943766: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=11 mid=6 mount.cifs-6633 [006] 7246.943937: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=5 mid=7 mount.cifs-6633 [006] 7246.944020: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=8 mount.cifs-6633 [006] 7246.944091: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=9 mount.cifs-6633 [006] 7246.944163: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=10 mount.cifs-6633 [006] 7246.944218: smb3_cmd_err: pid=6633 tid=0xf9447636 sid=0x3d9cf8e5 cmd=11 mid=11 status=0xc225 rc=-2 mount.cifs-6633 [006] 7246.944219: smb3_fsctl_err: xid=0 fid=0x tid=0xf9447636 sid=0x3d9cf8e5 class=0 type=393620 rc=-2 mount.cifs-6633 [007] 7246.944353: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=12 bash-2071 [000] 7256.903844: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=5 mid=13 bash-2071 [000] 7256.904172: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=14 bash-2071 [000] 7256.904471: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=17 mid=15 bash-2071 [000] 7256.904950: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=5 mid=16 bash-2071 [000] 7256.905305: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=17 mid=17 bash-2071 [000] 7256.905688: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=6 mid=18 bash-2071 [000] 7256.905809: smb3_write_done: xid=0 fid=0xd628f511 tid=0xe1b781a sid=0x3d9cf8e5 offset=0x0 len=0x1b Signed-off-by: Steve French --- fs/cifs/Makefile | 7 +- fs/cifs/smb2maperror.c | 10 +- fs/cifs/smb2pdu.c | 56 +++- fs/cifs/trace.c| 18 +++ fs/cifs/trace.h| 298 + 5 files changed, 379 insertions(+), 10 deletions(-) create mode 100644 fs/cifs/trace.c create mode 100644 fs/cifs/trace.h diff --git a/fs/cifs/Makefile b/fs/cifs/Makefile index 7e4a1e2f0696..85817991ee68 100644 --- a/fs/cifs/Makefile +++ b/fs/cifs/Makefile @@ -1,11 +1,12 @@ # SPDX-License-Identifier: GPL-2.0 # -# Makefile for Linux CIFS VFS client +# Makefile for Linux CIFS/SMB2/SMB3 VFS client # +ccflags-y += -I$(src)# needed for trace events obj-$(CONFIG_CIFS) += cifs.o -cifs-y := cifsfs.o cifssmb.o cifs_debug.o connect.o dir.o file.o inode.o \ - li
[PATCH v2] Print the memcg's name when system-wide OOM happened
From: yuzhoujian The dump_header does not print the memcg's name when the system oom happened. So users cannot locate the certain container which contains the task that has been killed by the oom killer. System oom report will contain the memcg's name after this patch. Changes since v1: - replace adding mem_cgroup_print_oom_info with printing the memcg's name only. Signed-off-by: yuzhoujian --- mm/oom_kill.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 8ba6cb88cf58..b0abb5930232 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -433,6 +433,9 @@ static void dump_header(struct oom_control *oc, struct task_struct *p) if (is_memcg_oom(oc)) mem_cgroup_print_oom_info(oc->memcg, p); else { + pr_info("Task in "); + pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id)); + pr_cont(" killed as a result of limit of "); show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask); if (is_dump_unreclaim_slabs()) dump_unreclaimable_slab(); -- 2.14.1
[PATCH -mm] mm, huge page: Copy to access sub-page last when copy huge page
From: Huang Ying Huge page helps to reduce TLB miss rate, but it has higher cache footprint, sometimes this may cause some issue. For example, when copying huge page on x86_64 platform, the cache footprint is 4M. But on a Xeon E5 v3 2699 CPU, there are 18 cores, 36 threads, and only 45M LLC (last level cache). That is, in average, there are 2.5M LLC for each core and 1.25M LLC for each thread. If the cache pressure is heavy when copying the huge page, and we copy the huge page from the begin to the end, it is possible that the begin of huge page is evicted from the cache after we finishing copying the end of the huge page. And it is possible for the application to access the begin of the huge page after copying the huge page. To help the above situation, in this patch, when we copy a huge page, the order to copy sub-pages is changed. In quite some situation, we can get the address that the application will access after we copy the huge page, for example, in a page fault handler. Instead of copying the huge page from begin to end, we will copy the sub-pages farthest from the the sub-page to access firstly, and copy the sub-page to access last. This will make the sub-page to access most cache-hot and sub-pages around it more cache-hot too. If we cannot know the address the application will access, the begin of the huge page is assumed to be the the address the application will access. The patch is a generic optimization which should benefit quite some workloads, not for a specific use case. To demonstrate the performance benefit of the patch, we tested it with vm-scalability run on transparent huge page. With this patch, the throughput increases ~16.6% in vm-scalability anon-cow-seq test case with 36 processes on a 2 socket Xeon E5 v3 2699 system (36 cores, 72 threads). The test case set /sys/kernel/mm/transparent_hugepage/enabled to be always, mmap() a big anonymous memory area and populate it, then forked 36 child processes, each writes to the anonymous memory area from the begin to the end, so cause copy on write. For each child process, other child processes could be seen as other workloads which generate heavy cache pressure. At the same time, the IPC (instruction per cycle) increased from 0.63 to 0.78, and the time spent in user space is reduced ~7.2%. Signed-off-by: "Huang, Ying" Cc: Andi Kleen Cc: Jan Kara Cc: Michal Hocko Cc: Andrea Arcangeli Cc: "Kirill A. Shutemov" Cc: Matthew Wilcox Cc: Hugh Dickins Cc: Minchan Kim Cc: Shaohua Li Cc: Christopher Lameter Cc: Mike Kravetz --- include/linux/mm.h | 3 ++- mm/huge_memory.c | 3 ++- mm/memory.c| 43 +++ 3 files changed, 43 insertions(+), 6 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 3fa3b1356c34..a5fae31988e6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2732,7 +2732,8 @@ extern void clear_huge_page(struct page *page, unsigned long addr_hint, unsigned int pages_per_huge_page); extern void copy_user_huge_page(struct page *dst, struct page *src, - unsigned long addr, struct vm_area_struct *vma, + unsigned long addr_hint, + struct vm_area_struct *vma, unsigned int pages_per_huge_page); extern long copy_huge_page_from_user(struct page *dst_page, const void __user *usr_src, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 323acdd14e6e..7e720e92fcd6 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1331,7 +1331,8 @@ int do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) if (!page) clear_huge_page(new_page, vmf->address, HPAGE_PMD_NR); else - copy_user_huge_page(new_page, page, haddr, vma, HPAGE_PMD_NR); + copy_user_huge_page(new_page, page, vmf->address, + vma, HPAGE_PMD_NR); __SetPageUptodate(new_page); mmun_start = haddr; diff --git a/mm/memory.c b/mm/memory.c index 14578158ed20..f8868c94d6ab 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4654,10 +4654,12 @@ static void copy_user_gigantic_page(struct page *dst, struct page *src, } void copy_user_huge_page(struct page *dst, struct page *src, -unsigned long addr, struct vm_area_struct *vma, +unsigned long addr_hint, struct vm_area_struct *vma, unsigned int pages_per_huge_page) { - int i; + int i, n, base, l; + unsigned long addr = addr_hint & + ~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1); if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) { copy_user_gigantic_page(dst, src, addr, vma, @@ -4665,10 +4667,43 @@ void copy_user_huge_page(struct page *dst, struct page *src, return;
RE: [External] Re: [PATCH v1] include/linux/gfp.h: getting rid of GFP_ZONE_TABLE/BAD
> From: Matthew Wilcox [mailto:wi...@infradead.org] > Sent: Friday, May 11, 2018 9:26 PM > On Fri, May 11, 2018 at 03:24:34AM +, Huaisheng HS1 Ye wrote: > > > From: owner-linux...@kvack.org [mailto:owner-linux...@kvack.org] On > > > Behalf Of > Matthew > > > Wilcox > > > On Fri, May 11, 2018 at 12:10:25AM +0800, Huaisheng Ye wrote: > > > > -#define __GFP_DMA ((__force gfp_t)___GFP_DMA) > > > > -#define __GFP_HIGHMEM ((__force gfp_t)___GFP_HIGHMEM) > > > > -#define __GFP_DMA32((__force gfp_t)___GFP_DMA32) > > > > +#define __GFP_DMA ((__force gfp_t)OPT_ZONE_DMA ^ ZONE_NORMAL) > > > > +#define __GFP_HIGHMEM ((__force gfp_t)ZONE_MOVABLE ^ ZONE_NORMAL) > > > > +#define __GFP_DMA32((__force gfp_t)OPT_ZONE_DMA32 ^ ZONE_NORMAL) > > > > > > No, you've made gfp_zone even more complex than it already is. > > > If you can't use OPT_ZONE_HIGHMEM here, then this is a waste of time. > > > > > Dear Matthew, > > > > The reason why I don't use OPT_ZONE_HIGHMEM for __GFP_HIGHMEM > > directly is that, > for x86_64 platform there is no CONFIG_HIGHMEM, so OPT_ZONE_HIGHMEM shall > always be > equal to ZONE_NORMAL. > > Right. On 64-bit platforms, if somebody asks for HIGHMEM, they should > get NORMAL pages. > > > For gfp_zone it is impossible to distinguish the meaning of lowest 3 bits > > in flags. > How can gfp_zone to understand it comes from OPT_ZONE_HIGHMEM or ZONE_NORMAL? > > And the most pained thing is that, if __GFP_HIGHMEM with movable flag > > enabled, it > means that ZONE_MOVABLE shall be returned. > > That is different from ZONE_DMA, ZONE_DMA32 and ZONE_NORMAL. > > The point of this exercise is to actually encode the zone number in > the bottom bits of the GFP flags instead of something which has to be > interpreted into a zone number. When somebody sets __GFP_MOVABLE, they > should also be setting ZONE_MOVABLE: > > -#define __GFP_MOVABLE ((__force gfp_t)___GFP_MOVABLE) /* ZONE_MOVABLE > allowed */ > +#define __GFP_MOVABLE ((__force gfp_t)(___GFP_MOVABLE | (ZONE_MOVABLE ^ > ZONE_NORMAL))) > > One thing that does need to change is: > > -#define GFP_HIGHUSER_MOVABLE(GFP_HIGHUSER | __GFP_MOVABLE) > +#define GFP_HIGHUSER_MOVABLE(GFP_USER | __GFP_MOVABLE) > > otherwise we'll be OR'ing ZONE_MOVABLE and ZONE_HIGHMEM together. Dear Matthew, After thinking it over and over, I am afraid there is something needs to be discussed here. You know current X86_64 config file of kernel doesn't enable CONFIG_HIGHMEM, that is to say from this below, #define __GFP_HIGHMEM ((__force gfp_t)OPT_ZONE_HIGHMEM ^ ZONE_NORMAL) __GFP_HIGHMEM should equal to 0b, same as the value of ZONE_NORMAL gets encoded. If we define __GFP_MOVABLE like this, #define __GFP_MOVABLE ((__force gfp_t)(___GFP_MOVABLE | (ZONE_MOVABLE ^ ZONE_NORMAL))) Just like your introduced before, with this modification when somebody sets __GFP_MOVABLE, they should also be setting ZONE_MOVABLE. That brings us a problem, current mm (GFP_ZONE_TABLE) treats __GFP_MOVABLE as ZONE_NORMAL with movable policy, if without __GFP_HIGHMEM. The mm shall allocate a page or pages from migrate movable list of ZONE_NORMAL's freelist. So that conflicts with this modification. And I have checked current kernel, some of function directly set parameter gfp like this. For example, in fs/ext4/extents.c __read_extent_tree_block, bh = sb_getblk_gfp(inode->i_sb, pblk, __GFP_MOVABLE | GFP_NOFS); for these situations, I think only modify GFP_HIGHUSER_MOVABLE is not enough. I am preparing a workaround to solve this in the V2 patch. Later I will upload it to email loop. Sincerely, Huaisheng Ye > > I was thinking... > > Whether it is possible to use other judgement condition to decide > > OPT_ZONE_HIGHMEM > or ZONE_MOVABLE shall be returned from gfp_zone. > > > > Sincerely, > > Huaisheng Ye > > > > > > > > static inline enum zone_type gfp_zone(gfp_t flags) > > > > { > > > > enum zone_type z; > > > > - int bit = (__force int) (flags & GFP_ZONEMASK); > > > > + z = ((__force unsigned int)flags & ___GFP_ZONE_MASK) ^ > > > > ZONE_NORMAL; > > > > + > > > > + if (z > OPT_ZONE_HIGHMEM) > > > > + z = OPT_ZONE_HIGHMEM + > > > > + !!((__force unsigned int)flags & > > > > ___GFP_MOVABLE); > > > > > > > > - z = (GFP_ZONE_TABLE >> (bit * GFP_ZONES_SHIFT)) & > > > > -((1 << GFP_ZONES_SHIFT) - 1); > > > > - VM_BUG_ON((GFP_ZONE_BAD >> bit) & 1); > > > > + VM_BUG_ON(z > ZONE_MOVABLE); > > > > return z; > > > > } > >
Re: [RFC PATCH 04/10] devfreq: rk3399_dmc / rockchip: pm_domains: Register notify to DMC driver.
Hi, As I already commented[1], I think that it is not proper in order to pass the devfreq instance to power_domain driver by separate defined function (rockchip_pm_register_dmcfreq_notifier()). [1] https://patchwork.kernel.org/patch/10349571/ Maybe, you could check the 'OF graph[1]' or 'device connection[2]' for the device connection. Unfortunately, I'm not sure what is best solution for this issue. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/graph.txt [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f2d9b66d84f3ff5ea3aff111e6a403e04fa8bf37 On 2018년 05월 15일 06:16, Enric Balletbo i Serra wrote: > From: Lin Huang > > The DMC (Dynamic Memory Interface) controller does a SiP call to the > Trusted Firmware-A (TF-A) to change the DDR clock frequency. When this > happens the TF-A writes to the PMU bus idle request register > (PMU_BUS_IDLE_REQ) but at the same time it is possible that the Rockchip > power domain driver writes to the same register. So, add a notification > mechanism to ensure that the DMC and the PD driver does not access to this > register at the same time. > > Signed-off-by: Lin Huang > [rewrite commit message] > Signed-off-by: Enric Balletbo i Serra > --- > As I explained in the cover letter I have doubts regarding this patch > but I did not find another way to do it. So I will appreciate any > feedback on this. > > drivers/devfreq/rk3399_dmc.c | 7 ++ > drivers/soc/rockchip/pm_domains.c | 36 +++ > include/soc/rockchip/rk3399_dmc.h | 14 > 3 files changed, 57 insertions(+) > create mode 100644 include/soc/rockchip/rk3399_dmc.h > > diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c > index cc1bbca3fb15..2c4985a501cb 100644 > --- a/drivers/devfreq/rk3399_dmc.c > +++ b/drivers/devfreq/rk3399_dmc.c > @@ -28,6 +28,7 @@ > #include > #include > > +#include > #include > #include > > @@ -443,6 +444,12 @@ static int rk3399_dmcfreq_probe(struct platform_device > *pdev) > data->dev = dev; > platform_set_drvdata(pdev, data); > > + rockchip_pm_register_dmcfreq_notifier(data->devfreq); > + if (ret < 0) { > + dev_err(dev, "Failed to register dmcfreq notifier\n"); > + return ret; > + } > + > return 0; > } > > diff --git a/drivers/soc/rockchip/pm_domains.c > b/drivers/soc/rockchip/pm_domains.c > index 53efc386b1ad..b0e66f24b3e3 100644 > --- a/drivers/soc/rockchip/pm_domains.c > +++ b/drivers/soc/rockchip/pm_domains.c > @@ -8,6 +8,7 @@ > * published by the Free Software Foundation. > */ > > +#include > #include > #include > #include > @@ -76,9 +77,13 @@ struct rockchip_pmu { > const struct rockchip_pmu_info *info; > struct mutex mutex; /* mutex lock for pmu */ > struct genpd_onecell_data genpd_data; > + struct devfreq *devfreq; > + struct notifier_block dmc_nb; > struct generic_pm_domain *domains[]; > }; > > +static struct rockchip_pmu *dmc_pmu; > + > #define to_rockchip_pd(gpd) container_of(gpd, struct rockchip_pm_domain, > genpd) > > #define DOMAIN(pwr, status, req, idle, ack, wakeup) \ > @@ -601,6 +606,35 @@ static int rockchip_pm_add_subdomain(struct rockchip_pmu > *pmu, > return error; > } > > +static int rk3399_dmcfreq_notify(struct notifier_block *nb, > + unsigned long event, void *data) > +{ > + if (event == DEVFREQ_PRECHANGE) > + mutex_lock(&dmc_pmu->mutex); > + else if (event == DEVFREQ_POSTCHANGE) > + mutex_unlock(&dmc_pmu->mutex); > + > + return NOTIFY_OK; > +} > + > +int rockchip_pm_register_dmcfreq_notifier(struct devfreq *devfreq) > +{ > + int ret; > + > + if (!dmc_pmu) > + return -EPROBE_DEFER; > + > + dmc_pmu->devfreq = devfreq; > + dmc_pmu->dmc_nb.notifier_call = rk3399_dmcfreq_notify; > + ret = devm_devfreq_register_notifier(devfreq->dev.parent, > + dmc_pmu->devfreq, > + &dmc_pmu->dmc_nb, > + DEVFREQ_TRANSITION_NOTIFIER); > + > + return ret; > +} > +EXPORT_SYMBOL(rockchip_pm_register_dmcfreq_notifier); > + > static int rockchip_pm_domain_probe(struct platform_device *pdev) > { > struct device *dev = &pdev->dev; > @@ -694,6 +728,8 @@ static int rockchip_pm_domain_probe(struct > platform_device *pdev) > goto err_out; > } > > + dmc_pmu = pmu; > + > return 0; > > err_out: > diff --git a/include/soc/rockchip/rk3399_dmc.h > b/include/soc/rockchip/rk3399_dmc.h > new file mode 100644 > index ..031a62607f61 > --- /dev/null > +++ b/include/soc/rockchip/rk3399_dmc.h > @@ -0,0 +1,14 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (c) 2016-2018, Fuzhou Rockchip Electronics Co., Ltd > + * Autho
Re: [PATCH v4 0/3] AIO add per-command iopriority
On 5/17/18 2:38 PM, adam.manzana...@wdc.com wrote: > From: Adam Manzanares > > This is the per-I/O equivalent of the ioprio_set system call. > See the following link for performance implications on a SATA HDD: > https://lkml.org/lkml/2016/12/6/495 > > First patch factors ioprio_check_cap function out of ioprio_set system call to > also be used by the aio ioprio interface. > > Second patch converts kiocb ki_hint field to a u16 to avoid kiocb bloat. > > Third patch passes ioprio hint from aio iocb to kiocb and enables block_dev > usage of the per I/O ioprio feature. > > v2: merge patches > use IOCB_FLAG_IOPRIO > validate intended use with IOCB_IOPRIO > add linux-api and linux-block to cc > > v3: add ioprio_check_cap function > convert kiocb ki_hint to u16 > use ioprio_check_cap when adding ioprio to kiocb in aio.c > > v4: handle IOCB_IOPRIO in aio_prep_rw > note patch 3 depends on patch 1 in commit msg > > Adam Manzanares (3): > block: add ioprio_check_cap function > fs: Convert kiocb rw_hint from enum to u16 > fs: Add aio iopriority support for block_dev > > block/ioprio.c | 22 -- > fs/aio.c | 16 > fs/block_dev.c | 2 ++ > include/linux/fs.h | 17 +++-- > include/linux/ioprio.h | 2 ++ > include/uapi/linux/aio_abi.h | 1 + > 6 files changed, 52 insertions(+), 8 deletions(-) This looks fine to me now. I can pick up #1 for 4.18 - and 2+3 as well, unless someone else wants to take them. -- Jens Axboe
[PATCHv2][SMB3] Add kernel trace support
Patch updated with additional tracepoint locations and some formatting improvements. There are some obvious additional tracepoints that could be added, but this should be a reasonable group to start with. >From edc02d6f9dc24963d510c7ef59067428d3b082d3 Mon Sep 17 00:00:00 2001 From: Steve French Date: Thu, 17 May 2018 21:16:55 -0500 Subject: [PATCH] smb3: Add ftrace tracepoints for improved SMB3 debugging Although dmesg logs and wireshark network traces can be helpful, being able to dynamically enable/disable tracepoints (in this case via the kernel ftrace mechanism) can also be helpful in more quickly debugging problems, and more selectively tracing the events related to the bug report. This patch adds 12 ftrace tracepoints to cifs.ko for SMB3 events in some obvious locations. Subsequent patches will add more as needed. Example use: trace-cmd record -e cifs trace-cmd show Various trace events can be filtered. See: trace-cmd list | grep cifs for the current list of cifs tracepoints. Sample output (from mount and writing to a file): root@smf:/sys/kernel/debug/tracing/events/cifs# trace-cmd show mount.cifs-6633 [006] 7246.936461: smb3_cmd_done: pid=6633 tid=0x0 sid=0x0 cmd=0 mid=0 mount.cifs-6633 [006] 7246.936701: smb3_cmd_err: pid=6633 tid=0x0 sid=0x3d9cf8e5 cmd=1 mid=1 status=0xc016 rc=-5 mount.cifs-6633 [006] 7246.943055: smb3_cmd_done: pid=6633 tid=0x0 sid=0x3d9cf8e5 cmd=1 mid=2 mount.cifs-6633 [006] 7246.943298: smb3_cmd_done: pid=6633 tid=0xf9447636 sid=0x3d9cf8e5 cmd=3 mid=3 mount.cifs-6633 [006] 7246.943446: smb3_cmd_done: pid=6633 tid=0xf9447636 sid=0x3d9cf8e5 cmd=11 mid=4 mount.cifs-6633 [006] 7246.943659: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=3 mid=5 mount.cifs-6633 [006] 7246.943766: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=11 mid=6 mount.cifs-6633 [006] 7246.943937: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=5 mid=7 mount.cifs-6633 [006] 7246.944020: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=8 mount.cifs-6633 [006] 7246.944091: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=9 mount.cifs-6633 [006] 7246.944163: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=10 mount.cifs-6633 [006] 7246.944218: smb3_cmd_err: pid=6633 tid=0xf9447636 sid=0x3d9cf8e5 cmd=11 mid=11 status=0xc225 rc=-2 mount.cifs-6633 [006] 7246.944219: smb3_fsctl_err: xid=0 fid=0x tid=0xf9447636 sid=0x3d9cf8e5 class=0 type=393620 rc=-2 mount.cifs-6633 [007] 7246.944353: smb3_cmd_done: pid=6633 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=12 bash-2071 [000] 7256.903844: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=5 mid=13 bash-2071 [000] 7256.904172: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=16 mid=14 bash-2071 [000] 7256.904471: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=17 mid=15 bash-2071 [000] 7256.904950: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=5 mid=16 bash-2071 [000] 7256.905305: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=17 mid=17 bash-2071 [000] 7256.905688: smb3_cmd_done: pid=2071 tid=0xe1b781a sid=0x3d9cf8e5 cmd=6 mid=18 bash-2071 [000] 7256.905809: smb3_write_done: xid=0 fid=0xd628f511 tid=0xe1b781a sid=0x3d9cf8e5 offset=0x0 len=0x1b Signed-off-by: Steve French --- fs/cifs/Makefile | 7 +- fs/cifs/smb2maperror.c | 10 +- fs/cifs/smb2pdu.c | 56 +++- fs/cifs/trace.c| 18 +++ fs/cifs/trace.h| 298 + 5 files changed, 379 insertions(+), 10 deletions(-) create mode 100644 fs/cifs/trace.c create mode 100644 fs/cifs/trace.h diff --git a/fs/cifs/Makefile b/fs/cifs/Makefile index 7e4a1e2f0696..85817991ee68 100644 --- a/fs/cifs/Makefile +++ b/fs/cifs/Makefile @@ -1,11 +1,12 @@ # SPDX-License-Identifier: GPL-2.0 # -# Makefile for Linux CIFS VFS client +# Makefile for Linux CIFS/SMB2/SMB3 VFS client # +ccflags-y += -I$(src)# needed for trace events obj-$(CONFIG_CIFS) += cifs.o -cifs-y := cifsfs.o cifssmb.o cifs_debug.o connect.o dir.o file.o inode.o \ - link.o misc.o netmisc.o smbencrypt.o transport.o asn1.o \ +cifs-y := trace.o cifsfs.o cifssmb.o cifs_debug.o connect.o dir.o file.o \ + inode.o link.o misc.o netmisc.o smbencrypt.o transport.o asn1.o \ cifs_unicode.o nterr.o cifsencrypt.o \ readdir.o ioctl.o sess.o export.o smb1ops.o winucase.o \ smb2ops.o smb2maperror.o smb2transport.o \ diff --git a/fs/cifs/smb2maperror.c b/fs/cifs/smb2maperror.c index 3bfc9c990724..20185be4a93d 100644 --- a/fs/cifs/smb2maperror.c +++ b/fs/cifs/smb2maperror.c @@ -27,6 +27,7 @@ #include "smb2proto.h
[PATCH 5/5] MAINTAINERS: Add Actions Semi S900 pinctrl entries
Add S900 pinctrl entries under ARCH_ACTIONS Signed-off-by: Manivannan Sadhasivam --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 640dabc4c311..9e1a17c9b4a7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1125,10 +1125,12 @@ F: arch/arm/mach-actions/ F: arch/arm/boot/dts/owl-* F: arch/arm64/boot/dts/actions/ F: drivers/clocksource/owl-* +F: drivers/pinctrl/actions/* F: drivers/soc/actions/ F: include/dt-bindings/power/owl-* F: include/linux/soc/actions/ F: Documentation/devicetree/bindings/arm/actions.txt +F: Documentation/devicetree/bindings/pinctrl/actions,s900-pinctrl.txt F: Documentation/devicetree/bindings/power/actions,owl-sps.txt F: Documentation/devicetree/bindings/timer/actions,owl-timer.txt -- 2.14.1