Re: [PATCH 1/2] open: add close_range()
On Tue, May 21, 2019 at 08:20:09PM +0100, Al Viro wrote:
> On Tue, May 21, 2019 at 05:30:27PM +0100, David Howells wrote:
>
> > If we can live with close_from(int first) rather than close_range(), then
> > this can perhaps be done a lot more efficiently by:
> >
> > 	new = alloc_fdtable(first);
> > 	spin_lock(&files->file_lock);
> > 	old = files_fdtable(files);
> > 	copy_fds(new, old, 0, first - 1);
> > 	rcu_assign_pointer(files->fdt, new);
> > 	spin_unlock(&files->file_lock);
> > 	clear_fds(old, 0, first - 1);
> > 	close_fdt_from(old, first);
> > 	kfree_rcu(old);
>
> I really hate to think how that would interact with POSIX locks...

POSIX locks store current->files in fl_owner; David's resizing the
underlying files->fdt, just like growing from 64 to 256 fds.
Re: ext4 corruption on alpha with 4.20.0-09062-gd8372ba8ce28
On Tue, Feb 19, 2019 at 02:20:26PM +0100, Jan Kara wrote:
> Thanks for information. Yeah, that makes somewhat more sense. Can you ever
> see the failure if you disable CONFIG_TRANSPARENT_HUGEPAGE? Because your
> findings still seem to indicate that there's some problem with page
> migration and Alpha (added MM list to CC).

Could
https://lore.kernel.org/linux-mm/20190219123212.29838-1-lar...@axis.com/T/#u
be relevant?
Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
On Wed, Jan 02, 2019 at 03:57:58PM -0500, Michael S. Tsirkin wrote:
> @@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy atomicity"
>  for more information.
>
>
> +
> +
>  In summary:
>
>    (*) Control dependencies can order prior loads against later stores.

Was this hunk intentional?
Re: [PATCH 00/32] docs/vm: convert to ReST format
On Fri, Apr 13, 2018 at 01:55:51PM -0600, Jonathan Corbet wrote:
> > I believe that keeping the mm docs together will give better visibility of
> > what (little) mm documentation we have and will make the updates easier.
> > The documents that fit well into a certain topic could be linked there. For
> > instance:
>
> ...but this sounds like just the opposite...?
>
> I've had this conversation with folks in a number of subsystems.
> Everybody wants to keep their documentation together in one place - it's
> easier for the developers after all.  But for the readers I think it's
> objectively worse.  It perpetuates the mess that Documentation/ is, and
> forces readers to go digging through all kinds of inappropriate material
> in the hope of finding something that tells them what they need to know.
>
> So I would *really* like to split the documentation by audience, as has
> been done for a number of other kernel subsystems (and eventually all, I
> hope).
>
> I can go ahead and apply the RST conversion, that seems like a step in
> the right direction regardless.  But I sure hope we don't really have to
> keep it as an unorganized jumble of stuff...

I've started on Documentation/core-api/memory.rst which covers just
memory allocation.  So far it has the Overview and GFP flags sections
written and an outline for 'The slab allocator', 'The page allocator',
'The vmalloc allocator' and 'The page_frag allocator'.  And typing this
up, I realise we need a 'The percpu allocator'.  I'm thinking that this
is *not* the right document for the DMA memory allocators (although it
should link to that documentation).

I suspect the existing Documentation/vm/ should probably stay as an
unorganised jumble of stuff.  Developers mostly talking to other MM
developers.  Stuff that people outside the MM fraternity should know
about needs to be centrally documented.  By all means convert it to
ReST ... I don't much care, and it may make it easier to steal bits
or link to it from the organised documentation.
--
To unsubscribe from this list: send the line "unsubscribe linux-alpha" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.
On Fri, Mar 23, 2018 at 03:16:21PM -0400, Rich Felker wrote:
> > Huh, I thought libc was aware of this.  Also, I'd expect a libc-based
> > implementation to restrict itself to, eg, only loading libraries in
> > the bottom 1GB to avoid applications who want to map huge things from
> > running out of unfragmented address space.
>
> That seems like a rather arbitrary expectation and I'm not sure why
> you'd expect it to result in less fragmentation rather than more. For
> example if it started from 1GB and worked down, you'd immediately
> reduce the contiguous free space from ~3GB to ~2GB, and if it started
> from the bottom and worked up, brk would immediately become
> unavailable, increasing mmap pressure elsewhere.

By *not* limiting yourself to the bottom 1GB, you'll almost immediately
fragment the address space even worse.  Just looking at 'ls' as a
hopefully-good example of a typical app, it maps:

	linux-vdso.so.1 (0x7ffef5eef000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x7fb3657f5000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7fb36543b000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x7fb3651c9000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x7fb364fc5000)
	/lib64/ld-linux-x86-64.so.2 (0x7fb365c3f000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x7fb364da7000)

The VDSO wouldn't move, but look at the distribution of mapping 6 things
into a 3GB address space in random locations.  What are the odds you have
a contiguous 1GB chunk of address space?  If you restrict yourself to the
bottom 1GB before running out of room and falling back to a sequential
allocation, you'll prevent a lot of fragmentation.
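The "what are the odds" question above can be given a rough number under a deliberately idealised model. The sketch below assumes the six mappings are zero-sized points placed independently and uniformly in a 3GB range, and asks how likely it is that some gap of at least 1GB survives. It uses the classical inclusion-exclusion formula for the largest spacing between uniform points; the function names are hypothetical, and real mappings have non-zero size and page alignment, which this model ignores. Under those assumptions the answer for the numbers in the mail is 427/729, roughly 0.59, so about two processes in five would be left without a contiguous 1GB hole.

```c
#include <assert.h>

/* Binomial coefficient C(n, k), computed in floating point. */
static double choose(unsigned int n, unsigned int k)
{
	double r = 1.0;
	unsigned int i;

	for (i = 1; i <= k; i++)
		r = r * (double)(n - k + i) / (double)i;
	return r;
}

/* base raised to a small non-negative integer power. */
static double ipow(double base, unsigned int exp)
{
	double r = 1.0;

	while (exp--)
		r *= base;
	return r;
}

/*
 * Probability that n points placed independently and uniformly on a
 * segment of length total leave at least one gap of length want or more
 * (the gaps before the first and after the last point count too).
 *
 * Inclusion-exclusion over the n + 1 gaps:
 *   P = sum over k >= 1 of (-1)^(k+1) * C(n+1, k) * (1 - k*x)^n
 * where x = want / total and terms with k*x >= 1 vanish.
 */
static double prob_gap_at_least(unsigned int n, double total, double want)
{
	double x = want / total;
	double p = 0.0, sign = 1.0;
	unsigned int k;

	assert(n > 0 && want > 0.0 && want <= total);
	for (k = 1; k <= n + 1 && k * x < 1.0; k++) {
		p += sign * choose(n + 1, k) * ipow(1.0 - k * x, n);
		sign = -sign;
	}
	return p;
}
```

Calling prob_gap_at_least(6, 3.0, 1.0) evaluates the model for six mappings in a 3GB range; adding more randomly placed mappings drives the probability down quickly.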
Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.
On Fri, Mar 23, 2018 at 02:00:24PM -0400, Rich Felker wrote:
> On Fri, Mar 23, 2018 at 05:48:06AM -0700, Matthew Wilcox wrote:
> > On Thu, Mar 22, 2018 at 07:36:36PM +0300, Ilya Smith wrote:
> > > Current implementation doesn't randomize address returned by mmap.
> > > All the entropy ends with choosing mmap_base_addr at the process
> > > creation. After that mmap build very predictable layout of address
> > > space. It allows to bypass ASLR in many cases. This patch make
> > > randomization of address on any mmap call.
> >
> > Why should this be done in the kernel rather than libc?  libc is perfectly
> > capable of specifying random numbers in the first argument of mmap.
>
> Generally libc does not have a view of the current vm maps, and thus
> in passing "random numbers", they would have to be uniform across the
> whole vm space and thus non-uniform once the kernel rounds up to avoid
> existing mappings.

I'm aware that you're the musl author, but glibc somehow manages to
provide etext, edata and end, demonstrating that it does know where at
least some of the memory map lies.  Virtually everything after that is
brought into the address space via mmap, which at least glibc intercepts,
so it's entirely possible for a security-conscious libc to know where
other things are in the memory map.

Not to mention that what we're primarily talking about here are libraries
which are dynamically linked and are loaded by ld.so before calling
main(); not dlopen() or even regular user mmaps.

> Also this would impose requirements that libc be
> aware of the kernel's use of the virtual address space and what's
> available to userspace -- for example, on 32-bit archs whether 2GB,
> 3GB, or full 4GB (for 32-bit-user-on-64-bit-kernel) is available, and
> on 64-bit archs where fewer than the full 64 bits are actually valid
> in addresses, what the actual usable pointer size is. There is
> currently no clean way of conveying this information to userspace.

Huh, I thought libc was aware of this.  Also, I'd expect a libc-based
implementation to restrict itself to, eg, only loading libraries in
the bottom 1GB to avoid applications who want to map huge things from
running out of unfragmented address space.
[PATCH v3 4/7] alpha: Add support for memset16
From: Matthew Wilcox <mawil...@microsoft.com>

Alpha already had an optimised memset-16-bit-quantity assembler routine
called memsetw().  It has a slightly different calling convention from
memset16() in that it takes a byte count, not a count of words.  That's
the same convention used by ARM's __memset16(), so rename Alpha's routine
to match and add a memset16() wrapper around it.  Then convert Alpha's
scr_memsetw() to call memset16() instead of memsetw().

Signed-off-by: Matthew Wilcox <mawil...@microsoft.com>
---
 arch/alpha/include/asm/string.h | 15 ---
 arch/alpha/include/asm/vga.h    |  2 +-
 arch/alpha/lib/memset.S         | 10 +-
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/alpha/include/asm/string.h b/arch/alpha/include/asm/string.h
index c2911f591704..74c0a693b76b 100644
--- a/arch/alpha/include/asm/string.h
+++ b/arch/alpha/include/asm/string.h
@@ -65,13 +65,14 @@ extern void * memchr(const void *, int, size_t);
    aligned values.  The DEST and COUNT parameters must be even for
    correct operation.  */
 
-#define __HAVE_ARCH_MEMSETW
-extern void * __memsetw(void *dest, unsigned short, size_t count);
-
-#define memsetw(s, c, n)						 \
-(__builtin_constant_p(c)						 \
- ? __constant_c_memset((s),0x0001000100010001UL*(unsigned short)(c),(n)) \
- : __memsetw((s),(c),(n)))
+#define __HAVE_ARCH_MEMSET16
+extern void * __memset16(void *dest, unsigned short, size_t count);
+static inline void *memset16(uint16_t *p, uint16_t v, size_t n)
+{
+	if (__builtin_constant_p(v))
+		return __constant_c_memset(p, 0x0001000100010001UL * v, n * 2);
+	return __memset16(p, v, n * 2);
+}
 
 #endif /* __KERNEL__ */
 
diff --git a/arch/alpha/include/asm/vga.h b/arch/alpha/include/asm/vga.h
index c00106bac521..3c1c2b6128e7 100644
--- a/arch/alpha/include/asm/vga.h
+++ b/arch/alpha/include/asm/vga.h
@@ -34,7 +34,7 @@ static inline void scr_memsetw(u16 *s, u16 c, unsigned int count)
 	if (__is_ioaddr(s))
 		memsetw_io((u16 __iomem *) s, c, count);
 	else
-		memsetw(s, c, count);
+		memset16(s, c, count / 2);
 }
 
 /* Do not trust that the usage will be correct; analyze the arguments.  */
diff --git a/arch/alpha/lib/memset.S b/arch/alpha/lib/memset.S
index 89a26f5e89de..f824969e9e77 100644
--- a/arch/alpha/lib/memset.S
+++ b/arch/alpha/lib/memset.S
@@ -20,7 +20,7 @@
 	.globl memset
 	.globl __memset
 	.globl ___memset
-	.globl __memsetw
+	.globl __memset16
 	.globl __constant_c_memset
 
 	.ent ___memset
@@ -110,8 +110,8 @@ EXPORT_SYMBOL(___memset)
 EXPORT_SYMBOL(__constant_c_memset)
 
 	.align 5
-	.ent __memsetw
-__memsetw:
+	.ent __memset16
+__memset16:
 	.prologue 0
 
 	inswl $17,0,$1		/* E0 */
@@ -123,8 +123,8 @@ __memsetw:
 	or $1,$4,$17		/* E0 */
 	br __constant_c_memset	/* .. E1 */
 
-	.end __memsetw
-EXPORT_SYMBOL(__memsetw)
+	.end __memset16
+EXPORT_SYMBOL(__memset16)
 
 memset = ___memset
 __memset = ___memset
-- 
2.11.0
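Two details of the patch above may be easier to see in isolation: the multiplication by 0x0001000100010001, which copies a 16-bit value into every 16-bit lane of a 64-bit word, and the conversion between an element count and a byte count. What follows is a minimal userspace sketch, not the Alpha code; replicate16(), fill16_bytes() and fill16() are hypothetical names standing in for the constant trick, for a byte-counted routine like __memset16(), and for a memset16()-style wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Replicate a 16-bit value into all four 16-bit lanes of a 64-bit word.
 * Multiplying by 0x0001000100010001 sums v shifted left by 0, 16, 32
 * and 48 bits; the shifted copies never overlap, so nothing carries.
 */
static uint64_t replicate16(uint16_t v)
{
	return UINT64_C(0x0001000100010001) * v;
}

/* Hypothetical stand-in for a routine that takes a BYTE count. */
static void *fill16_bytes(void *dest, uint16_t v, size_t nbytes)
{
	uint16_t *p = dest;
	size_t i;

	assert(nbytes % 2 == 0);	/* the byte count must be even */
	for (i = 0; i < nbytes / 2; i++)
		p[i] = v;
	return dest;
}

/* memset16()-style wrapper: takes an ELEMENT count and doubles it. */
static void *fill16(uint16_t *p, uint16_t v, size_t n)
{
	return fill16_bytes(p, v, n * 2);
}
```

A caller that already holds a byte count, as scr_memsetw() does, has to halve it before calling the element-counted wrapper, which is why the vga.h hunk passes count / 2.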
[PATCH v3 6/7] sym53c8xx_2: Convert to use memset32
From: Matthew Wilcox <mawil...@microsoft.com>

memset32() can be used to initialise these three arrays.  Minor code
footprint reduction.

Signed-off-by: Matthew Wilcox <mawil...@microsoft.com>
---
 drivers/scsi/sym53c8xx_2/sym_hipd.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.c b/drivers/scsi/sym53c8xx_2/sym_hipd.c
index 6b349e301869..b886b10e3499 100644
--- a/drivers/scsi/sym53c8xx_2/sym_hipd.c
+++ b/drivers/scsi/sym53c8xx_2/sym_hipd.c
@@ -4985,13 +4985,10 @@ struct sym_lcb *sym_alloc_lcb (struct sym_hcb *np, u_char tn, u_char ln)
 	 *  Compute the bus address of this table.
 	 */
 	if (ln && !tp->luntbl) {
-		int i;
-
 		tp->luntbl = sym_calloc_dma(256, "LUNTBL");
 		if (!tp->luntbl)
 			goto fail;
-		for (i = 0 ; i < 64 ; i++)
-			tp->luntbl[i] = cpu_to_scr(vtobus(&np->badlun_sa));
+		memset32(tp->luntbl, cpu_to_scr(vtobus(&np->badlun_sa)), 64);
 		tp->head.luntbl_sa = cpu_to_scr(vtobus(tp->luntbl));
 	}
 
@@ -5077,8 +5074,7 @@ static void sym_alloc_lcb_tags (struct sym_hcb *np, u_char tn, u_char ln)
 	/*
 	 *  Initialize the task table with invalid entries.
 	 */
-	for (i = 0 ; i < SYM_CONF_MAX_TASK ; i++)
-		lp->itlq_tbl[i] = cpu_to_scr(np->notask_ba);
+	memset32(lp->itlq_tbl, cpu_to_scr(np->notask_ba), SYM_CONF_MAX_TASK);
 
 	/*
 	 *  Fill up the tag buffer with tag numbers.
@@ -5764,8 +5760,7 @@ int sym_hcb_attach(struct Scsi_Host *shost, struct sym_fw *fw, struct sym_nvram
 		goto attach_failed;
 
 	np->badlun_sa = cpu_to_scr(SCRIPTB_BA(np, resel_bad_lun));
-	for (i = 0 ; i < 64 ; i++)	/* 64 luns/target, no less */
-		np->badluntbl[i] = cpu_to_scr(vtobus(&np->badlun_sa));
+	memset32(np->badluntbl, cpu_to_scr(vtobus(&np->badlun_sa)), 64);
 
 	/*
 	 *  Prepare the bus address array that contains the bus
-- 
2.11.0
[PATCH v3 7/7] vga: Optimise console scrolling
From: Matthew Wilcox <mawil...@microsoft.com>

Where possible, call memset16(), memmove() or memcpy() instead of using
open-coded loops.  If an architecture doesn't define VT_BUF_HAVE_RW,
we can do that from the generic code.  For the architectures which do
have special RW routines, usually we can do the special thing (pointer
test or byteswap) once (and then use a mem* call) instead of each time
around a loop.  Alpha is the only architecture missing a scr_memmovew()
definition (because it's non-trivial to write).

I don't like the calling convention that uses a byte count instead of
a count of u16s, but it's a little late to change that.

Reduces code size of fbcon.o by almost 400 bytes on my laptop build.

Signed-off-by: Matthew Wilcox <mawil...@microsoft.com>
---
 arch/mips/include/asm/vga.h    |  6 ++
 arch/powerpc/include/asm/vga.h |  8
 arch/sparc/include/asm/vga.h   | 24
 include/linux/vt_buffer.h      | 12
 4 files changed, 50 insertions(+)

diff --git a/arch/mips/include/asm/vga.h b/arch/mips/include/asm/vga.h
index f82c83749a08..7510f406e1e1 100644
--- a/arch/mips/include/asm/vga.h
+++ b/arch/mips/include/asm/vga.h
@@ -40,9 +40,15 @@ static inline u16 scr_readw(volatile const u16 *addr)
 	return le16_to_cpu(*addr);
 }
 
+static inline void scr_memsetw(u16 *s, u16 v, unsigned int count)
+{
+	memset16(s, cpu_to_le16(v), count / 2);
+}
+
 #define scr_memcpyw(d, s, c) memcpy(d, s, c)
 #define scr_memmovew(d, s, c) memmove(d, s, c)
 #define VT_BUF_HAVE_MEMCPYW
 #define VT_BUF_HAVE_MEMMOVEW
+#define VT_BUF_HAVE_MEMSETW
 
 #endif /* _ASM_VGA_H */
diff --git a/arch/powerpc/include/asm/vga.h b/arch/powerpc/include/asm/vga.h
index ab3acd2f2786..7a7b541b7493 100644
--- a/arch/powerpc/include/asm/vga.h
+++ b/arch/powerpc/include/asm/vga.h
@@ -33,8 +33,16 @@ static inline u16 scr_readw(volatile const u16 *addr)
 	return le16_to_cpu(*addr);
 }
 
+#define VT_BUF_HAVE_MEMSETW
+static inline void scr_memsetw(u16 *s, u16 v, unsigned int n)
+{
+	memset16(s, cpu_to_le16(v), n / 2);
+}
+
 #define VT_BUF_HAVE_MEMCPYW
+#define VT_BUF_HAVE_MEMMOVEW
 #define scr_memcpyw	memcpy
+#define scr_memmovew	memmove
 
 #endif /* !CONFIG_VGA_CONSOLE && !CONFIG_MDA_CONSOLE */
 
diff --git a/arch/sparc/include/asm/vga.h b/arch/sparc/include/asm/vga.h
index ec0e9967d93d..1fab92b110d9 100644
--- a/arch/sparc/include/asm/vga.h
+++ b/arch/sparc/include/asm/vga.h
@@ -11,6 +11,9 @@
 #include
 
 #define VT_BUF_HAVE_RW
+#define VT_BUF_HAVE_MEMSETW
+#define VT_BUF_HAVE_MEMCPYW
+#define VT_BUF_HAVE_MEMMOVEW
 
 #undef scr_writew
 #undef scr_readw
@@ -29,6 +32,27 @@ static inline u16 scr_readw(const u16 *addr)
 	return *addr;
 }
 
+static inline void scr_memsetw(u16 *p, u16 v, unsigned int n)
+{
+	BUG_ON((long) p >= 0);
+
+	memset16(p, cpu_to_le16(v), n / 2);
+}
+
+static inline void scr_memcpyw(u16 *d, u16 *s, unsigned int n)
+{
+	BUG_ON((long) d >= 0);
+
+	memcpy(d, s, n);
+}
+
+static inline void scr_memmovew(u16 *d, u16 *s, unsigned int n)
+{
+	BUG_ON((long) d >= 0);
+
+	memmove(d, s, n);
+}
+
 #define VGA_MAP_MEM(x,s) (x)
 
 #endif
diff --git a/include/linux/vt_buffer.h b/include/linux/vt_buffer.h
index f38c10ba3ff5..31b92fcd8f03 100644
--- a/include/linux/vt_buffer.h
+++ b/include/linux/vt_buffer.h
@@ -26,24 +26,33 @@
 #ifndef VT_BUF_HAVE_MEMSETW
 static inline void scr_memsetw(u16 *s, u16 c, unsigned int count)
 {
+#ifdef VT_BUF_HAVE_RW
 	count /= 2;
 	while (count--)
 		scr_writew(c, s++);
+#else
+	memset16(s, c, count / 2);
+#endif
 }
 #endif
 
 #ifndef VT_BUF_HAVE_MEMCPYW
 static inline void scr_memcpyw(u16 *d, const u16 *s, unsigned int count)
 {
+#ifdef VT_BUF_HAVE_RW
 	count /= 2;
 	while (count--)
 		scr_writew(scr_readw(s++), d++);
+#else
+	memcpy(d, s, count);
+#endif
 }
 #endif
 
 #ifndef VT_BUF_HAVE_MEMMOVEW
 static inline void scr_memmovew(u16 *d, const u16 *s, unsigned int count)
 {
+#ifdef VT_BUF_HAVE_RW
 	if (d < s)
 		scr_memcpyw(d, s, count);
 	else {
@@ -53,6 +62,9 @@ static inline void scr_memmovew(u16 *d, const u16 *s, unsigned int count)
 		while (count--)
 			scr_writew(scr_readw(--s), --d);
 	}
+#else
+	memmove(d, s, count);
+#endif
 }
 #endif
 
-- 
2.11.0
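The "do the special thing once" idea from the commit message can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: to_le16() is a hypothetical replacement for cpu_to_le16(), fill16() stands in for memset16(), and screen_fill() mimics the scr_memsetw() convention of taking a byte count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a host-order u16 to little-endian. */
static uint16_t to_le16(uint16_t v)
{
	const union { uint16_t u; uint8_t b[2]; } probe = { .u = 1 };

	if (probe.b[0] == 1)		/* host is already little-endian */
		return v;
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Stand-in for memset16(): n is a count of 16-bit elements. */
static void fill16(uint16_t *p, uint16_t v, size_t n)
{
	while (n--)
		*p++ = v;
}

/*
 * scr_memsetw()-style wrapper: count is in BYTES.  The byteswap is
 * done once, before the fill, instead of once per stored element.
 */
static void screen_fill(uint16_t *s, uint16_t v, unsigned int count)
{
	fill16(s, to_le16(v), count / 2);
}
```

Whatever the host byte order, the buffer ends up holding the little-endian encoding of the value, which is what the per-element scr_writew() loop would have produced.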
[PATCH v3 0/7] Add memsetN functions
From: Matthew Wilcox <mawil...@microsoft.com>

zram was recently enhanced to support compressing pages with a repeating
pattern up to the size of an unsigned long.  As part of the discussion,
we noted it would be nice if architectures had optimised routines to fill
regions of memory with patterns larger than those contained in a single
byte.  Our suspicions were right; the x86 version offers approximately a
7% performance improvement over the C implementation.

The generic memfill() function is part of Lars Wirzenius' publib, but
it doesn't offer the most convenient interface.  I chose to add five
more-specific functions as part of this patchset -- memset16(), memset32(),
memset64(), memset_l() (long) and memset_p() (pointer).

It would be nice to have some more architectures implement optimised
memsetN calls.  It would also be nice to find more places in the kernel
which could benefit from calling these functions.  Maybe a coccinelle
script could be written to find such places?  We're looking for loops
over an array where the value being stored into the array does not depend
on the iteration variable.

Since v1 of the patchset, I stumbled on Alpha's memsetw() which caused
me to add memset16() to complete the set.  I removed the
'__HAVE_ARCH_MEMSET_PLUS' preprocessor symbol in favour of separate
MEMSET16 MEMSET32 and MEMSET64 symbols.  I also reviewed the scr_mem*w()
usages across the different architectures and implemented some obvious
missing optimisations.  Alpha is still missing scr_memmovew() as it would
be non-trivial to write.

Russell's review on patch 2 only applies to the memset32/memset64
implementation.  The memset16 is unreviewed (and, indeed, untested)
to date.

Matthew Wilcox (7):
  Add multibyte memset functions
  ARM: Implement memset16, memset32 & memset64
  x86: Implement memset16, memset32 & memset64
  alpha: Add support for memset16
  zram: Convert to using memset_l
  sym53c8xx_2: Convert to use memset32
  vga: Optimise console scrolling

 arch/alpha/include/asm/string.h     | 15
 arch/alpha/include/asm/vga.h        |  2 +-
 arch/alpha/lib/memset.S             | 10 +++---
 arch/arm/include/asm/string.h       | 21
 arch/arm/kernel/armksyms.c          |  3 ++
 arch/arm/lib/memset.S               | 44 +++-
 arch/mips/include/asm/vga.h         |  6
 arch/powerpc/include/asm/vga.h      |  8 +
 arch/sparc/include/asm/vga.h        | 24 +
 arch/x86/include/asm/string_32.h    | 24 +
 arch/x86/include/asm/string_64.h    | 36
 drivers/block/zram/zram_drv.c       | 15 ++--
 drivers/scsi/sym53c8xx_2/sym_hipd.c | 11 ++
 include/linux/string.h              | 30
 include/linux/vt_buffer.h           | 12 +++
 lib/string.c                        | 68 +
 16 files changed, 287 insertions(+), 42 deletions(-)

-- 
2.11.0
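The kind of loop the cover letter hopes to find, one that stores a value that does not depend on the iteration variable, looks like the "before" function in this userspace sketch. The names fill32(), init_table_loop(), init_table_fill() and TABLE_ENTRIES are hypothetical; fill32() is a plain C stand-in with the same element-count convention as memset32().

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_ENTRIES 64	/* hypothetical table size */

/* Stand-in for memset32(): count is a number of 32-bit elements. */
static void *fill32(uint32_t *s, uint32_t v, size_t count)
{
	uint32_t *xs = s;

	while (count--)
		*xs++ = v;
	return s;
}

/* Before: the stored value never depends on i. */
static void init_table_loop(uint32_t *tbl, uint32_t invalid)
{
	size_t i;

	for (i = 0; i < TABLE_ENTRIES; i++)
		tbl[i] = invalid;
}

/* After: the same effect expressed as a single fill call. */
static void init_table_fill(uint32_t *tbl, uint32_t invalid)
{
	fill32(tbl, invalid, TABLE_ENTRIES);
}
```

A loop whose right-hand side mentions the index (tbl[i] = base + i, for example) does not qualify and has to stay a loop.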
[PATCH v3 5/7] zram: Convert to using memset_l
From: Matthew Wilcox <mawil...@microsoft.com>

zram was the motivation for creating memset_l().  Minchan Kim sees a 7%
performance improvement on x86 with 100MB of non-zero deduplicatable
data:

        perf stat -r 10 dd if=/dev/zram0 of=/dev/null

vanilla:  0.232050465 seconds time elapsed ( +-  0.51% )
memset_l: 0.217219387 seconds time elapsed ( +-  0.07% )

Signed-off-by: Matthew Wilcox <mawil...@microsoft.com>
Tested-by: Minchan Kim <minc...@kernel.org>
---
 drivers/block/zram/zram_drv.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index e27d89a36c34..25dcad309695 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -157,20 +157,11 @@ static inline void update_used_max(struct zram *zram,
 	} while (old_max != cur_max);
 }
 
-static inline void zram_fill_page(char *ptr, unsigned long len,
+static inline void zram_fill_page(void *ptr, unsigned long len,
 					unsigned long value)
 {
-	int i;
-	unsigned long *page = (unsigned long *)ptr;
-
 	WARN_ON_ONCE(!IS_ALIGNED(len, sizeof(unsigned long)));
-
-	if (likely(value == 0)) {
-		memset(ptr, 0, len);
-	} else {
-		for (i = 0; i < len / sizeof(*page); i++)
-			page[i] = value;
-	}
+	memset_l(ptr, value, len / sizeof(unsigned long));
 }
 
 static bool page_same_filled(void *ptr, unsigned long *element)
@@ -193,7 +184,7 @@ static bool page_same_filled(void *ptr, unsigned long *element)
 static void handle_same_page(struct bio_vec *bvec, unsigned long element)
 {
 	struct page *page = bvec->bv_page;
-	void *user_mem;
+	char *user_mem;
 
 	user_mem = kmap_atomic(page);
 	zram_fill_page(user_mem + bvec->bv_offset, bvec->bv_len, element);
-- 
2.11.0
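The shape of the new zram_fill_page() can be sketched in userspace as below. This assumes a plain C fill in place of memset_l() and an assert() in place of WARN_ON_ONCE(); fill_long() and fill_page() are hypothetical names. The point it illustrates is that the caller's length is in bytes while the fill routine wants a count of unsigned longs.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for memset_l(): n is a count of unsigned longs. */
static void fill_long(unsigned long *p, unsigned long v, size_t n)
{
	while (n--)
		*p++ = v;
}

/*
 * Fill len BYTES at ptr with a repeating unsigned long pattern.
 * ptr must be suitably aligned for unsigned long and len must be a
 * multiple of sizeof(unsigned long).
 */
static void fill_page(void *ptr, size_t len, unsigned long value)
{
	assert(len % sizeof(unsigned long) == 0);
	fill_long(ptr, value, len / sizeof(unsigned long));
}
```

The separate memset(ptr, 0, len) fast path for a zero value disappears in the patch because a single fill call handles every pattern, zero included.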
[PATCH v3 3/7] x86: Implement memset16, memset32 & memset64
From: Matthew Wilcox <mawil...@microsoft.com>

These are single instructions on x86.  There's no 64-bit instruction
for x86-32, but we don't yet have any user for memset64() on 32-bit
architectures, so don't bother to implement it.

Signed-off-by: Matthew Wilcox <mawil...@microsoft.com>
---
 arch/x86/include/asm/string_32.h | 24
 arch/x86/include/asm/string_64.h | 36
 2 files changed, 60 insertions(+)

diff --git a/arch/x86/include/asm/string_32.h b/arch/x86/include/asm/string_32.h
index 3d3e8353ee5c..84da91fe13ac 100644
--- a/arch/x86/include/asm/string_32.h
+++ b/arch/x86/include/asm/string_32.h
@@ -331,6 +331,30 @@ void *__constant_c_and_count_memset(void *s, unsigned long pattern,
 	 : __memset((s), (c), (count)))
 #endif
 
+#define __HAVE_ARCH_MEMSET16
+static inline void *memset16(uint16_t *s, uint16_t v, size_t n)
+{
+	int d0, d1;
+	asm volatile("rep\n\t"
+		     "stosw"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
+#define __HAVE_ARCH_MEMSET32
+static inline void *memset32(uint32_t *s, uint32_t v, size_t n)
+{
+	int d0, d1;
+	asm volatile("rep\n\t"
+		     "stosl"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
 /*
  * find the first occurrence of byte 'c', or 1 past the area if none
  */
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index a164862d77e3..71c5e860c7da 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -56,6 +56,42 @@ extern void *__memcpy(void *to, const void *from, size_t len);
 void *memset(void *s, int c, size_t n);
 void *__memset(void *s, int c, size_t n);
 
+#define __HAVE_ARCH_MEMSET16
+static inline void *memset16(uint16_t *s, uint16_t v, size_t n)
+{
+	long d0, d1;
+	asm volatile("rep\n\t"
+		     "stosw"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
+#define __HAVE_ARCH_MEMSET32
+static inline void *memset32(uint32_t *s, uint32_t v, size_t n)
+{
+	long d0, d1;
+	asm volatile("rep\n\t"
+		     "stosl"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
+#define __HAVE_ARCH_MEMSET64
+static inline void *memset64(uint64_t *s, uint64_t v, size_t n)
+{
+	long d0, d1;
+	asm volatile("rep\n\t"
+		     "stosq"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
 #define __HAVE_ARCH_MEMMOVE
 void *memmove(void *dest, const void *src, size_t count);
 void *__memmove(void *dest, const void *src, size_t count);
-- 
2.11.0
[PATCH v3 1/7] Add multibyte memset functions
From: Matthew Wilcox <mawil...@microsoft.com>

memset16(), memset32() and memset64() are like memset(), but allow the
caller to fill the destination with a multibyte pattern.  memset_l()
and memset_p() allow the caller to use unsigned long and pointer values
respectively.  memset64() is currently only available on 64-bit
architectures.

Signed-off-by: Matthew Wilcox <mawil...@microsoft.com>
---
 include/linux/string.h | 30 ++
 lib/string.c           | 68 ++
 2 files changed, 98 insertions(+)

diff --git a/include/linux/string.h b/include/linux/string.h
index 26b6f6a66f83..b376875b650c 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -99,6 +99,36 @@ extern __kernel_size_t strcspn(const char *,const char *);
 #ifndef __HAVE_ARCH_MEMSET
 extern void * memset(void *,int,__kernel_size_t);
 #endif
+
+#ifndef __HAVE_ARCH_MEMSET16
+extern void *memset16(uint16_t *, uint16_t, __kernel_size_t);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET32
+extern void *memset32(uint32_t *, uint32_t, __kernel_size_t);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET64
+extern void *memset64(uint64_t *, uint64_t, __kernel_size_t);
+#endif
+
+static inline void *memset_l(unsigned long *p, unsigned long v,
+		__kernel_size_t n)
+{
+	if (BITS_PER_LONG == 32)
+		return memset32((uint32_t *)p, v, n);
+	else
+		return memset64((uint64_t *)p, v, n);
+}
+
+static inline void *memset_p(void **p, void *v, __kernel_size_t n)
+{
+	if (BITS_PER_LONG == 32)
+		return memset32((uint32_t *)p, (uintptr_t)v, n);
+	else
+		return memset64((uint64_t *)p, (uintptr_t)v, n);
+}
+
 #ifndef __HAVE_ARCH_MEMCPY
 extern void * memcpy(void *,const void *,__kernel_size_t);
 #endif
diff --git a/lib/string.c b/lib/string.c
index ed83562a53ae..f18ba402e503 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -697,6 +697,74 @@ void memzero_explicit(void *s, size_t count)
 }
 EXPORT_SYMBOL(memzero_explicit);
 
+#ifndef __HAVE_ARCH_MEMSET16
+/**
+ * memset16() - Fill a memory area with a uint16_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint16_t instead
+ * of a byte.  Remember that @count is the number of uint16_ts to
+ * store, not the number of bytes.
+ */
+void *memset16(uint16_t *s, uint16_t v, size_t count)
+{
+	uint16_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset16);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET32
+/**
+ * memset32() - Fill a memory area with a uint32_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint32_t instead
+ * of a byte.  Remember that @count is the number of uint32_ts to
+ * store, not the number of bytes.
+ */
+void *memset32(uint32_t *s, uint32_t v, size_t count)
+{
+	uint32_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset32);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET64
+#if BITS_PER_LONG > 32
+/**
+ * memset64() - Fill a memory area with a uint64_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint64_t instead
+ * of a byte.  Remember that @count is the number of uint64_ts to
+ * store, not the number of bytes.
+ */
+void *memset64(uint64_t *s, uint64_t v, size_t count)
+{
+	uint64_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset64);
+#endif
+#endif
+
 #ifndef __HAVE_ARCH_MEMCPY
 /**
  * memcpy - Copy one area of memory to another
-- 
2.11.0