This series is simply rebased on v5.2rc1 and I added the Reviewed-By
from Palmer for the first patch, thanks for that.

This series introduces hugetlbfs support for both riscv 32/64. Riscv32          
 
is architecturally limited to huge pages of size 4MB whereas riscv64 has        
 
2MB/1G huge pages support. Transparent huge page support is not                 
 
implemented here, I will submit another series later.                           
 
                                                                                
 
As stated in "The RISC-V Instruction Set Manual, Volume II: Privileged          
 
Architecture", riscv page table entries are marked as non-leaf entries          
 
as soon as at least one of the R/W/X bit set:                                   
 
                                                                                
 
- pmd_huge/pud_huge check if one of those bits are set,                         
 
- pte_mkhuge simply returns the same pte value and does not set any of          
 
  the R/W/X bits                                                                
 
                                                                                
 
This series was validated using libhugetlbfs testsuite ported to riscv64        
 
without linker script support.                                                  
 
(https://github.com/AlexGhiti/libhugetlbfs.git, branch dev/alex/riscv).         
 
                                                                                
 
- libhugetlbfs testsuite on riscv64/2M:                                         
 
  - brk_near_huge triggers an assert in malloc.c, does not on x86.              
 
                                                                                
 
- libhugetlbfs testsuite on riscv64/1G:                                         
 
  - brk_near_huge triggers an assert in malloc.c, does not on x86.              
 
  - mmap-gettest, mmap-cow: testsuite passes the number of default free         
 
    pages as parameters and then fails for 1G which is not the default.         
 
    Otherwise succeeds when given the right number of pages.                    
 
  - map_high_truncate_2 fails on x86 too: 0x60000000 is not 1G aligned          
 
    and fails at line 694 of fs/hugetlbfs/inode.c.                              
 
  - heapshrink on 1G fails on x86 too, not investigated.                        
 
  - counters.sh on 1G fails on x86 too: alloc_surplus_huge_page returns         
 
    NULL in case of gigantic pages.                                             
 
  - icache-hygiene succeeds after patch #3 of this series which lowers          
 
    the base address of mmap.                                                   
 
  - fallocate_stress.sh on 1G never ends, on x86 too, not investigated.         
 
                                                                                
 
- libhugetlbfs testsuite on riscv32/4M: kernel build passes, lacks              
 
  libhugetlbfs support for 32bits.                                              
 
                                                                                
 
* Output for riscv64 2M and 1G libhugetbfs testsuite:                           
 
                                                                                
 
zero_filesize_segment (2M: 64):                                                 
 
zero_filesize_segment (1024M: 64):                                              
 
test_root (2M: 64):     PASS                                                    
 
test_root (1024M: 64):  PASS                                                    
 
meminfo_nohuge (2M: 64):        PASS                                            
 
meminfo_nohuge (1024M: 64):     PASS                                            
 
gethugepagesize (2M: 64):       PASS                                            
 
gethugepagesize (1024M: 64):    PASS                                            
 
gethugepagesizes (2M: 64):      PASS                                            
 
gethugepagesizes (1024M: 64):   PASS                                            
 
HUGETLB_VERBOSE=1 empty_mounts (2M: 64):        PASS                            
 
HUGETLB_VERBOSE=1 empty_mounts (1024M: 64):     PASS                            
 
HUGETLB_VERBOSE=1 large_mounts (2M: 64):        PASS                            
 
HUGETLB_VERBOSE=1 large_mounts (1024M: 64):     PASS                            
 
find_path (2M: 64):     PASS                                                    
 
find_path (1024M: 64):  PASS                                                    
 
unlinked_fd (2M: 64):   PASS                                                    
 
unlinked_fd (1024M: 64):        PASS                                            
 
readback (2M: 64):      PASS                                                    
 
readback (1024M: 64):   PASS                                                    
 
truncate (2M: 64):      PASS                                                    
 
truncate (1024M: 64):   PASS                                                    
 
shared (2M: 64):        PASS                                                    
 
shared (1024M: 64):     PASS                                                    
 
mprotect (2M: 64):      PASS                                                    
 
mprotect (1024M: 64):   PASS                                                    
 
mlock (2M: 64): PASS                                                            
 
mlock (1024M: 64):      PASS                                                    
 
misalign (2M: 64):      PASS                                                    
 
misalign (1024M: 64):   PASS                                                    
 
fallocate_basic.sh (2M: 64):    PASS                                            
 
fallocate_basic.sh (1024M: 64): PASS                                            
 
fallocate_align.sh (2M: 64):    PASS                                            
 
fallocate_align.sh (1024M: 64): PASS                                            
 
ptrace-write-hugepage (2M: 64): PASS                                            
 
ptrace-write-hugepage (1024M: 64):      PASS                                    
 
icache-hygiene (2M: 64):        PASS                                            
 
icache-hygiene (1024M: 64):     PASS                                            
 
slbpacaflush (2M: 64):  PASS (inconclusive)                                     
 
slbpacaflush (1024M: 64):       PASS (inconclusive)                             
 
straddle_4GB_static (2M: 64):   PASS                                            
 
straddle_4GB_static (1024M: 64):        PASS                                    
 
huge_at_4GB_normal_below_static (2M: 64):       PASS                            
 
huge_at_4GB_normal_below_static (1024M: 64):    PASS                            
 
huge_below_4GB_normal_above_static (2M: 64):    PASS                            
 
huge_below_4GB_normal_above_static (1024M: 64): PASS                            
 
map_high_truncate_2 (2M: 64):   PASS                                            
 
map_high_truncate_2 (1024M: 64):        FAIL    ftruncate(): Invalid            
 
argument                                                                        
 
misaligned_offset (2M: 64):     PASS (inconclusive)                             
 
misaligned_offset (1024M: 64):  PASS (inconclusive)                             
 
truncate_above_4GB (2M: 64):    PASS                                            
 
truncate_above_4GB (1024M: 64): PASS                                            
 
brk_near_huge (2M: 64): brk_near_huge: malloc.c:2385: sysmalloc:                
 
Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned         
 
long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long)         
 
old_end & (pagesize - 1)) == 0)' failed.                                        
 
brk_near_huge (1024M: 64):      brk_near_huge: malloc.c:2385: sysmalloc:        
 
Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned         
 
long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long)         
 
old_end & (pagesize - 1)) == 0)' failed.                                        
 
task-size-overrun (2M: 64):     PASS                                            
 
task-size-overrun (1024M: 64):  PASS                                            
 
stack_grow_into_huge (2M: 64):   PASS                                           
 
stack_grow_into_huge (1024M: 64): PASS                                          
 
corrupt-by-cow-opt (2M: 64):    PASS                                            
 
corrupt-by-cow-opt (1024M: 64): PASS                                            
 
noresv-preserve-resv-page (2M: 64):     PASS                                    
 
noresv-preserve-resv-page (1024M: 64):  PASS                                    
 
noresv-regarded-as-resv (2M: 64):       PASS                                    
 
noresv-regarded-as-resv (1024M: 64):    PASS                                    
 
readahead_reserve.sh (2M: 64):  PASS                                            
 
readahead_reserve.sh (1024M: 64):       PASS                                    
 
madvise_reserve.sh (2M: 64):    PASS                                            
 
madvise_reserve.sh (1024M: 64): PASS                                            
 
fadvise_reserve.sh (2M: 64):    PASS                                            
 
fadvise_reserve.sh (1024M: 64): PASS                                            
 
mremap-expand-slice-collision.sh (2M: 64):      PASS                            
 
mremap-expand-slice-collision.sh (1024M: 64):   PASS                            
 
mremap-fixed-normal-near-huge.sh (2M: 64):      PASS                            
 
mremap-fixed-normal-near-huge.sh (1024M: 64):   PASS                            
 
mremap-fixed-huge-near-normal.sh (2M: 64):      PASS                            
 
mremap-fixed-huge-near-normal.sh (1024M: 64):   PASS                            
 
set shmmax limit to 67108864                                                    
 
shm-perms (2M: 64):     PASS                                                    
 
private (2M: 64):       PASS                                                    
 
private (1024M: 64):    PASS                                                    
 
fork-cow (2M: 64):      PASS                                                    
 
fork-cow (1024M: 64):   PASS                                                    
 
direct (2M: 64):        Bad configuration: Failed to open direct-IO             
 
file: Invalid argument                                                          
 
direct (1024M: 64):     Bad configuration: Failed to open direct-IO             
 
file: File exists                                                               
 
malloc (2M: 64):        PASS                                                    
 
malloc (1024M: 64):     PASS                                                    
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc (2M: 64):                
 
PASS                                                                            
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc (1024M: 64):             
 
PASS                                                                            
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:none                    
 
HUGETLB_MORECORE=yes malloc (2M: 64):      PASS                                 
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:none                    
 
HUGETLB_MORECORE=yes malloc (1024M: 64):PASS                                    
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:malloc                  
 
HUGETLB_MORECORE=yes malloc (2M: 64):    PASS                                   
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:malloc                  
 
HUGETLB_MORECORE=yes malloc (1024M: 64): PASS                                   
 
malloc_manysmall (2M: 64):      PASS                                            
 
malloc_manysmall (1024M: 64):   PASS                                            
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc_manysmall (2M:           
 
64):      PASS                                                                  
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc_manysmall (1024M:        
 
64):   PASS                                                                     
 
heapshrink (2M: 64):    PASS                                                    
 
heapshrink (1024M: 64): PASS                                                    
 
LD_PRELOAD=libheapshrink.so heapshrink (2M: 64):        PASS                    
 
LD_PRELOAD=libheapshrink.so heapshrink (1024M: 64):     PASS                    
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes heapshrink (2M: 64):            
 
PASS                                                                            
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes heapshrink (1024M: 64):         
 
PASS                                                                            
 
LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE=yes                
 
heapshrink (2M: 64):   PASS                                                     
 
LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE=yes                
 
heapshrink (1024M: 64):        PASS                                             
 
LD_PRELOAD=libheapshrink.so HUGETLB_MORECORE_SHRINK=yes                         
 
HUGETLB_MORECORE=yes heapshrink (2M: 64):       PASS (inconclusive)             
 
LD_PRELOAD=libheapshrink.so HUGETLB_MORECORE_SHRINK=yes                         
 
HUGETLB_MORECORE=yes heapshrink (1024M: 64):    PASS (inconclusive)             
 
LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE_SHRINK=yes         
 
HUGETLB_MORECORE=yes heapshrink (2M: 64):       PASS                            
 
LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE_SHRINK=yes         
 
HUGETLB_MORECORE=yes heapshrink (1024M: 64):    FAIL    Heap did not            
 
shrink                                                                          
 
HUGETLB_VERBOSE=1 HUGETLB_MORECORE=yes heap-overflow (2M: 64):  PASS            
 
HUGETLB_VERBOSE=1 HUGETLB_MORECORE=yes heap-overflow (1024M: 64):               
 
PASS                                                                            
 
HUGETLB_VERBOSE=0 linkhuge_nofd (2M: 64):                                       
 
HUGETLB_VERBOSE=0 linkhuge_nofd (1024M: 64):                                    
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_VERBOSE=0 linkhuge_nofd (2M: 64):            
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_VERBOSE=0 linkhuge_nofd (1024M: 64):         
 
linkhuge (2M: 64):                                                              
 
linkhuge (1024M: 64):                                                           
 
LD_PRELOAD=libhugetlbfs.so linkhuge (2M: 64):                                   
 
LD_PRELOAD=libhugetlbfs.so linkhuge (1024M: 64):                                
 
linkhuge_rw (2M: 64):                                                           
 
linkhuge_rw (1024M: 64):                                                        
 
HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):                                          
 
HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):                                       
 
HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):                                          
 
HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):                                       
 
HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):                                         
 
HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):                                      
 
HUGETLB_ELFMAP=no linkhuge_rw (2M: 64):                                         
 
HUGETLB_ELFMAP=no linkhuge_rw (1024M: 64):                                      
 
HUGETLB_ELFMAP=R HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):                  
 
HUGETLB_ELFMAP=R HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):               
 
HUGETLB_ELFMAP=W HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):                  
 
HUGETLB_ELFMAP=W HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):               
 
HUGETLB_ELFMAP=RW HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):                 
 
HUGETLB_ELFMAP=RW HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):              
 
HUGETLB_SHARE=0 HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):                          
 
HUGETLB_SHARE=0 HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):                       
 
HUGETLB_SHARE=1 HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):                          
 
HUGETLB_SHARE=1 HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):                       
 
HUGETLB_SHARE=0 HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):                          
 
HUGETLB_SHARE=0 HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):                       
 
HUGETLB_SHARE=1 HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):                          
 
HUGETLB_SHARE=1 HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):                       
 
HUGETLB_SHARE=0 HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):                         
 
HUGETLB_SHARE=0 HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):                      
 
HUGETLB_SHARE=1 HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):                         
 
HUGETLB_SHARE=1 HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):                      
 
chunk-overcommit (2M: 64):      PASS                                            
 
chunk-overcommit (1024M: 64):   PASS                                            
 
alloc-instantiate-race shared (2M: 64): PASS                                    
 
alloc-instantiate-race shared (1024M: 64):      PASS                            
 
alloc-instantiate-race private (2M: 64):        PASS                            
 
alloc-instantiate-race private (1024M: 64):     PASS                            
 
truncate_reserve_wraparound (2M: 64):   PASS                                    
 
truncate_reserve_wraparound (1024M: 64):        PASS                            
 
truncate_sigbus_versus_oom (2M: 64):    PASS                                    
 
truncate_sigbus_versus_oom (1024M: 64): PASS                                    
 
get_huge_pages (2M: 64):        PASS                                            
 
get_huge_pages (1024M: 64):     PASS                                            
 
shmoverride_linked (2M: 64):    PASS                                            
 
HUGETLB_SHM=yes shmoverride_linked (2M: 64):    PASS                            
 
shmoverride_linked_static (2M: 64):                                             
 
HUGETLB_SHM=yes shmoverride_linked_static (2M: 64):                             
 
LD_PRELOAD=libhugetlbfs.so shmoverride_unlinked (2M: 64):       PASS            
 
LD_PRELOAD=libhugetlbfs.so HUGETLB_SHM=yes shmoverride_unlinked (2M:            
 
64):       PASS                                                                 
 
quota.sh (2M: 64):      PASS                                                    
 
quota.sh (1024M: 64):   PASS                                                    
 
counters.sh (2M: 64):   PASS                                                    
 
counters.sh (1024M: 64):        FAIL mmap failed: Invalid argument              
 
mmap-gettest 10 35 (2M: 64):    PASS                                            
 
mmap-gettest 10 35 (1024M: 64): FAIL    Failed to mmap the hugetlb file:        
 
Cannot allocate memory                                                          
 
mmap-cow 34 35 (2M: 64):        PASS                                            
 
mmap-cow 34 35 (1024M: 64):     FAIL    Thread 15 (pid=514) failed              
 
set shmmax limit to 73400320                                                    
 
shm-fork 10 17 (2M: 64):        PASS                                            
 
set shmmax limit to 73400320                                                    
 
shm-fork 10 35 (2M: 64):        PASS                                            
 
set shmmax limit to 73400320                                                    
 
shm-getraw 35 /dev/full (2M: 64):       PASS                                    
 
fallocate_stress.sh (2M: 64):   libgcc_s.so.1 must be installed for             
 
pthread_cancel to work                                                          
 
fallocate_stress.sh (1024M: 64):                                                
 
********** TEST SUMMARY                                                         
 
*                      2M             1024M                                     
 
*                      32-bit 64-bit  32-bit 64-bit                             
 
*     Total testcases:     0     93       0     83                              
 
*             Skipped:     0      0       0      0                              
 
*                PASS:     0     69       0     56                              
 
*                FAIL:     0      0       0      5                              
 
*    Killed by signal:     0      1       0      2                              
 
*   Bad configuration:     0      1       0      1                              
 
*       Expected FAIL:     0      0       0      0                              
 
*     Unexpected PASS:     0      0       0      0                              
 
*    Test not present:     0     21       0     19                              
 
* Strange test result:     0      1       0      0                              
 
**********                                                                      
 
                                                                                
 
Changes in v2:                                                                  
 
  - Merge comment fix about task size.                                          
 
  - Patch about mmap base address was merged separately.                        
 
  - Rebased on top of linux-next where series about the capability              
 
    to free gigantic pages regardless of the configuration was merged.          
 
  - Add huge pmd sharing as suggested by Mike Kravetz.                          
 
  - Gigantic page hstate is automatically created if CONTIG_ALLOC is            
 
    set, even if not explicitly asked for in command line, as suggested         
 
    by Mike.                                                                    
 
  - Replace #ifdef CONFIG_64BIT into IS_ENABLED(CONFIG_64BIT), as suggested     
 
    by Christoph Hellwig.                                                 

Alexandre Ghiti (2):
  x86, arm64: Move ARCH_WANT_HUGE_PMD_SHARE config in arch/Kconfig
  riscv: Introduce huge page support for 32/64bit kernel

 arch/Kconfig                     |  3 +++
 arch/arm64/Kconfig               |  2 +-
 arch/riscv/Kconfig               |  8 ++++++
 arch/riscv/include/asm/hugetlb.h | 18 +++++++++++++
 arch/riscv/include/asm/page.h    | 10 ++++++++
 arch/riscv/include/asm/pgtable.h |  8 ++++--
 arch/riscv/mm/Makefile           |  2 ++
 arch/riscv/mm/hugetlbpage.c      | 44 ++++++++++++++++++++++++++++++++
 arch/x86/Kconfig                 |  4 +--
 9 files changed, 93 insertions(+), 6 deletions(-)
 create mode 100644 arch/riscv/include/asm/hugetlb.h
 create mode 100644 arch/riscv/mm/hugetlbpage.c

-- 
2.20.1

Reply via email to