[PATCH] MIPS: Grant pte read permission, even if vma only has VM_WRITE permission.

2020-06-29 Thread Lichao Liu
Background:
The CPU has the RIXI feature.

Currently, if a vma has only VM_WRITE permission, vma->vm_page_prot has
_PAGE_NO_READ set. In the general case, a read from the vma triggers an
RI exception, which do_page_fault then handles.

But in the following scenario, the program hangs.

Example scenario (a trinity test case):
futex_wake_op() reads uaddr, which is passed in from user space.
Suppose a program mmaps a vma that has only VM_WRITE permission,
then calls futex with an address inside that vma as the uaddr
argument. futex_wake_op() reads the address with page faults
disabled and with an __ex_table entry in place (the fixup returns -14
directly), so do_page_fault finds the __ex_table entry and returns -14.
futex_wake_op() then tries to fix up this error by calling
fault_in_user_writeable(). Because the pte already has write permission,
handle_mm_fault does nothing and returns success.
But the RI bit in the pte and the tlb entry still exists.
The program loops forever:
do_page_fault -> find __ex_table success -> return -14;
futex_wake_op -> call fault_in_user_writeable() to fix the error -> retry;
do_page_fault -> find __ex_table success -> return -14;
futex_wake_op -> call fault_in_user_writeable() to fix the error -> retry;
...
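
The retry loop above can be modelled in user space. This is only a
sketch of the reasoning, not kernel code: struct model_pte and all
model_* functions are hypothetical stand-ins, and the attempt budget
exists only so the model terminates (the real loop has no such limit).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_EFAULT 14	/* the -14 seen in the description */

/* Hypothetical PTE model: only the two bits that matter here. */
struct model_pte {
	bool writable;	/* stands in for _PAGE_WRITE */
	bool no_read;	/* stands in for _PAGE_NO_READ (the RI bit) */
};

/*
 * Models the read done with page faults disabled: a read-inhibited
 * PTE faults, the __ex_table fixup runs, and the caller sees -EFAULT.
 */
int model_atomic_read(const struct model_pte *pte)
{
	return pte->no_read ? -MODEL_EFAULT : 0;
}

/*
 * Models fault_in_user_writeable(): it asks for write access only, so
 * an already writable PTE is left untouched and success is reported.
 * The RI bit is never cleared on this path.
 */
int model_fault_in_writeable(struct model_pte *pte)
{
	if (!pte->writable)
		pte->writable = true;
	return 0;
}

/*
 * Models the futex_wake_op() retry loop. Returns the number of
 * attempts needed, or -1 if the attempt budget ran out.
 */
int model_futex_wake_op(struct model_pte *pte, int max_attempts)
{
	assert(max_attempts > 0);

	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (model_atomic_read(pte) == 0)
			return attempt;
		if (model_fault_in_writeable(pte) != 0)
			return -MODEL_EFAULT;
	}
	return -1;
}
```

With { .writable = true, .no_read = true } the model never succeeds,
whatever the budget. Clearing no_read, which is what granting read
permission in the pte amounts to, lets the first attempt succeed.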

The first perspective on the root cause:
Futex assumes that a pte with write permission also has read permission.
On a page fault, it only tries to fix up with FAULT_FLAG_WRITE.

The second perspective on the root cause:
The MIPS platform doesn't grant pte read permission if the vma has only
VM_WRITE permission, but x86 and arm64 do.

Most architectures grant pte read permission even if
the vma has only VM_WRITE permission.
And if the cpu doesn't have the RIXI feature, the MIPS platform
grants pte read permission by setting _PAGE_READ.
So I think we should fix this problem by granting pte read permission
even if the vma has only VM_WRITE permission.
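
For reference, a small sketch of why only two protection_map[] slots
are affected. It assumes the generic code indexes protection_map[] with
the low four vm_flags bits. The flag values below are the kernel's, but
prot_index() and check_write_only_slots() are hypothetical helpers
written for this illustration.

```c
#include <assert.h>

/* Low four vm_flags bits, with the values the kernel uses. */
#define VM_READ		0x1UL
#define VM_WRITE	0x2UL
#define VM_EXEC		0x4UL
#define VM_SHARED	0x8UL

/* Hypothetical helper: protection_map[] index for a set of vm_flags. */
unsigned int prot_index(unsigned long vm_flags)
{
	return (unsigned int)(vm_flags &
			      (VM_READ | VM_WRITE | VM_EXEC | VM_SHARED));
}

/* The two write-only combinations are the private and shared cases. */
void check_write_only_slots(void)
{
	assert(prot_index(VM_WRITE) == 2);		/* private */
	assert(prot_index(VM_WRITE | VM_SHARED) == 10);	/* shared */
}
```

Under that assumption, a vma with only VM_WRITE lands on entry 2 or
entry 10, which are the only entries that need _PAGE_NO_READ dropped.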

Signed-off-by: Lichao Liu 
---
 arch/mips/mm/cache.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
index ad6df1cea866..72b60c44a962 100644
--- a/arch/mips/mm/cache.c
+++ b/arch/mips/mm/cache.c
@@ -160,7 +160,7 @@ static inline void setup_protection_map(void)
 	if (cpu_has_rixi) {
 		protection_map[0]  = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_NO_READ);
 		protection_map[1]  = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC);
-		protection_map[2]  = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_NO_READ);
+		protection_map[2]  = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC);
 		protection_map[3]  = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC);
 		protection_map[4]  = __pgprot(_page_cachable_default | _PAGE_PRESENT);
 		protection_map[5]  = __pgprot(_page_cachable_default | _PAGE_PRESENT);
@@ -169,7 +169,7 @@ static inline void setup_protection_map(void)
 
 		protection_map[8]  = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_NO_READ);
 		protection_map[9]  = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC);
-		protection_map[10] = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE | _PAGE_NO_READ);
+		protection_map[10] = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE);
 		protection_map[11] = __pgprot(_page_cachable_default | _PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE);
 		protection_map[12] = __pgprot(_page_cachable_default | _PAGE_PRESENT);
 		protection_map[13] = __pgprot(_page_cachable_default | _PAGE_PRESENT);
-- 
2.25.1



[PATCH] sched/rt: Don't activate RT throttling when there is no running CFS task

2020-06-16 Thread Lichao Liu
Activating RT throttling dequeues the rt_rq from the rq for at least 50ms.
When there is no running CFS task, should we still activate it?
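
A rough illustration of the numbers and of the proposed check. It
assumes the default sched_rt_period_us (1000000) and
sched_rt_runtime_us (950000). throttled_window_us() and
should_throttle() are hypothetical helpers that condense the logic;
they are not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed defaults of sched_rt_period_us and sched_rt_runtime_us. */
#define RT_PERIOD_US	1000000ULL
#define RT_RUNTIME_US	 950000ULL

/* Part of each period in which a throttled rt_rq cannot run. */
uint64_t throttled_window_us(uint64_t period_us, uint64_t runtime_us)
{
	assert(runtime_us <= period_us);
	return period_us - runtime_us;
}

/*
 * Hypothetical condensed form of the patched condition: throttle only
 * when the runtime is exceeded, the group has runtime assigned, and
 * at least one CFS task is runnable on the same rq.
 */
bool should_throttle(uint64_t rt_time, uint64_t runtime,
		     uint64_t rt_b_runtime, unsigned int cfs_nr_running)
{
	if (rt_time <= runtime)
		return false;
	return rt_b_runtime != 0 && cfs_nr_running > 0;
}
```

With the assumed defaults, throttled_window_us() gives 50000us, i.e.
the 50ms mentioned above. should_throttle() returns false whenever
cfs_nr_running is 0, even if the RT runtime has been exceeded.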

Signed-off-by: Lichao Liu 
---
 kernel/sched/rt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index df11d88c9895..d6524347cea0 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -961,12 +961,13 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
 
 	if (rt_rq->rt_time > runtime) {
 		struct rt_bandwidth *rt_b = sched_rt_bandwidth(rt_rq);
+		struct rq *rq = rq_of_rt_rq(rt_rq);
 
 		/*
 		 * Don't actually throttle groups that have no runtime assigned
 		 * but accrue some time due to boosting.
 		 */
-		if (likely(rt_b->rt_runtime)) {
+		if (likely(rt_b->rt_runtime) && rq->cfs.nr_running > 0) {
 			rt_rq->rt_throttled = 1;
 			printk_deferred_once("sched: RT throttling activated\n");
 		} else {
-- 
2.25.1