l_exc_copy is normally used like this:
1 EXC(  LOAD    t0, UNIT(0)(src),       l_exc)
2 EXC(  LOAD    t1, UNIT(1)(src),       l_exc_copy)
3 EXC(  LOAD    t2, UNIT(2)(src),       l_exc_copy)
4 EXC(  LOAD    t3, UNIT(3)(src),       l_exc_copy)
When the fault occurs on row 4, l_exc_copy gets the bad address
through THREAD_BUADDR(), completes the copy of rows 1, 2 and 3,
and then returns the length that has not been copied.
l_exc_copy assumes that src is smaller than the bad address. It
increments src by 1 until it reaches the bad address.
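
For illustration only, the behaviour described above can be modelled
in C roughly as follows. This is a sketch, not the kernel code: the
function name l_exc_copy_model and the src_end parameter are
hypothetical, and the bad address is passed in directly instead of
being read through THREAD_BUADDR().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the fixup path: copy byte by byte from src up to
 * (not including) bad_addr, then report how many bytes were left uncopied.
 * The loop only terminates if src starts at or below bad_addr, which is
 * the assumption the commit message describes.
 */
static size_t l_exc_copy_model(unsigned char *dst, const unsigned char *src,
                               const unsigned char *bad_addr,
                               const unsigned char *src_end)
{
    /* Precondition: src has not been advanced past the faulting address. */
    assert(src <= bad_addr && bad_addr <= src_end);

    while (src != bad_addr)
        *dst++ = *src++;

    /* Everything from the bad address to the end was not copied. */
    return (size_t)(src_end - bad_addr);
}
```

If src were already greater than bad_addr on entry, the "!=" loop in
this model would never meet its exit condition, which is the situation
the rest of this message is about.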

octeon-memcpy.S uses l_exc_copy incorrectly, in a way that allows src
to be greater than the bad address. Fix that in this patch.
Raise the maximum LOAD offset to UNIT(15) so that the issue is fixed
without adding new instructions. However, the side effect is that, in
the few cases where a LOAD fails, l_exc_copy has to copy more.
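
The offset arithmetic can be checked with a small C sketch. It
assumes NBYTES is 8 and that UNIT(n) expands to n*NBYTES; the helper
names are hypothetical, and offsets are measured from the value of
src at the start of the 16-unit block.

```c
/* Assumption: 8-byte units, as on a 64-bit build. */
enum { NBYTES = 8 };

struct fault {
    long src_off;   /* value of src when the LOAD faults */
    long bad_off;   /* address the faulting LOAD touched  */
};

/* Model of UNIT(n), assumed to be n * NBYTES. */
static long unit_off(int n)
{
    return (long)n * NBYTES;
}

/* Before the patch: src is already advanced by 16*NBYTES and the
 * LOADs use UNIT(-8)..UNIT(-1), so the bad address is below src. */
static struct fault fault_before_patch(int n)
{
    struct fault f;
    f.src_off = 16 * NBYTES;
    f.bad_off = f.src_off + unit_off(n);
    return f;
}

/* After the patch: src is still at the block start and the LOADs use
 * UNIT(8)..UNIT(15), so the bad address is at or above src. */
static struct fault fault_after_patch(int n)
{
    struct fault f;
    f.src_off = 0;
    f.bad_off = f.src_off + unit_off(n);
    return f;
}
```

Under these assumptions a fault at UNIT(-8) before the patch gives
src_off 128 and bad_off 64, while a fault at UNIT(15) after the patch
gives src_off 0 and bad_off 120, so up to 120 bytes that were already
stored are copied again by l_exc_copy.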

Signed-off-by: jianchao.wang <[email protected]>
---
 arch/mips/cavium-octeon/octeon-memcpy.S | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-memcpy.S b/arch/mips/cavium-octeon/octeon-memcpy.S
index 64e08df..b0fe98e 100644
--- a/arch/mips/cavium-octeon/octeon-memcpy.S
+++ b/arch/mips/cavium-octeon/octeon-memcpy.S
@@ -205,22 +205,22 @@ EXC(      LOAD    t3, UNIT(7)(src),       l_exc_copy)
 EXC(   STORE   t0, UNIT(4)(dst),       s_exc_p12u)
 EXC(   STORE   t1, UNIT(5)(dst),       s_exc_p11u)
 EXC(   STORE   t2, UNIT(6)(dst),       s_exc_p10u)
-       ADD     src, src, 16*NBYTES
 EXC(   STORE   t3, UNIT(7)(dst),       s_exc_p9u)
+EXC(   LOAD    t0, UNIT(8)(src),       l_exc_copy)
+EXC(   LOAD    t1, UNIT(9)(src),       l_exc_copy)
+EXC(   LOAD    t2, UNIT(10)(src),      l_exc_copy)
+EXC(   LOAD    t3, UNIT(11)(src),      l_exc_copy)
+EXC(   STORE   t0, UNIT(8)(dst),       s_exc_p8u)
+EXC(   STORE   t1, UNIT(9)(dst),       s_exc_p7u)
+EXC(   STORE   t2, UNIT(10)(dst),      s_exc_p6u)
+EXC(   STORE   t3, UNIT(11)(dst),      s_exc_p5u)
+EXC(   LOAD    t0, UNIT(12)(src),      l_exc_copy)
+EXC(   LOAD    t1, UNIT(13)(src),      l_exc_copy)
+EXC(   LOAD    t2, UNIT(14)(src),      l_exc_copy)
+EXC(   LOAD    t3, UNIT(15)(src),      l_exc_copy)
        ADD     dst, dst, 16*NBYTES
-EXC(   LOAD    t0, UNIT(-8)(src),      l_exc_copy)
-EXC(   LOAD    t1, UNIT(-7)(src),      l_exc_copy)
-EXC(   LOAD    t2, UNIT(-6)(src),      l_exc_copy)
-EXC(   LOAD    t3, UNIT(-5)(src),      l_exc_copy)
-EXC(   STORE   t0, UNIT(-8)(dst),      s_exc_p8u)
-EXC(   STORE   t1, UNIT(-7)(dst),      s_exc_p7u)
-EXC(   STORE   t2, UNIT(-6)(dst),      s_exc_p6u)
-EXC(   STORE   t3, UNIT(-5)(dst),      s_exc_p5u)
-EXC(   LOAD    t0, UNIT(-4)(src),      l_exc_copy)
-EXC(   LOAD    t1, UNIT(-3)(src),      l_exc_copy)
-EXC(   LOAD    t2, UNIT(-2)(src),      l_exc_copy)
-EXC(   LOAD    t3, UNIT(-1)(src),      l_exc_copy)
 EXC(   STORE   t0, UNIT(-4)(dst),      s_exc_p4u)
+       ADD     src, src, 16*NBYTES
 EXC(   STORE   t1, UNIT(-3)(dst),      s_exc_p3u)
 EXC(   STORE   t2, UNIT(-2)(dst),      s_exc_p2u)
 EXC(   STORE   t3, UNIT(-1)(dst),      s_exc_p1u)
-- 
2.7.4
