On 07.07.2017 17:05, Thomas Huth wrote: > On 07.07.2017 15:46, Christian Borntraeger wrote: >> On 07/07/2017 12:26 PM, Thomas Huth wrote: >>> The upcoming netboot code will use the libc from SLOF. To be able >>> to still use s390-ccw.h there, the libc related functions in this >>> header have to be moved to a different location. >>> And while we're at it, remove the duplicate memcpy() function from >>> sclp.c. >>> >>> Signed-off-by: Thomas Huth <th...@redhat.com> >> >> one suggestion below, but >> >> Reviewed-by: Christian Borntraeger <borntrae...@de.ibm.com> >> >> for change and no change >> >> [...] >>> +static inline void *memcpy(void *s1, const void *s2, size_t n) >>> +{ >>> + uint8_t *p1 = s1; >>> + const uint8_t *p2 = s2; >>> + >>> + while (n--) { >>> + p1[n] = p2[n]; >>> + } >>> + return s1; >>> +} >> >> Not that it matters, and no idea if moving forward like in the for loop below >> might allow gcc to generate better code, but a for loop like below will be >> the >> same direction as the MVC instruction. Can you double check if this generates >> better code? > > At least with my compiler version (old 4.8.1), the code looks pretty > similar. Backward loop: > > 1d8: a7 19 04 d1 lghi %r1,1233 > 1dc: e3 40 50 00 00 04 lg %r4,0(%r5) > 1e2: e3 30 70 00 00 04 lg %r3,0(%r7) > 1e8: a7 29 04 d2 lghi %r2,1234 > 1ec: 43 01 30 00 ic %r0,0(%r1,%r3) > 1f0: a7 1b ff ff aghi %r1,-1 > 1f4: 42 01 40 01 stc %r0,1(%r1,%r4) > 1f8: a7 27 ff fa brctg %r2,1ec <main+0x1ec> > > Forward loop: > > 238: a7 19 00 00 lghi %r1,0 > 23c: e3 40 50 00 00 04 lg %r4,0(%r5) > 242: e3 30 70 00 00 04 lg %r3,0(%r7) > 248: a7 29 04 d2 lghi %r2,1234 > 24c: 43 01 30 00 ic %r0,0(%r1,%r3) > 250: 42 01 40 00 stc %r0,0(%r1,%r4) > 254: a7 1b 00 01 aghi %r1,1 > 258: a7 27 ff fa brctg %r2,24c <main+0x24c> > > But I can switch to the for-loop version in my patch anyway, if you > prefer that.
FWIW, if I define memcpy() via __builtin_memcpy() instead, I get a MVC in the disassembly... should we use maybe that macro instead? Thomas