On Thu, Mar 19, 2015 at 03:48:35PM -0500, Bolo wrote:
> This bugfix / patch consists of two portions
> 
> 1) A 'Z' option which forces strace to use ptrace() memory access
> routines all the time instead of vm-based memory access routines if
> they are present..  This allows a fallback if you find that VM based
> requests are failing or producing incorrect output.   The existing
> ptrace based methods are quite robust and tested :-)

As there are no known issues with kernel implementation of process_vm_readv,
why should anybody want to disable fast process_vm_readv based code
in favour of slow PTRACE_PEEKDATA based one?

> 2) A correction for a long-term bug in th VM based memory access
> method of strace.  This bug has been in strace in various forms since
> at least 4.5.19.
> 
> The ptrace based code is correct.

Well, it was a genuine bug in process_vm_readv based code, thank you for
reporting it, although I haven't got it from your description and patch
after the first read.

It's fixed by commit v4.10-56-g4832134 which also contains a regression
test.  There is also a follow-up cleanup commit v4.10-57-gea1fea6.

> What happens is that the vm based access code in strace doesn't deal
> correctly with alignments of data in the kernel.   It assumes that
> everything is at a certain alignment.  This is an incorrrect assumption;
> when there are a lot of objects the kernel starts putting string data at
> lesser aligned  addresses.   I discovered this through debugging
> of strace failures and finding the kernel relaxing alignment constraints.
> This is true of older (such as rhel 5.9 and newer such as rhel -6.6 kernels).

I've tried to reproduce alignment issues with process_vm_readv from your
description, but without success.

> There was also an issue with the code not treating page boundaries
> correctly due to incorrect page arithmetic.

The actual bug was using wrong address for the page arithmetic.

> When this alignment constraint is relaxed, or strace incorrectly issues
> reads across page boundaries, strace fails with a error of
> 
>       "umovestr: short read (%d < %d) @0x%lx"
> 
> due to the incorect code.  In addition, since the vm-based code doesn't
> correctly update the address and lengths of the region to be accessed,
> the fallback code -- which is implemented correctly, fails to work.

There is no fallback to PTRACE_PEEKDATA based code in case when
process_vm_readv has read at least one byte.

> The new code also deals with arbitrary page sizes correctly in extracting
> data using the vm mode, instead of relying upon a 4k pagesize.
> 
> 3) Reproducing this error depends upon where the kernel is putting items
> in memory, and may also be based on the load of the kernel.   As such,
> this error can be difficult to reproduce.

Actually, the reproducer is quite simple:
http://sourceforge.net/p/strace/code/ci/HEAD/tree/tests/umovestr.c


-- 
ldv

Attachment: pgpODzh6Fx90D.pgp
Description: PGP signature

------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the 
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Strace-devel mailing list
Strace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/strace-devel

Reply via email to