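Astonishing. I changed my non-C binary to drop PROT_READ from the mmap call, and the mmap test completed successfully! That fits the READ_IMPLIES_EXEC explanation: with that personality flag set, the kernel adds PROT_EXEC to any mapping that asks for PROT_READ. Now I just have to figure out how to edit the binary's program headers so that GNU_STACK is no longer marked executable (which is what makes the kernel set READ_IMPLIES_EXEC), and then test it.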
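For the header edit itself, here is a rough sketch in Python of what clearing the executable bit on GNU_STACK could look like. It assumes a little-endian ELF64 file (as the x86-64 readelf output in this thread suggests); the function names find_gnu_stack and clear_exec_stack are made up for illustration, and I have not run this against the actual Rust/musl binary.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type of the stack permissions entry
PF_X = 0x1                 # "executable" bit in p_flags


def find_gnu_stack(elf: bytes):
    """Return (file offset of p_flags, p_flags) for PT_GNU_STACK, or None.

    Assumption: the input is a little-endian ELF64 image.
    """
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    # ELF64 header: e_phoff at byte 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", elf, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # ELF64 program header starts with p_type (4 bytes), p_flags (4 bytes).
        p_type, p_flags = struct.unpack_from("<II", elf, base)
        if p_type == PT_GNU_STACK:
            return base + 4, p_flags
    return None


def clear_exec_stack(elf: bytes) -> bytes:
    """Return a copy of the ELF image with PF_X cleared on PT_GNU_STACK."""
    found = find_gnu_stack(elf)
    if found is None:
        return elf  # no PT_GNU_STACK entry, so there is nothing to patch
    flags_off, p_flags = found
    patched = bytearray(elf)
    struct.pack_into("<I", patched, flags_off, p_flags & ~PF_X)
    return bytes(patched)
```

Typical use would be to read the binary with open(path, "rb"), pass the bytes through clear_exec_stack, write the result to a new file, and confirm with readelf -l that GNU_STACK now shows RW instead of RWE. Existing alternatives that should achieve the same thing are passing -z noexecstack to the linker at build time, or running execstack -c on the finished binary.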
On Sat, Jan 16, 2016 at 1:33 PM, Kenneth Adam Miller <[email protected]> wrote:
> The particular non-C binary that I'm using is Rust with musl support, so
> that I can statically compile the binary in order to eliminate all library
> dependencies and then run it on a buildroot-based Linux.
>
> On Sat, Jan 16, 2016 at 1:32 PM, Kenneth Adam Miller <[email protected]> wrote:
>
>> Wait, are you assuming that I'm using the latest kernel? Because I'm
>> using 3.14.56...
>>
>> On Sat, Jan 16, 2016 at 1:31 PM, Mike Krinkin <[email protected]> wrote:
>>
>>> On Sat, Jan 16, 2016 at 01:16:42PM -0500, Kenneth Adam Miller wrote:
>>> > Ok, so you think that the format of the binary would influence the kernel
>>> > to change the permissions on the user's behalf? There's not much prose
>>> > explanation here, and I don't understand why the kernel would do something
>>> > like this.
>>>
>>> That personality flag was introduced here with quite a detailed explanation
>>> (which I don't understand, though):
>>> http://lwn.net/Articles/94068/
>>>
>>> > I just wanted to use a static binary to eliminate library
>>> > dependency issues between my host machine and the target machine. I had no
>>> > idea that settings like this would carry over to my task at hand.
>>>
>>> I compiled a simple hello world with the -static flag, and GNU_STACK in the
>>> binary has no executable flag set, so static probably has nothing to do with
>>> this.
>>>
>>> >
>>> > On Sat, Jan 16, 2016 at 1:08 PM, Mike Krinkin <[email protected]> wrote:
>>> >
>>> > > On Sat, Jan 16, 2016 at 12:45:17PM -0500, Kenneth Adam Miller wrote:
>>> > > > I got the strace output of my non-C binary (I filtered the noise out of
>>> > > > the output for you):
>>> > > >
>>> > > > mmap(NULL, 8192, PROT_READ | PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
>>> > > >
>>> > > > I also have readelf -l output:
>>> > > >
>>> > > > Elf file type is EXEC (Executable file)
>>> > > > Entry point 0x401311
>>> > > > There are 7 program headers, starting at offset 64
>>> > > >
>>> > > > Program Headers:
>>> > > >   Type           Offset             VirtAddr           PhysAddr
>>> > > >                  FileSiz            MemSiz              Flags  Align
>>> > > >   LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
>>> > > >                  0x00000000000db604 0x00000000000db604  R E    1000
>>> > > >   LOAD           0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0
>>> > > >                  0x0000000000006220 0x00000000000091dc  RW     1000
>>> > > >   NOTE           0x00000000000001c8 0x00000000004001c8 0x00000000004001c8
>>> > > >                  0x0000000000000024 0x0000000000000024  R      4
>>> > > >   GNU_EH_FRAME   0x00000000000d5680 0x00000000004d5680 0x00000000004d5680
>>> > > >                  0x0000000000005f84 0x0000000000005f84  R      4
>>> > > >   GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
>>> > > >                  0x0000000000000000 0x0000000000000000  RWE    0
>>> > >
>>> > > Well, probably this is a bit more relevant:
>>> > > http://lxr.free-electrons.com/source/mm/mmap.c#L1281
>>> > >
>>> > > As far as I can see, the kernel sets the READ_IMPLIES_EXEC flag here:
>>> > > http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L844
>>> > >
>>> > > if executable_stack != EXSTACK_DISABLE_X, and executable_stack is initialized
>>> > > here:
>>> > > http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L781
>>> > >
>>> > > if GNU_STACK has an executable flag set (and I suppose that RWE means that
>>> > > in your case GNU_STACK indeed has the executable flag
>>> > > set).
>>> > >
>>> > > That may be the reason; I'm not sure, though. Maybe this can help:
>>> > > http://man7.org/linux/man-pages/man2/personality.2.html
>>> > >
>>> > >
>>> > > >   TLS            0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0
>>> > > >                  0x0000000000000100 0x0000000000000100  R      10
>>> > > >   GNU_RELRO      0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0
>>> > > >                  0x0000000000005e40 0x0000000000005e40  RW     20
>>> > > >
>>> > > >  Section to Segment mapping:
>>> > > >   Segment Sections...
>>> > > >    00     .note.gnu.build-id .init .text .fini .gcc_except_table .rodata
>>> > > >           .debug_gdb_scripts .eh_frame .eh_frame_hdr
>>> > > >    01     .tdata .data.rel.ro.local .data.rel.ro .init_array .got .got.plt
>>> > > >           .data .bss
>>> > > >    02     .note.gnu.build-id
>>> > > >    03     .eh_frame_hdr
>>> > > >    04
>>> > > >    05     .tdata
>>> > > >    06     .tdata .data.rel.ro.local .data.rel.ro .init_array .got .got.plt
>>> > > >
>>> > > > Some notes:
>>> > > >
>>> > > > As a test, I changed the non-C binary's target device file to /dev/zero,
>>> > > > and then I could see that the non-C mmap attempt would succeed just fine.
>>> > > >
>>> > > > After further verification and debugging based on guidance from another
>>> > > > forum, I am convinced that the vm_flags change must be occurring somewhere
>>> > > > in kernel land after control flow has left user land. Now I need to figure
>>> > > > out how to use a kernel debugger or kprobes to walk through the execution
>>> > > > of mmap callback delegation and see where the flags parameter is being
>>> > > > changed.
>>> > > >
>>> > > > I was pointed to this:
>>> > > > http://lxr.free-electrons.com/source/mm/mmap.c#L1312
>>> > > >
>>> > > > But why would my vm_flags be changed by the kernel? And what can I do to
>>> > > > get this to stop?
>>> > > > Why is the kernel changing the vm_flags for a non-C
>>> > > > binary using my device file, but not for either a C binary using my device
>>> > > > file or any type of binary that's not using my device file?
>>> > > >
>>> > > > On Thu, Jan 14, 2016 at 12:28 PM, Kenneth Adam Miller <[email protected]> wrote:
>>> > > >
>>> > > > >
>>> > > > >
>>> > > > > On Thu, Jan 14, 2016 at 12:00 PM, Mike Krinkin <[email protected]> wrote:
>>> > > > >
>>> > > > >> Hi, I have a couple of questions to clarify, if you don't mind.
>>> > > > >>
>>> > > > >> On Thu, Jan 14, 2016 at 11:04:28AM -0500, Kenneth Adam Miller wrote:
>>> > > > >> > I have a custom driver and userland program pair that I'm using for a very
>>> > > > >> > special use case at my workplace where we are mapping specific physical
>>> > > > >> > address ranges into userland memory with a mmap callback. Everything works
>>> > > > >> > together well with a C userland program that calls into our driver's ioctl
>>> > > > >> > and mmap definitions, but for our case we are using an alternative systems
>>> > > > >> > language just for the userland program.
>>> > > > >>
>>> > > > >> So you have a userland app written in C, and another not written in C?
>>> > > > >> The former works well while the latter doesn't, am I right?
>>> > > > >>
>>> > > > >
>>> > > > > Yes, the former works insofar as mmap completes successfully. I've
>>> > > > > verified that the parameters are identical in the non-C program. The
>>> > > > > issue with just using the C-only program is that the actual
>>> > > > > implementation of interest is in the non-C program, and that's because
>>> > > > > that language facilitates other features that are *required* on our end.
>>> > > > >
>>> > > > >
>>> > > > >>
>>> > > > >> > That mmap call is failing (properly, as we want) out of the driver's
>>> > > > >> > mmap implementation because the vm_flags have the VM_EXEC flag set.
>>> > > > >> > We do not want users to be able to map the memory range as executable,
>>> > > > >> > so the driver should check for this, as it does. The issue is that
>>> > > > >> > somewhere between where mmap is called and when the parameters are
>>> > > > >> > given to the driver, the vma->vm_flags are being set to 255. I've
>>> > > > >> > manually checked the values being given to the mmap call in our non-C
>>> > > > >> > binary, and they are *equivalent* in value to those of the C program.
>>> > > > >>
>>> > > > >> By "manually" do you mean strace? Could you show strace output for
>>> > > > >> both apps? And also could you show readelf -l output for both binaries?
>>> > > > >>
>>> > > > >
>>> > > > > By manually, I mean with a print call just before the mmap call in each
>>> > > > > of the programs. Right now, I'm working on getting strace output, but I
>>> > > > > have to run that in qemu. To be able to run it in qemu in order to
>>> > > > > isolate the driver and all from my host, I have to build with buildroot.
>>> > > > > So I'll email that when I get it, but it'll be a while.
>>> > > > >
>>> > > > >
>>> > > > >>
>>> > > > >> >
>>> > > > >> > My question is, is there anything that can cause the vma->vm_flags to be
>>> > > > >> > changed in the trip between when the userland program calls mmap and when
>>> > > > >> > control is delivered to the mmap callback?
>>> > > > >>
>>> > > > >> > _______________________________________________
>>> > > > >> > Kernelnewbies mailing list
>>> > > > >> > [email protected]
>>> > > > >> > http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
_______________________________________________
Kernelnewbies mailing list
[email protected]
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
