I now have another issue. My binary fails to mmap a file inside the
lkvm sandbox. The same binary works fine on the host and in qemu. I
added strace to the sandbox script, and here is the output:

[pid   837] openat(AT_FDCWD, "syzkaller-shm048878722", O_RDWR|O_CLOEXEC) = 5
[pid   837] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0) = -1 EINVAL (Invalid argument)

I don't see anything that can potentially cause EINVAL here. Is it
possible that lkvm somehow affects kernel behavior here?

I run lkvm as:

$ taskset 1 /kvmtool/lkvm sandbox --disk syz-0 --mem=2048 --cpus=2
--kernel /arch/x86/boot/bzImage --network mode=user --sandbox
/workdir/kvm/syz-0.sh

On Mon, Oct 19, 2015 at 4:20 PM, Sasha Levin <sasha.le...@oracle.com> wrote:
> On 10/19/2015 05:28 AM, Dmitry Vyukov wrote:
>> On Mon, Oct 19, 2015 at 11:22 AM, Andre Przywara <andre.przyw...@arm.com> wrote:
>>> Hi Dmitry,
>>>
>>> On 19/10/15 10:05, Dmitry Vyukov wrote:
>>>> On Fri, Oct 16, 2015 at 7:25 PM, Sasha Levin <sasha.le...@oracle.com> wrote:
>>>>> On 10/15/2015 04:20 PM, Dmitry Vyukov wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am trying to run a program in lkvm sandbox so that it communicates
>>>>>> with a program on host. I run lkvm as:
>>>>>>
>>>>>> ./lkvm sandbox --disk sandbox-test --mem=2048 --cpus=4 --kernel
>>>>>> /arch/x86/boot/bzImage --network mode=user -- /my_prog
>>>>>>
>>>>>> /my_prog then connects to a program on host over a tcp socket.
>>>>>> I see that host receives some data, sends some data back, but then
>>>>>> my_prog hangs on network read.
>>>>>>
>>>>>> To localize this I wrote 2 programs (attached). ping is run on host
>>>>>> and pong is run from lkvm sandbox. They successfully establish tcp
>>>>>> connection, but after some iterations both hang on read.
>>>>>>
>>>>>> The networking code in the Go runtime has been there for more than
>>>>>> 3 years, is widely used in production and does not have any known
>>>>>> bugs. However, it uses epoll edge-triggered readiness notifications,
>>>>>> which are known to be tricky.
>>>>>> Is it possible that lkvm contains some networking bug? Can it be
>>>>>> related to the data races in lkvm I reported earlier today?
>>>
>>> Just to let you know:
>>> I think we have seen networking issues in the past - root over NFS had
>>> issues IIRC. Will spent some time on debugging this and it looked like a
>>> race condition in kvmtool's virtio implementation. I think pinning
>>> kvmtool's virtio threads to one host core made this go away. However
>>> although he tried hard (even by Will's standards!) he couldn't find
>>> the real root cause or a fix at the time he looked at it and we found
>>> other ways to work around the issues (using virtio-blk or initrd's).
>>>
>>> So it's quite possible that there are issues. I haven't had time yet to
>>> look at your sanitizer reports, but it looks like a promising approach
>>> to find the root cause.
>>
>>
>> Thanks, Andre!
>>
>> ping/pong does not hang within at least 5 minutes when I run lkvm
>> under taskset 1.
>>
>> And, yeah, this pretty strongly suggests a data race. ThreadSanitizer
>> can point you to the bug within a minute, so you just need to say
>> "aha! it is here". Or maybe not. There are no guarantees. But if you
>> already spent significant time on this, then checking the reports
>> definitely looks like a good idea.
>
> Okay, that's good to know.
>
> I have a few busy days, but I'll definitely try to clear up these reports
> as they seem to be pointing to real issues.
>
>
> Thanks,
> Sasha
>
