My next question is about disks. I see that the fs_bigLITTLE.py script uses
PciVirtIO to set up its disks, whereas I'm using IDE, which I inherited from
the fs.py scripts I used as a reference. The problem I'm seeing is that the
IDE controllers seem to be mangling commands and dropping interrupts, so
this difference looks particularly suspicious. Is there a KVM-related
reason you're using PciVirtIO? Is this something that *should* work with
IDE but doesn't, or do I have to use PciVirtIO for things to work properly?
I'm not familiar with PciVirtIO beyond briefly skimming the source for it
in gem5. Is this something we should consider using globally as a
replacement for IDE, even in simulations where we're trying to be really
realistic?
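
For reference, here is roughly the disk setup I'm comparing against in
fs_bigLITTLE.py, as I read it from my quick skim (so the exact details may
be off; attach_pci looks like it comes from the SimpleSystem helper in
configs/example/arm/devices.py):

    # Roughly the disk wiring in fs_bigLITTLE.py, as I read it. Each image
    # is wrapped in a copy-on-write layer and exposed to the guest as a
    # VirtIO block device behind a PCI interface.
    def cow_disk(image_file):
        image = CowDiskImage()
        image.child.image_file = SysPaths.disk(image_file)
        return image

    sys.disk_images = [cow_disk(f) for f in disks]
    sys.pci_vio_block = [PciVirtIO(vio=VirtIOBlock(image=img))
                         for img in sys.disk_images]
    for dev in sys.pci_vio_block:
        sys.attach_pci(dev)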

Thanks again for all the help.

Gabe

On Tue, Mar 20, 2018 at 3:14 PM, Gabe Black <gabebl...@google.com> wrote:

> Ok, that (multiple event queues) made things way better. There are still
> some glitches to figure out, but at least it makes good forward progress at
> a reasonable speed. Thanks!
>
> Gabe
>
> On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black <gabebl...@google.com> wrote:
>
>> This is on a Chromebook based on the RK3399 with only ~4GB of RAM, which
>> is not ideal, although we have a bigger machine in the works for the
>> future. I agree with your reasoning and don't think option 1 is a problem.
>> We're using static DTBs, so I don't think that's an issue either. In my
>> script, I'm not doing anything smart with the event queues, so that's
>> likely at least part of the problem. When I tried using fs_bigLITTLE.py I
>> ran into what looked like a similar issue, so that might not be the whole
>> story, but it's definitely something I should fix up. I'll let you know
>> how that goes!
>>
>> Gabe
>>
>> On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg <
>> andreas.sandb...@arm.com> wrote:
>>
>>> Hmm, OK, this is very strange.
>>>
>>> What type of hardware are you running on? Is it an A57-based chip or
>>> something else? Also, what's your simulation quantum? I have been able to
>>> run with a 0.5ms quantum (5e8 ticks).
>>> I think the following trace of two CPUs running in KVM should be roughly
>>> equivalent to the trace you shared earlier. It was generated on a
>>> commercially available 8xA57 (16GiB RAM) using the following command (gem5
>>> rev 9dc44b417):
>>>
>>> gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun \
>>>     configs/example/arm/fs_bigLITTLE.py \
>>>     --sim-quantum '0.5ms' \
>>>     --cpu-type kvm --big-cpus 0 --little-cpus 2 \
>>>     --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb \
>>>     --kernel vmlinux.aarch64.4.4-d318f95d0c
>>>
>>> Note that the tick counts are a bit weird since we have three different
>>> event queues at play (one for devices and one per CPU).
>>>
>>>       0: system.littleCluster.cpus0: KVM: Executing for 500000000 ticks
>>>       0: system.littleCluster.cpus1: KVM: Executing for 500000000 ticks
>>>       0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 176363 cycles (88181504 ticks, sim cycles: 176363).
>>> 88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>> 88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090024, len: 4)
>>> 88332000: system.littleCluster.cpus0: Entering KVM...
>>> 88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
>>> 88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 16854 cycles (8427000 ticks, sim cycles: 16854).
>>> 96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>> 96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090030, len: 4)
>>>       0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 666400 cycles (333200000 ticks, sim cycles: 666400).
>>> 333200000: system.littleCluster.cpus1: Entering KVM...
>>> 333200000: system.littleCluster.cpus1: KVM: Executing for 166800000 ticks
>>> 96909000: system.littleCluster.cpus0: Entering KVM...
>>> 96909000: system.littleCluster.cpus0: KVM: Executing for 403091000 ticks
>>> 96909000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 15257 cycles (7628500 ticks, sim cycles: 15257).
>>> 104538000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>> 104538000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c0100a0, len: 4)
>>> 333200000: system.littleCluster.cpus1: KVM: Executed 47544 instructions in 200820 cycles (100410000 ticks, sim cycles: 200820).
>>> 433610000: system.littleCluster.cpus1: Entering KVM...
>>> 433610000: system.littleCluster.cpus1: KVM: Executing for 66390000 ticks
>>> 104688000: system.littleCluster.cpus0: Entering KVM...
>>> 104688000: system.littleCluster.cpus0: KVM: Executing for 395312000 ticks
>>> 104688000: system.littleCluster.cpus0: KVM: Executed 4382 instructions in 14942 cycles (7471000 ticks, sim cycles: 14942).
>>>
>>> Comparing this trace to yours, I'd say the frequent KVM exits look a bit
>>> suspicious. I would expect secondary CPUs to make very little progress
>>> while the main CPU initializes the system and starts the early boot code.
>>>
>>> There are a couple of possibilities that might be causing issues:
>>>
>>> 1) There is some CPU ID weirdness that confuses the boot code and puts
>>> both CPUs in the holding pen. This seems unlikely since there are some
>>> writes to the UART.
>>>
>>> 2) Some device is incorrectly mapped to the CPU event queues and causes
>>> frequent KVM exits. Have a look at _build_kvm in fs_bigLITTLE.py; it
>>> doesn't use configs/common, so no need to tear your eyes out. ;) Do you map
>>> event queues in the same way? It maps all simulated devices to one event
>>> queue and gives each CPU a private event queue. It's important to remap
>>> CPU child devices to the device queue instead of the CPU queue; failing to
>>> do this will cause chaos, madness, and quite possibly result in Armageddon.
>>> (See the sketch after this list.)
>>>
>>> 3) You're using DTB autogeneration. This doesn't work for KVM guests due
>>> to issues with the timer interrupt specification. We have a patch for the
>>> timer that we are testing internally. Sorry. :(
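>>>
>>> For reference, the queue mapping in _build_kvm looks roughly like this
>>> (a sketch from memory rather than a verbatim copy, so check the script
>>> for the exact details):
>>>
>>>     def _build_kvm(system, cpus):
>>>         system.kvm_vm = KvmVM()
>>>
>>>         # Give each KVM CPU a private event queue (and hence its own
>>>         # simulation thread). This has to happen after caches and other
>>>         # child objects have been created, since they must not inherit
>>>         # the CPU's queue.
>>>         for i, cpu in enumerate(cpus):
>>>             # Remap everything below the CPU to the device queue (0)...
>>>             for obj in cpu.descendants():
>>>                 obj.eventq_index = 0
>>>             # ...then put the CPU itself on its own queue.
>>>             cpu.eventq_index = i + 1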
>>>
>>> Regards,
>>> Andreas
>>> On 16/03/2018 23:20, Gabe Black wrote:
>>>
>>> Ok, diving into this a little deeper, it looks like execution is
>>> progressing but is making very slow progress for some reason. I added a
>>> call to "dump()" before each ioctl invocation that enters the VM and
>>> looked at the PC to get an idea of what it was up to. I made sure to put
>>> that before the timers to avoid taking up VM time with printing debug
>>> stuff. In any case, I see that neither CPU gets off of PC 0 for about 2ms
>>> of simulated time (~500Hz), which is EXTREMELY slow for a CPU that is
>>> supposed to be running in the ballpark of 2GHz. It's not clear to me why
>>> it's making such slow progress, but that would explain why I'm getting
>>> very little out on the simulated console. It's just taking forever to
>>> make it that far.
>>>
>>> Any idea why it's going so slow, or how to debug further?
>>>
>>> Gabe
>>>
>>> On Wed, Mar 14, 2018 at 7:42 PM, Gabe Black <gabebl...@google.com>
>>> wrote:
>>>
>>>> Some output which I think is suspicious:
>>>>
>>>> 55462000: system.cpus0: Entering KVM...
>>>> 55462000: system.cpus0: KVM: Executing for 1506000 ticks
>>>> 55462000: system.cpus0: KVM: Executed 5159 instructions in 13646 cycles (6823000 ticks, sim cycles: 13646).
>>>> 56968000: system.cpus1: Entering KVM...
>>>> 56968000: system.cpus1: KVM: Executing for 5317000 ticks
>>>> 56968000: system.cpus1: KVM: Executed 7229 instructions in 14379 cycles (7189500 ticks, sim cycles: 14379).
>>>> 62285000: system.cpus0: Entering KVM...
>>>> 62285000: system.cpus0: KVM: Executing for 1872500 ticks
>>>> 62285000: system.cpus0: KVM: Executed 5159 instructions in 13496 cycles (6748000 ticks, sim cycles: 13496).
>>>> 64157500: system.cpus1: Entering KVM...
>>>> 64157500: system.cpus1: KVM: Executing for 4875500 ticks
>>>> 64157500: system.cpus1: KVM: Executed 6950 instructions in 13863 cycles (6931500 ticks, sim cycles: 13863).
>>>> 69033000: system.cpus0: Entering KVM...
>>>> 69033000: system.cpus0: KVM: Executing for 2056000 ticks
>>>> 69033000: system.cpus0: KVM: Executed 5159 instructions in 13454 cycles (6727000 ticks, sim cycles: 13454).
>>>> 71089000: system.cpus1: Entering KVM...
>>>> 71089000: system.cpus1: KVM: Executing for 4671000 ticks
>>>> 71089000: system.cpus1: KVM: Executed 6950 instructions in 13861 cycles (6930500 ticks, sim cycles: 13861).
>>>> 75760000: system.cpus0: Entering KVM...
>>>> 75760000: system.cpus0: KVM: Executing for 2259500 ticks
>>>> 75760000: system.cpus0: KVM: Executed 5159 instructions in 13688 cycles (6844000 ticks, sim cycles: 13688).
>>>>
>>>> [...]
>>>>
>>>> 126512000: system.cpus0: handleKvmExit (exit_reason: 6)
>>>> 126512000: system.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090024, len: 4)
>>>> 126512000: system.cpus0: In updateThreadContext():
>>>>
>>>> [...]
>>>>
>>>> 126512000: system.cpus0:   PC := 0xd8 (t: 0, a64: 1)
>>>>
>>>> On Wed, Mar 14, 2018 at 7:37 PM, Gabe Black <gabebl...@google.com>
>>>> wrote:
>>>>
>>>>> I tried it just now, and I still don't see anything on the console. I
>>>>> switched back to using my own script since it's a bit simpler (it doesn't
>>>>> use all the configs/common stuff), and started looking at the KVM debug
>>>>> output. I see that both CPUs claim to execute instructions, although cpu1
>>>>> didn't take an exit in the output I was looking at. cpu0 took four exits:
>>>>> two which touched some UART registers, and two which touched RealView
>>>>> registers, the V2M_SYS_CFGDATA and V2M_SYS_CFGCTRL registers, judging by
>>>>> the comments in the bootloader assembly file.
>>>>>
>>>>> After that they claim to be doing stuff, although I see no further
>>>>> console output or KVM exits. The accesses themselves and their PCs are
>>>>> from the bootloader blob, so I'm pretty confident that it's starting the
>>>>> bootloader and executing some of those instructions. One thing that
>>>>> looks very odd, now that I think about it, is that the KVM messages
>>>>> about entering and executing instructions (like those below) seem to say
>>>>> that cpu0 has executed thousands of instructions, but the exits I see
>>>>> seem to correspond to the first maybe 50 instructions it should be
>>>>> seeing in the bootloader blob. Are those values bogus for some reason?
>>>>> Is there some existing debug output which would let me see where KVM
>>>>> thinks it is periodically, to see if it's in the kernel or if it went
>>>>> bananas and is executing random memory somewhere? Or if it just got
>>>>> stuck waiting for some event that's not going to show up?
>>>>>
>>>>> Are there any important CLs which haven't made their way into upstream
>>>>> somehow?
>>>>>
>>>>> Gabe
>>>>>
>>>>> On Wed, Mar 14, 2018 at 4:28 AM, Andreas Sandberg <
>>>>> andreas.sandb...@arm.com> wrote:
>>>>>
>>>>>> Have you tried using the fs_bigLITTLE script in configs/example/arm?
>>>>>> That's the script I have been using for testing.
>>>>>>
>>>>>> I just tested the script with 8 little CPUs and 0 big CPUs, and it
>>>>>> seems to work. Timing is a bit temperamental though, so you might need
>>>>>> to override the simulation quantum. The default is 1ms; you might need
>>>>>> to decrease it to something slightly smaller (I'm currently using
>>>>>> 0.5ms). Another caveat is that there seem to be some issues related to
>>>>>> dtb auto-generation that affect KVM guests. We are currently testing a
>>>>>> solution for this issue.
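>>>>>>
>>>>>> If your own script doesn't already plumb the quantum through, it is
>>>>>> just set on the Root object before instantiation. A minimal sketch
>>>>>> using m5's standard conversion helpers (the wrapper function here is
>>>>>> only illustrative):
>>>>>>
>>>>>>     import m5
>>>>>>     from m5.util.convert import anyToLatency
>>>>>>
>>>>>>     def set_sim_quantum(root, quantum_str):
>>>>>>         # Convert e.g. '0.5ms' to seconds, then to simulator ticks.
>>>>>>         root.sim_quantum = int(m5.ticks.fromSeconds(
>>>>>>             anyToLatency(quantum_str)))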
>>>>>>
>>>>>> Cheers,
>>>>>> Andreas
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/03/2018 22:26, Gabe Black wrote:
>>>>>>
>>>>>>> I'm trying to run in FS mode, to boot android/linux.
>>>>>>>
>>>>>>> Gabe
>>>>>>>
>>>>>>> On Mon, Mar 12, 2018 at 3:26 PM, Dutu, Alexandru <
>>>>>>> alexandru.d...@amd.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Gabe,
>>>>>>>>
>>>>>>>> Are you running SE or FS mode?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Alex
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of
>>>>>>>> Gabe Black
>>>>>>>> Sent: Friday, March 9, 2018 5:46 PM
>>>>>>>> To: gem5 Developer List <gem5-dev@gem5.org>
>>>>>>>> Subject: [gem5-dev] Multicore ARM v8 KVM based simulation
>>>>>>>>
>>>>>>>> Hi folks. I have a config script set up where I can run a KVM based
>>>>>>>> ARM v8
>>>>>>>> simulation just fine when I have a single CPU in it, but when I try
>>>>>>>> running
>>>>>>>> with more than one CPU, it just seems to get lost and not do
>>>>>>>> anything. Is
>>>>>>>> this a configuration that's supported? If so, are there any caveats
>>>>>>>> to how
>>>>>>>> it's set up? I may be missing something simple, but it's not
>>>>>>>> apparent to me
>>>>>>>> at the moment.
>>>>>>>>
>>>>>>>> Gabe
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
