Ok, that (multiple event queues) made things way better. There are still
some glitches to figure out, but at least it makes good forward progress at
a reasonable speed. Thanks!

Gabe

On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black <[email protected]> wrote:

> This is on a Chromebook based on the RK3399 with only ~4GB of RAM, which
> is not ideal, although we have a bigger machine in the works for the
> future. I agree with your reasoning and don't think option 1 is a problem.
> We're using static DTBs, so I don't think that's an issue either. In my
> script, I'm not doing anything smart with the event queues, so that's
> likely at least part of the problem. When I tried using fs_bigLITTLE.py I
> ran into what looked like a similar issue, so that might not be the whole
> story, but it's definitely something I should fix up. I'll let you know
> how that goes!
>
> Gabe
>
> On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg <
> [email protected]> wrote:
>
>> Hmm, OK, this is very strange.
>>
>> What type of hardware are you running on? Is it an A57-based chip or
>> something else? Also, what's your simulation quantum? I have been able to
>> run with a 0.5ms quantum (5e8 ticks at gem5's default 1 ps per tick).
>> I think the following trace of two CPUs running in KVM should be roughly
>> equivalent to the trace you shared earlier. It was generated on a
>> commercially available 8xA57 (16GiB RAM) using the following command (gem5
>> rev 9dc44b417):
>>
>> gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun \
>>     configs/example/arm/fs_bigLITTLE.py \
>>     --sim-quantum '0.5ms' \
>>     --cpu-type kvm --big-cpus 0 --little-cpus 2 \
>>     --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb \
>>     --kernel vmlinux.aarch64.4.4-d318f95d0c
>>
>> Note that the tick counts are a bit weird since we have three different
>> event queues at play (one for the devices and one per CPU).
>>
>>       0: system.littleCluster.cpus0: KVM: Executing for 500000000 ticks
>>       0: system.littleCluster.cpus1: KVM: Executing for 500000000 ticks
>>       0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 176363 cycles (88181504 ticks, sim cycles: 176363).
>> 88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>> 88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090024, len: 4)
>> 88332000: system.littleCluster.cpus0: Entering KVM...
>> 88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
>> 88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 16854 cycles (8427000 ticks, sim cycles: 16854).
>> 96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>> 96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090030, len: 4)
>>       0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 666400 cycles (333200000 ticks, sim cycles: 666400).
>> 333200000: system.littleCluster.cpus1: Entering KVM...
>> 333200000: system.littleCluster.cpus1: KVM: Executing for 166800000 ticks
>> 96909000: system.littleCluster.cpus0: Entering KVM...
>> 96909000: system.littleCluster.cpus0: KVM: Executing for 403091000 ticks
>> 96909000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 15257 cycles (7628500 ticks, sim cycles: 15257).
>> 104538000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>> 104538000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c0100a0, len: 4)
>> 333200000: system.littleCluster.cpus1: KVM: Executed 47544 instructions in 200820 cycles (100410000 ticks, sim cycles: 200820).
>> 433610000: system.littleCluster.cpus1: Entering KVM...
>> 433610000: system.littleCluster.cpus1: KVM: Executing for 66390000 ticks
>> 104688000: system.littleCluster.cpus0: Entering KVM...
>> 104688000: system.littleCluster.cpus0: KVM: Executing for 395312000 ticks
>> 104688000: system.littleCluster.cpus0: KVM: Executed 4382 instructions in 14942 cycles (7471000 ticks, sim cycles: 14942).
>>
>> Comparing this trace to yours, I'd say that the frequent KVM exits in
>> yours look a bit suspicious. I would expect the secondary CPUs to make
>> very little progress while the main CPU initializes the system and starts
>> the early boot code.
>>
>> There are a couple of possibilities that might be causing issues:
>>
>> 1) There is some CPU ID weirdness that confuses the boot code and puts
>> both CPUs in the holding pen. This seems unlikely since there are some
>> writes to the UART.
>>
>> 2) Some device is incorrectly mapped to the CPU event queues and causes
>> frequent KVM exits. Have a look at _build_kvm in fs_bigLITTLE.py; it
>> doesn't use configs/common, so no need to tear your eyes out. ;) Do you
>> map event queues in the same way? It maps all simulated devices to one
>> event queue and the CPUs to private event queues. It's important to remap
>> CPU child devices to the device queue instead of the CPU queue. Failing to
>> do this will cause chaos, madness, and quite possibly result in Armageddon.
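>>
>> Roughly, the mapping in _build_kvm looks like this (a sketch; `cpus` is
>> assumed to be the list of KVM CPU objects in your script):
>>
>> if len(cpus) > 1:
>>     device_eq = 0     # all simulated devices share event queue 0
>>     first_cpu_eq = 1  # each CPU gets a private queue: 1, 2, ...
>>     for idx, cpu in enumerate(cpus):
>>         # Child objects (caches, the interrupt controller, ...) inherit
>>         # the parent's event queue by default, so remap them to the
>>         # device queue before giving the CPU itself a private queue.
>>         for obj in cpu.descendants():
>>             obj.eventq_index = device_eq
>>         cpu.eventq_index = first_cpu_eq + idx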
>>
>> 3) You're using DTB autogeneration. This doesn't work for KVM guests due
>> to issues with the timer interrupt specification. We have a patch for the
>> timer that we are testing internally. Sorry. :(
>>
>> Regards,
>> Andreas
>> On 16/03/2018 23:20, Gabe Black wrote:
>>
>> Ok, diving into this a little deeper, it looks like execution is
>> progressing, just very slowly for some reason. I added a call to "dump()"
>> before each ioctl invocation which enters the VM and looked at the PC to
>> get an idea of what it was up to. I made sure to put that before the
>> timers to avoid taking up VM time with printing debug stuff. In any case,
>> I see that neither CPU gets off of PC 0 for about 2ms of simulated time
>> (an effective rate of ~500Hz), and that's EXTREMELY slow for a CPU which
>> is supposed to be running in the ballpark of 2GHz. It's not clear to me
>> why it's making such slow progress, but that would explain why I'm getting
>> very little out on the simulated console. It's just taking forever to make
>> it that far.
>>
>> Any idea why it's going so slow, or how to debug further?
>>
>> Gabe
>>
>> On Wed, Mar 14, 2018 at 7:42 PM, Gabe Black <[email protected]> wrote:
>>
>>> Some output which I think is suspicious:
>>>
>>> 55462000: system.cpus0: Entering KVM...
>>> 55462000: system.cpus0: KVM: Executing for 1506000 ticks
>>> 55462000: system.cpus0: KVM: Executed 5159 instructions in 13646 cycles (6823000 ticks, sim cycles: 13646).
>>> 56968000: system.cpus1: Entering KVM...
>>> 56968000: system.cpus1: KVM: Executing for 5317000 ticks
>>> 56968000: system.cpus1: KVM: Executed 7229 instructions in 14379 cycles (7189500 ticks, sim cycles: 14379).
>>> 62285000: system.cpus0: Entering KVM...
>>> 62285000: system.cpus0: KVM: Executing for 1872500 ticks
>>> 62285000: system.cpus0: KVM: Executed 5159 instructions in 13496 cycles (6748000 ticks, sim cycles: 13496).
>>> 64157500: system.cpus1: Entering KVM...
>>> 64157500: system.cpus1: KVM: Executing for 4875500 ticks
>>> 64157500: system.cpus1: KVM: Executed 6950 instructions in 13863 cycles (6931500 ticks, sim cycles: 13863).
>>> 69033000: system.cpus0: Entering KVM...
>>> 69033000: system.cpus0: KVM: Executing for 2056000 ticks
>>> 69033000: system.cpus0: KVM: Executed 5159 instructions in 13454 cycles (6727000 ticks, sim cycles: 13454).
>>> 71089000: system.cpus1: Entering KVM...
>>> 71089000: system.cpus1: KVM: Executing for 4671000 ticks
>>> 71089000: system.cpus1: KVM: Executed 6950 instructions in 13861 cycles (6930500 ticks, sim cycles: 13861).
>>> 75760000: system.cpus0: Entering KVM...
>>> 75760000: system.cpus0: KVM: Executing for 2259500 ticks
>>> 75760000: system.cpus0: KVM: Executed 5159 instructions in 13688 cycles (6844000 ticks, sim cycles: 13688).
>>>
>>> [...]
>>>
>>> 126512000: system.cpus0: handleKvmExit (exit_reason: 6)
>>> 126512000: system.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090024, len: 4)
>>> 126512000: system.cpus0: In updateThreadContext():
>>>
>>> [...]
>>>
>>> 126512000: system.cpus0:   PC := 0xd8 (t: 0, a64: 1)
>>>
>>> On Wed, Mar 14, 2018 at 7:37 PM, Gabe Black <[email protected]>
>>> wrote:
>>>
>>>> I tried it just now, and I still don't see anything on the console. I
>>>> switched back to using my own script since it's a bit simpler (it doesn't
>>>> use all the configs/common stuff), and started looking at the KVM debug
>>>> output. I see that both cpus claim to execute instructions, although cpu1
>>>> didn't take an exit in the output I was looking at. cpu0 took four exits:
>>>> two which touched some UART registers, and two which touched RealView
>>>> registers (the V2M_SYS_CFGDATA and V2M_SYS_CFGCTRL registers, judging by
>>>> the comments in the bootloader assembly file).
>>>>
>>>> After that they claim to be doing stuff, although I see no further
>>>> console output or KVM exits. The accesses themselves and their PCs are from
>>>> the bootloader blob, and so I'm pretty confident that it's starting that
>>>> and executing some of those instructions. One thing that looks very odd,
>>>> now that I think about it, is that the KVM messages about entering and
>>>> executing instructions (like those below) seem to say that cpu0 has
>>>> executed thousands of instructions, but the exits I see seem to correspond
>>>> to the first maybe 50 instructions it should be seeing in the bootloader
>>>> blob. Are those values bogus for some reason? Is there some existing debug
>>>> output which would let me see where KVM thinks it is periodically to see if
>>>> it's in the kernel or if it went bananas and is executing random memory
>>>> somewhere? Or if it just got stuck waiting for some event that's not going
>>>> to show up?
>>>>
>>>> Are there any important CLs which haven't made their way upstream
>>>> somehow?
>>>>
>>>> Gabe
>>>>
>>>> On Wed, Mar 14, 2018 at 4:28 AM, Andreas Sandberg <
>>>> [email protected]> wrote:
>>>>
>>>>> Have you tried using the fs_bigLITTLE script in configs/example/arm?
>>>>> That's the script I have been using for testing.
>>>>>
>>>>> I just tested the script with 8 little CPUs and 0 big CPUs and it seems
>>>>> to work. Timing is a bit temperamental though, so you might need to
>>>>> override the simulation quantum. The default is 1ms; you might need to
>>>>> decrease it to something slightly smaller (I'm currently using 0.5ms).
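>>>>>
>>>>> With fs_bigLITTLE.py that's just --sim-quantum '0.5ms'. In a custom
>>>>> script, a minimal sketch of the same thing (assuming your Root object
>>>>> is named root, and set before m5.instantiate()):
>>>>>
>>>>> import m5
>>>>> from m5.util import convert
>>>>>
>>>>> # 0.5 ms converted to ticks (gem5 ticks default to 1 ps each)
>>>>> root.sim_quantum = m5.ticks.fromSeconds(convert.anyToLatency('0.5ms'))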
>>>>> Another caveat is that there seem to be some issues related to dtb
>>>>> auto-generation that affect KVM guests. We are currently testing a
>>>>> solution for this issue.
>>>>>
>>>>> Cheers,
>>>>> Andreas
>>>>>
>>>>>
>>>>>
>>>>> On 12/03/2018 22:26, Gabe Black wrote:
>>>>>
>>>>>> I'm trying to run in FS mode, to boot Android/Linux.
>>>>>>
>>>>>> Gabe
>>>>>>
>>>>>> On Mon, Mar 12, 2018 at 3:26 PM, Dutu, Alexandru <
>>>>>> [email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Hi Gabe,
>>>>>>>
>>>>>>> Are you running SE or FS mode?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Alex
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: gem5-dev [mailto:[email protected]] On Behalf Of Gabe
>>>>>>> Black
>>>>>>> Sent: Friday, March 9, 2018 5:46 PM
>>>>>>> To: gem5 Developer List <[email protected]>
>>>>>>> Subject: [gem5-dev] Multicore ARM v8 KVM based simulation
>>>>>>>
>>>>>>> Hi folks. I have a config script set up where I can run a KVM-based
>>>>>>> ARM v8 simulation just fine when I have a single CPU in it, but when
>>>>>>> I try running with more than one CPU, it just seems to get lost and
>>>>>>> not do anything. Is this a configuration that's supported? If so, are
>>>>>>> there any caveats to how it's set up? I may be missing something
>>>>>>> simple, but it's not apparent to me at the moment.
>>>>>>>
>>>>>>> Gabe
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
>