Ok, thanks. We're deciding internally what approach to use to tackle this.

Gabe
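P.S. For reference, the PciVirtIO setup being discussed boils down to something like the following in a config script. This is an untested sketch modelled on fs_bigLITTLE.py: the disk image path is a placeholder, and attach_pci() is the helper used by the example configs, so adjust that call to however your own script hooks up PCI devices.

    from m5.objects import PciVirtIO, VirtIOBlock, CowDiskImage, RawDiskImage

    # Copy-on-write wrapper so the run doesn't modify the underlying image.
    image = CowDiskImage(child=RawDiskImage(read_only=True))
    image.child.image_file = '/path/to/disk.img'  # placeholder

    # VirtIO block device behind a PCI shim, replacing the IDE controller and
    # IdeDisk setup inherited from fs.py.
    system.pci_vio_block = PciVirtIO(vio=VirtIOBlock(image=image))
    system.attach_pci(system.pci_vio_block)  # assumed helper; adjust to your config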
On Wed, Mar 21, 2018 at 3:01 AM, Andreas Sandberg <[email protected]> wrote:

> Hi Gabe,
>
> There are issues with the IDE model that prevent it from working with in-kernel GIC emulation. I believe the model doesn't clear interrupts correctly, which confuses the host kernel. I tried to debug this at some point, but wasn't able to make much immediate progress and decided it wasn't worth the effort. The VirtIO block device doesn't suffer from this problem.
>
> Using the VirtIO device by default seems like a good idea to me. It doesn't simulate any timing, but that might not be a huge deal since the IDE device doesn't provide realistic timing anyway. It would be really awesome if we had a modern storage controller (e.g., NVMe or AHCI) and proper storage timing models.
>
> Cheers,
> Andreas
>
> On 20/03/2018 23:38, Gabe Black wrote:
>
> My next question is about disks. I see that the fs_bigLITTLE.py script uses PciVirtIO to set up its disks, where I'm using IDE, which I inherited from the fs.py scripts I used as reference. The problem I'm seeing is that the IDE controllers seem to be mangling commands and dropping interrupts, so this difference looks particularly suspicious. Is there a KVM-related reason you're using PciVirtIO? Is this something that *should* work with IDE but doesn't, or do I have to use PciVirtIO for things to work properly? I'm not familiar with PciVirtIO beyond briefly skimming the source for it in gem5. Is this something we should consider using globally as a replacement for IDE, even in simulations where we're trying to be really realistic?
>
> Thanks again for all the help.
>
> Gabe
>
> On Tue, Mar 20, 2018 at 3:14 PM, Gabe Black <[email protected]> wrote:
>
>> Ok, that (multiple event queues) made things way better. There are still some glitches to figure out, but at least it makes good forward progress at a reasonable speed. Thanks!
>>
>> Gabe
>>
>> On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black <[email protected]> wrote:
>>
>>> This is on a chromebook based on the RK3399 with only ~4GB of RAM, which is not ideal, although we have a bigger machine in the works for the future. I agree with your reasoning and don't think option 1 is a problem. We're using static DTBs so I don't think that's an issue either. In my script, I'm not doing anything smart with the event queues, so that's likely at least part of the problem. When I tried using fs_bigLITTLE.py I ran into what looked like a similar issue so that might not be the whole story, but it's definitely something I should fix up. I'll let you know how that goes!
>>>
>>> Gabe
>>>
>>> On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg <[email protected]> wrote:
>>>
>>>> Hmm, OK, this is very strange.
>>>>
>>>> What type of hardware are you running on? Is it an A57-based chip or something else? Also, what's your simulation quantum? I have been able to run with a 0.5ms quantum (5e8 ticks).
>>>>
>>>> I think the following trace of two CPUs running in KVM should be roughly equivalent to the trace you shared earlier.
>>>> It was generated on a commercially available 8xA57 (16GiB RAM) using the following command (gem5 rev 9dc44b417):
>>>>
>>>> gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun configs/example/arm/fs_bigLITTLE.py \
>>>>     --sim-quantum '0.5ms' \
>>>>     --cpu-type kvm --big-cpus 0 --little-cpus 2 \
>>>>     --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb --kernel vmlinux.aarch64.4.4-d318f95d0c
>>>>
>>>> Note that the tick counts are a bit weird since we have three different event queues at play (one for devices and one per CPU).
>>>>
>>>> 0: system.littleCluster.cpus0: KVM: Executing for 500000000 ticks
>>>> 0: system.littleCluster.cpus1: KVM: Executing for 500000000 ticks
>>>> 0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 176363 cycles (88181504 ticks, sim cycles: 176363).
>>>> 88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>>> 88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090024, len: 4)
>>>> 88332000: system.littleCluster.cpus0: Entering KVM...
>>>> 88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
>>>> 88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 16854 cycles (8427000 ticks, sim cycles: 16854).
>>>> 96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>>> 96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090030, len: 4)
>>>> 0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 666400 cycles (333200000 ticks, sim cycles: 666400).
>>>> 333200000: system.littleCluster.cpus1: Entering KVM...
>>>> 333200000: system.littleCluster.cpus1: KVM: Executing for 166800000 ticks
>>>> 96909000: system.littleCluster.cpus0: Entering KVM...
>>>> 96909000: system.littleCluster.cpus0: KVM: Executing for 403091000 ticks
>>>> 96909000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 15257 cycles (7628500 ticks, sim cycles: 15257).
>>>> 104538000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>>> 104538000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c0100a0, len: 4)
>>>> 333200000: system.littleCluster.cpus1: KVM: Executed 47544 instructions in 200820 cycles (100410000 ticks, sim cycles: 200820).
>>>> 433610000: system.littleCluster.cpus1: Entering KVM...
>>>> 433610000: system.littleCluster.cpus1: KVM: Executing for 66390000 ticks
>>>> 104688000: system.littleCluster.cpus0: Entering KVM...
>>>> 104688000: system.littleCluster.cpus0: KVM: Executing for 395312000 ticks
>>>> 104688000: system.littleCluster.cpus0: KVM: Executed 4382 instructions in 14942 cycles (7471000 ticks, sim cycles: 14942).
>>>>
>>>> Comparing this trace to yours, I'd say that the frequent KVM exits look a bit suspicious. I would expect secondary CPUs to make very little progress while the main CPU initializes the system and starts the early boot code.
>>>>
>>>> There are a couple of possibilities that might be causing issues:
>>>>
>>>> 1) There is some CPU ID weirdness that confuses the boot code and puts both CPUs in the holding pen. This seems unlikely since there are some writes to the UART.
>>>>
>>>> 2) Some device is incorrectly mapped to the CPU event queues and causes frequent KVM exits. Have a look at _build_kvm in fs_bigLITTLE.py; it doesn't use configs/common, so no need to tear your eyes out. ;) Do you map event queues in the same way?
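>>>>
>>>> From memory, the relevant part of _build_kvm looks roughly like this (an abridged sketch, so check the script itself for the exact code):
>>>>
>>>>     from m5.objects import KvmVM
>>>>
>>>>     def _build_kvm(system, cpus):
>>>>         system.kvm_vm = KvmVM()
>>>>
>>>>         # Give each KVM CPU its own event queue / host thread, but keep
>>>>         # all devices together on event queue 0.
>>>>         if len(cpus) > 1:
>>>>             device_eq = 0
>>>>             first_cpu_eq = 1
>>>>             for idx, cpu in enumerate(cpus):
>>>>                 # Child objects normally inherit the parent's event queue,
>>>>                 # so push the CPU's children back onto the device queue
>>>>                 # before giving the CPU its private queue.
>>>>                 for obj in cpu.descendants():
>>>>                     obj.eventq_index = device_eq
>>>>                 cpu.eventq_index = first_cpu_eq + idx
>>>>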
>>>> It's mapping all simulated devices to one event queue and the CPUs to private event queues. It's important to remap CPU child devices to the device queue instead of the CPU queue. Failing to do this will cause chaos, madness, and quite possibly result in Armageddon.
>>>>
>>>> 3) You're using DTB autogeneration. This doesn't work for KVM guests due to issues with the timer interrupt specification. We have a patch for the timer that we are testing internally. Sorry. :(
>>>>
>>>> Regards,
>>>> Andreas
>>>>
>>>> On 16/03/2018 23:20, Gabe Black wrote:
>>>>
>>>> Ok, diving into this a little deeper, it looks like execution is progressing but is making very slow progress for some reason. I added a call to "dump()" before each ioctl invocation which enters the VM and looked at the PC to get an idea of what it was up to. I made sure to put that before the timers to avoid taking up VM time with printing debug stuff. In any case, I see that neither CPU gets off of PC 0 for about 2ms simulated time (~500Hz), and that's EXTREMELY slow for a CPU which is supposed to be running in the ballpark of 2GHz. It's not clear to me why it's making such slow progress, but that would explain why I'm getting very little out on the simulated console. It's just taking forever to make it that far.
>>>>
>>>> Any idea why it's going so slow, or how to debug further?
>>>>
>>>> Gabe
>>>>
>>>> On Wed, Mar 14, 2018 at 7:42 PM, Gabe Black <[email protected]> wrote:
>>>>
>>>>> Some output which I think is suspicious:
>>>>>
>>>>> 55462000: system.cpus0: Entering KVM...
>>>>> 55462000: system.cpus0: KVM: Executing for 1506000 ticks
>>>>> 55462000: system.cpus0: KVM: Executed 5159 instructions in 13646 cycles (6823000 ticks, sim cycles: 13646).
>>>>> 56968000: system.cpus1: Entering KVM...
>>>>> 56968000: system.cpus1: KVM: Executing for 5317000 ticks
>>>>> 56968000: system.cpus1: KVM: Executed 7229 instructions in 14379 cycles (7189500 ticks, sim cycles: 14379).
>>>>> 62285000: system.cpus0: Entering KVM...
>>>>> 62285000: system.cpus0: KVM: Executing for 1872500 ticks
>>>>> 62285000: system.cpus0: KVM: Executed 5159 instructions in 13496 cycles (6748000 ticks, sim cycles: 13496).
>>>>> 64157500: system.cpus1: Entering KVM...
>>>>> 64157500: system.cpus1: KVM: Executing for 4875500 ticks
>>>>> 64157500: system.cpus1: KVM: Executed 6950 instructions in 13863 cycles (6931500 ticks, sim cycles: 13863).
>>>>> 69033000: system.cpus0: Entering KVM...
>>>>> 69033000: system.cpus0: KVM: Executing for 2056000 ticks
>>>>> 69033000: system.cpus0: KVM: Executed 5159 instructions in 13454 cycles (6727000 ticks, sim cycles: 13454).
>>>>> 71089000: system.cpus1: Entering KVM...
>>>>> 71089000: system.cpus1: KVM: Executing for 4671000 ticks
>>>>> 71089000: system.cpus1: KVM: Executed 6950 instructions in 13861 cycles (6930500 ticks, sim cycles: 13861).
>>>>> 75760000: system.cpus0: Entering KVM...
>>>>> 75760000: system.cpus0: KVM: Executing for 2259500 ticks
>>>>> 75760000: system.cpus0: KVM: Executed 5159 instructions in 13688 cycles (6844000 ticks, sim cycles: 13688).
>>>>>
>>>>> [...]
>>>>>
>>>>> 126512000: system.cpus0: handleKvmExit (exit_reason: 6)
>>>>> 126512000: system.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090024, len: 4)
>>>>> 126512000: system.cpus0: In updateThreadContext():
>>>>>
>>>>> [...]
>>>>> 126512000: system.cpus0: PC := 0xd8 (t: 0, a64: 1)
>>>>>
>>>>> On Wed, Mar 14, 2018 at 7:37 PM, Gabe Black <[email protected]> wrote:
>>>>>
>>>>>> I tried it just now, and I still don't see anything on the console. I switched back to using my own script since it's a bit simpler (it doesn't use all the configs/common stuff), and started looking at the KVM debug output. I see that both cpus claim to execute instructions, although cpu1 didn't take an exit in the output I was looking at. cpu0 took four exits, two which touched some UART registers, and two which touched RealView registers, the V2M_SYS_CFGDATA and V2M_SYS_CFGCTRL registers judging by the comments in the bootloader assembly file.
>>>>>>
>>>>>> After that they claim to be doing stuff, although I see no further console output or KVM exits. The accesses themselves and their PCs are from the bootloader blob, and so I'm pretty confident that it's starting that and executing some of those instructions. One thing that looks very odd now that I think about it is that the KVM messages about entering and executing instructions (like those below) seem to say that cpu0 has executed thousands of instructions, but the exits I see seem to correspond to the first maybe 50 instructions it should be seeing in the bootloader blob. Are those values bogus for some reason? Is there some existing debug output which would let me see where KVM thinks it is periodically, to see if it's in the kernel or if it went bananas and is executing random memory somewhere? Or if it just got stuck waiting for some event that's not going to show up?
>>>>>>
>>>>>> Are there any important CLs which haven't made their way into upstream somehow?
>>>>>>
>>>>>> Gabe
>>>>>>
>>>>>> On Wed, Mar 14, 2018 at 4:28 AM, Andreas Sandberg <[email protected]> wrote:
>>>>>>
>>>>>>> Have you tried using the fs_bigLITTLE script in configs/example/arm? That's the script I have been using for testing.
>>>>>>>
>>>>>>> I just tested the script with 8 little CPUs and 0 big CPUs and it seems to work. Timing is a bit temperamental though, so you might need to override the simulation quantum. The default is 1ms; you might need to decrease it to something slightly smaller (I'm currently using 0.5ms). Another caveat is that there seem to be some issues related to DTB auto-generation that affect KVM guests. We are currently testing a solution for this issue.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Andreas
>>>>>>>
>>>>>>> On 12/03/2018 22:26, Gabe Black wrote:
>>>>>>>
>>>>>>>> I'm trying to run in FS mode, to boot android/linux.
>>>>>>>>
>>>>>>>> Gabe
>>>>>>>>
>>>>>>>> On Mon, Mar 12, 2018 at 3:26 PM, Dutu, Alexandru <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Gabe,
>>>>>>>>>
>>>>>>>>> Are you running SE or FS mode?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Alex
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: gem5-dev [mailto:[email protected]] On Behalf Of Gabe Black
>>>>>>>>> Sent: Friday, March 9, 2018 5:46 PM
>>>>>>>>> To: gem5 Developer List <[email protected]>
>>>>>>>>> Subject: [gem5-dev] Multicore ARM v8 KVM based simulation
>>>>>>>>>
>>>>>>>>> Hi folks.
>>>>>>>>> I have a config script set up where I can run a KVM-based ARM v8 simulation just fine when I have a single CPU in it, but when I try running with more than one CPU, it just seems to get lost and not do anything. Is this a configuration that's supported? If so, are there any caveats to how it's set up? I may be missing something simple, but it's not apparent to me at the moment.
>>>>>>>>>
>>>>>>>>> Gabe
>
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev
