Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-04-05 Thread Gabe Black
Yep, I'll do that. Unfortunately (or fortunately?) this problem
mysteriously went away just like it had mysteriously started happening, so
I can't really poke at it at the moment. If it happens again I'll
definitely grab some traces for you.

Gabe

On Wed, Apr 4, 2018 at 3:39 AM, Andreas Sandberg 
wrote:

> That's very strange. It seems like the KVM GIC interface is trying to read
> register 0x9 in the GIC's CPU interface. The errno indicates that no such
> register exists, which is expected (registers are usually 32 bit aligned).
> I'm not sure why this happens. The write is /probably/ coming from the
> simulated system, which would indicate that something went horribly wrong
> in the guest.
>
> If this happens again, could you re-run with the GIC debug flag and
> possibly a KVM MMIO trace?
>
> Cheers,
> Andreas
>
> On 30/03/2018 11:52, Gabe Black wrote:
>
> Now out of the blue I'm hitting errors having to do with setting GIC
> "attributes" of some sort with code that was working a few hours earlier.
> Any idea what it's upset about?
>
>
>
> gem5 Simulator System.  http://gem5.org
> gem5 is copyrighted software; use the --copyright option for details.
>
> gem5 compiled Mar 30 2018 03:08:57
> gem5 started Mar 30 2018 03:13:05
> gem5 executing on localhost, pid 9033
> command line: build/ARM/gem5.debug gem5/google/configs/kvm.py
>
> INFO:root:Disk 0: /home/gabeblack/dist/m5/system/disks/disk.img
> INFO:root:Add GPU: NoMali GPU model...
> INFO:root:Kernel: /home/gabeblack/dist/m5/system/binaries/vmlinux
> INFO:root:Device tree: /home/gabeblack/dist/m5/system/binaries/armv8_
> 1440x2560_google_v1_2cpu.dtb
> Global frequency set at 1 ticks per second
> warn: system.pci_ide adopting orphan SimObject param 'disks'
> info: kernel located at: /home/gabeblack/dist/m5/system/binaries/vmlinux
> warn: Highest ARM exception-level set to AArch32 but bootloader is for
> AArch64. Assuming you wanted these to match.
> Listening for system connection on port 5900
> Listening for system connection on port 3456
> Listening for uart1 connection on port 3457
> 0: system.remote_gdb: listening for remote gdb on port 7000
> 0: system.remote_gdb: listening for remote gdb on port 7001
> warn: CoherentXBar system.membus has no snooping ports attached!
> info: Using bootloader at address 0x10
> info: Using kernel entry physical address at 0x8008
> info: Loading DTB file: 
> /home/gabeblack/dist/m5/system/binaries/armv8_1440x2560_google_v1_2cpu.dtb
> at address 0x8800
> info: KVM: Coalesced MMIO disabled by config.
> info: KVM: Coalesced MMIO disabled by config.
> warn: Existing EnergyCtrl, but no enabled DVFSHandler found.
> panic: Failed to set attribute (group: 2, attr: 9, errno: 6)
> Memory Usage: 3516676 KBytes
> Program aborted at tick 0
> --- BEGIN LIBC BACKTRACE ---
> build/ARM/gem5.debug(_Z15print_backtracev+0x2c)[0x1a3e750]
> build/ARM/gem5.debug(_Z12abortHandleri+0x7c)[0x1a47070]
> [0x7988061510]
> /lib/aarch64-linux-gnu/libc.so.6(gsignal+0x38)[0x798771b528]
> --- END LIBC BACKTRACE ---
> Aborted (core dumped)
>
>
> On Wed, Mar 28, 2018 at 5:14 PM, Gabe Black  wrote:
>
>> Ok, I think I figured it out, and it all has to do with the simulation
>> quantum. If the quantum is too big, the kernel might poke hardware and
>> expect to get an interrupt within a certain period of time. It could be
>> that the CPU gets to the end of its timeout before the simulated hardware
>> has had a chance to trigger an interrupt, even though the interrupt would
>> happen first if the event queues were held in tighter sync. If I decrease
>> the size of the quantum from 500ms (per your suggestion) to 1ms, then I see
>> the errors from the keyboard/mouse drivers and the ATA driver go away, at
>> least in the one CPU/multiple event queue configuration.
>>
>> I'm going to do some more testing to make sure there isn't some other
>> problem that pops up, and also to characterize the performance impact which
>> I'm hopeful won't be too bad.
>>
>> Also, I was thinking it would be nice if KVM CPUs could set up their
>> event queues in some more automatic, less error prone way. Before I knew
>> that they needed their own event queue (which I think is just institutional
>> knowledge that isn't documented/warned about/etc.?), I had no idea what was
>> going wrong when just dropping in some KVM CPUs in place of regular CPUs. I
>> don't have a fully fleshed out plan for how to do that, but it doesn't
>> *seem* like something that should be that hard to do.
>>
>> Gabe
>>
>> On Mon, Mar 26, 2018 at 7:06 PM, Gabe Black  wrote:
>>
>>> I looked into this a little further, and I see the same problem happen
>>> with one CPU but with the CPU and the devices in different event queues. I
>>> haven't figured out exactly where things go wrong, but it looks like a
>>> write DMA is set up but doesn't happen for some reason. I'm not sure if the
>>> DMA starts but then gets stuck, or if it 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-04-04 Thread Andreas Sandberg

The performance impact shouldn't be too bad. I did some scalability tests using 
LU from SPLASH 2 years ago. IIRC, I was using an 8-core Westmere-EX based 
system at the time. Native throughput for that benchmark was ~30GIPS @ 8 cores. 
When running in KVM, I got something like ~15GIPS with a 1ms quantum and 10GIPS 
with a 0.5ms quantum. Unfortunately, I don't have that data for any Arm-based 
system.

Turning on the HDLCD will probably reduce throughput quite a bit, but it should 
be running in a functional refresh mode (10Hz by default) when running in KVM. 
It's far from optimised, but should work. We had some KMI issues last time I 
looked at this. IIRC, the KMI model doesn't clear interrupts correctly, which 
confuses the interrupt model in the kernel.

Setting up event queues for KVM automatically would definitely be desirable. As 
you, painfully, noticed, this is currently the responsibility of the config 
script. The Arm example scripts do it already and should work out of the box. I 
suspect it might be tricky to get this right from inside the simulator without 
some re-architecting of the simulator core. What we would have to do is to add 
an API to allocate semi-private EQs from inside C++. Since Python provides an 
EQ number that gets allocated in C++ at instantiation time, we would have to 
defer EQ allocation until init() is called or create a better mechanism to 
allocate EQs from Python instead of having a plain EQ index. We still want a 
way to force the old behaviour when simulating single-core systems since that 
makes debugging a lot easier.

Cheers,
Andreas

On 29/03/2018 01:14, Gabe Black wrote:
Ok, I think I figured it out, and it all has to do with the simulation quantum. 
If the quantum is too big, the kernel might poke hardware and expect to get an 
interrupt within a certain period of time. It could be that the CPU gets to the 
end of its timeout before the simulated hardware has had a chance to trigger an 
interrupt, even though the interrupt would happen first if the event queues 
were held in tighter sync. If I decrease the size of the quantum from 500ms 
(per your suggestion) to 1ms, then I see the errors from the keyboard/mouse 
drivers and the ATA driver go away, at least in the one CPU/multiple event 
queue configuration.

I'm going to do some more testing to make sure there isn't some other problem 
that pops up, and also to characterize the performance impact which I'm hopeful 
won't be too bad.

Also, I was thinking it would be nice if KVM CPUs could set up their event 
queues in some more automatic, less error prone way. Before I knew that they 
needed their own event queue (which I think is just institutional knowledge 
that isn't documented/warned about/etc.?), I had no idea what was going wrong 
when just dropping in some KVM CPUs in place of regular CPUs. I don't have a 
fully fleshed out plan for how to do that, but it doesn't *seem* like something 
that should be that hard to do.

Gabe

On Mon, Mar 26, 2018 at 7:06 PM, Gabe Black 
> wrote:
I looked into this a little further, and I see the same problem happen with one 
CPU but with the CPU and the devices in different event queues. I haven't 
figured out exactly where things go wrong, but it looks like a write DMA is set 
up but doesn't happen for some reason. I'm not sure if the DMA starts but then 
gets stuck, or if it never starts at all. It could also be that the DMA 
happens, but the completion event (which is what doesn't seem to happen) is 
mishandled because of the additional event queue.

I turned on the DMA debug flag, but that produced so much debug output that my 
tools are crashing. I'll have to see what I can do to narrow things down a bit.

Gabe

On Thu, Mar 22, 2018 at 11:28 AM, Gabe Black 
> wrote:
Ok, thanks. We're deciding internally what approach to use to tackle this.

Gabe

On Wed, Mar 21, 2018 at 3:01 AM, Andreas Sandberg 
> wrote:

Hi Gabe,

There are issues with the IDE model that prevent it from working with in-kernel 
GIC emulation. I believe the model doesn't clear interrupts correctly, which 
confuses the host kernel. I tried to debug this at some point, but wasn't able 
to make much immediate progress and decided it wasn't worth the effort. The 
VirtIO block device doesn't suffer from this problem.

Using the VirtIO device by default seems like a good idea to me. It doesn't 
simulate any timing, but that might not be a huge deal since the IDE device 
doesn't provide realistic timing anyway. It would be really awesome if we had a 
modern storage controller (e.g., NVMe or AHCI) and proper storage timing models.

Cheers,
Andreas

On 20/03/2018 23:38, Gabe Black wrote:
My next question is about disks. I see that the fs_bigLITTLE.py script uses 
PciVirtIO to set up its disks, where I'm using IDE which I inherited from the 
fs.py 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-04-04 Thread Andreas Sandberg

That's very strange. It seems like the KVM GIC interface is trying to read 
register 0x9 in the GIC's CPU interface. The errno indicates that no such 
register exists, which is expected (registers are usually 32 bit aligned). I'm not 
sure why this happens. The write is /probably/ coming from the simulated system, 
which would indicate that something went horribly wrong in the guest.

If this happens again, could you re-run with the GIC debug flag and possibly a 
KVM MMIO trace?

Cheers,
Andreas

On 30/03/2018 11:52, Gabe Black wrote:
Now out of the blue I'm hitting errors having to do with setting GIC 
"attributes" of some sort with code that was working a few hours earlier. Any 
idea what it's upset about?



gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Mar 30 2018 03:08:57
gem5 started Mar 30 2018 03:13:05
gem5 executing on localhost, pid 9033
command line: build/ARM/gem5.debug gem5/google/configs/kvm.py

INFO:root:Disk 0: /home/gabeblack/dist/m5/system/disks/disk.img
INFO:root:Add GPU: NoMali GPU model...
INFO:root:Kernel: /home/gabeblack/dist/m5/system/binaries/vmlinux
INFO:root:Device tree: 
/home/gabeblack/dist/m5/system/binaries/armv8_1440x2560_google_v1_2cpu.dtb
Global frequency set at 1 ticks per second
warn: system.pci_ide adopting orphan SimObject param 'disks'
info: kernel located at: /home/gabeblack/dist/m5/system/binaries/vmlinux
warn: Highest ARM exception-level set to AArch32 but bootloader is for AArch64. 
Assuming you wanted these to match.
Listening for system connection on port 5900
Listening for system connection on port 3456
Listening for uart1 connection on port 3457
0: system.remote_gdb: listening for remote gdb on port 7000
0: system.remote_gdb: listening for remote gdb on port 7001
warn: CoherentXBar system.membus has no snooping ports attached!
info: Using bootloader at address 0x10
info: Using kernel entry physical address at 0x8008
info: Loading DTB file: 
/home/gabeblack/dist/m5/system/binaries/armv8_1440x2560_google_v1_2cpu.dtb at 
address 0x8800
info: KVM: Coalesced MMIO disabled by config.
info: KVM: Coalesced MMIO disabled by config.
warn: Existing EnergyCtrl, but no enabled DVFSHandler found.
panic: Failed to set attribute (group: 2, attr: 9, errno: 6)
Memory Usage: 3516676 KBytes
Program aborted at tick 0
--- BEGIN LIBC BACKTRACE ---
build/ARM/gem5.debug(_Z15print_backtracev+0x2c)[0x1a3e750]
build/ARM/gem5.debug(_Z12abortHandleri+0x7c)[0x1a47070]
[0x7988061510]
/lib/aarch64-linux-gnu/libc.so.6(gsignal+0x38)[0x798771b528]
--- END LIBC BACKTRACE ---
Aborted (core dumped)


On Wed, Mar 28, 2018 at 5:14 PM, Gabe Black 
> wrote:
Ok, I think I figured it out, and it all has to do with the simulation quantum. 
If the quantum is too big, the kernel might poke hardware and expect to get an 
interrupt within a certain period of time. It could be that the CPU gets to the 
end of its timeout before the simulated hardware has had a chance to trigger an 
interrupt, even though the interrupt would happen first if the event queues 
were held in tighter sync. If I decrease the size of the quantum from 500ms 
(per your suggestion) to 1ms, then I see the errors from the keyboard/mouse 
drivers and the ATA driver go away, at least in the one CPU/multiple event 
queue configuration.

I'm going to do some more testing to make sure there isn't some other problem 
that pops up, and also to characterize the performance impact which I'm hopeful 
won't be too bad.

Also, I was thinking it would be nice if KVM CPUs could set up their event 
queues in some more automatic, less error prone way. Before I knew that they 
needed their own event queue (which I think is just institutional knowledge 
that isn't documented/warned about/etc.?), I had no idea what was going wrong 
when just dropping in some KVM CPUs in place of regular CPUs. I don't have a 
fully fleshed out plan for how to do that, but it doesn't *seem* like something 
that should be that hard to do.

Gabe

On Mon, Mar 26, 2018 at 7:06 PM, Gabe Black 
> wrote:
I looked into this a little further, and I see the same problem happen with one 
CPU but with the CPU and the devices in different event queues. I haven't 
figured out exactly where things go wrong, but it looks like a write DMA is set 
up but doesn't happen for some reason. I'm not sure if the DMA starts but then 
gets stuck, or if it never starts at all. It could also be that the DMA 
happens, but the completion event (which is what doesn't seem to happen) is 
mishandled because of the additional event queue.

I turned on the DMA debug flag, but that produced so much debug output that my 
tools are crashing. I'll have to see what I can do to narrow things down a bit.

Gabe

On Thu, Mar 22, 2018 at 11:28 AM, Gabe Black 
> wrote:
Ok, thanks. 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-30 Thread Gabe Black
Now out of the blue I'm hitting errors having to do with setting GIC
"attributes" of some sort with code that was working a few hours earlier.
Any idea what it's upset about?



gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Mar 30 2018 03:08:57
gem5 started Mar 30 2018 03:13:05
gem5 executing on localhost, pid 9033
command line: build/ARM/gem5.debug gem5/google/configs/kvm.py

INFO:root:Disk 0: /home/gabeblack/dist/m5/system/disks/disk.img
INFO:root:Add GPU: NoMali GPU model...
INFO:root:Kernel: /home/gabeblack/dist/m5/system/binaries/vmlinux
INFO:root:Device tree:
/home/gabeblack/dist/m5/system/binaries/armv8_1440x2560_google_v1_2cpu.dtb
Global frequency set at 1 ticks per second
warn: system.pci_ide adopting orphan SimObject param 'disks'
info: kernel located at: /home/gabeblack/dist/m5/system/binaries/vmlinux
warn: Highest ARM exception-level set to AArch32 but bootloader is for
AArch64. Assuming you wanted these to match.
Listening for system connection on port 5900
Listening for system connection on port 3456
Listening for uart1 connection on port 3457
0: system.remote_gdb: listening for remote gdb on port 7000
0: system.remote_gdb: listening for remote gdb on port 7001
warn: CoherentXBar system.membus has no snooping ports attached!
info: Using bootloader at address 0x10
info: Using kernel entry physical address at 0x8008
info: Loading DTB file:
/home/gabeblack/dist/m5/system/binaries/armv8_1440x2560_google_v1_2cpu.dtb
at address 0x8800
info: KVM: Coalesced MMIO disabled by config.
info: KVM: Coalesced MMIO disabled by config.
warn: Existing EnergyCtrl, but no enabled DVFSHandler found.
panic: Failed to set attribute (group: 2, attr: 9, errno: 6)
Memory Usage: 3516676 KBytes
Program aborted at tick 0
--- BEGIN LIBC BACKTRACE ---
build/ARM/gem5.debug(_Z15print_backtracev+0x2c)[0x1a3e750]
build/ARM/gem5.debug(_Z12abortHandleri+0x7c)[0x1a47070]
[0x7988061510]
/lib/aarch64-linux-gnu/libc.so.6(gsignal+0x38)[0x798771b528]
--- END LIBC BACKTRACE ---
Aborted (core dumped)


On Wed, Mar 28, 2018 at 5:14 PM, Gabe Black  wrote:

> Ok, I think I figured it out, and it all has to do with the simulation
> quantum. If the quantum is too big, the kernel might poke hardware and
> expect to get an interrupt within a certain period of time. It could be
> that the CPU gets to the end of its timeout before the simulated hardware
> has had a chance to trigger an interrupt, even though the interrupt would
> happen first if the event queues were held in tighter sync. If I decrease
> the size of the quantum from 500ms (per your suggestion) to 1ms, then I see
> the errors from the keyboard/mouse drivers and the ATA driver go away, at
> least in the one CPU/multiple event queue configuration.
>
> I'm going to do some more testing to make sure there isn't some other
> problem that pops up, and also to characterize the performance impact which
> I'm hopeful won't be too bad.
>
> Also, I was thinking it would be nice if KVM CPUs could set up their event
> queues in some more automatic, less error prone way. Before I knew that
> they needed their own event queue (which I think is just institutional
> knowledge that isn't documented/warned about/etc.?), I had no idea what was
> going wrong when just dropping in some KVM CPUs in place of regular CPUs. I
> don't have a fully fleshed out plan for how to do that, but it doesn't
> *seem* like something that should be that hard to do.
>
> Gabe
>
> On Mon, Mar 26, 2018 at 7:06 PM, Gabe Black  wrote:
>
>> I looked into this a little further, and I see the same problem happen
>> with one CPU but with the CPU and the devices in different event queues. I
>> haven't figured out exactly where things go wrong, but it looks like a
>> write DMA is set up but doesn't happen for some reason. I'm not sure if the
>> DMA starts but then gets stuck, or if it never starts at all. It could also
>> be that the DMA happens, but the completion event (which is what doesn't
>> seem to happen) is mishandled because of the additional event queue.
>>
>> I turned on the DMA debug flag, but that produced so much debug output
>> that my tools are crashing. I'll have to see what I can do to narrow things
>> down a bit.
>>
>> Gabe
>>
>> On Thu, Mar 22, 2018 at 11:28 AM, Gabe Black 
>> wrote:
>>
>>> Ok, thanks. We're deciding internally what approach to use to tackle
>>> this.
>>>
>>> Gabe
>>>
>>> On Wed, Mar 21, 2018 at 3:01 AM, Andreas Sandberg <
>>> andreas.sandb...@arm.com> wrote:
>>>
 Hi Gabe,

 There are issues with the IDE model that prevent it from working with
 in-kernel GIC emulation. I believe the model doesn't clear interrupts
 correctly, which confuses the host kernel. I tried to debug this at some
 point, but wasn't able to make much immediate progress and decided it wasn't
 worth the effort. The VirtIO block 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-28 Thread Gabe Black
Ok, I think I figured it out, and it all has to do with the simulation
quantum. If the quantum is too big, the kernel might poke hardware and
expect to get an interrupt within a certain period of time. It could be
that the CPU gets to the end of its timeout before the simulated hardware
has had a chance to trigger an interrupt, even though the interrupt would
happen first if the event queues were held in tighter sync. If I decrease
the size of the quantum from 500ms (per your suggestion) to 1ms, then I see
the errors from the keyboard/mouse drivers and the ATA driver go away, at
least in the one CPU/multiple event queue configuration.
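
A rough sketch of the config-script side of that, assuming the usual Root object
and gem5's default 1 THz tick resolution ("system" here stands in for whatever
System object the script builds):

    from m5.objects import Root

    root = Root(full_system=True, system=system)
    # 1ms quantum (1e9 ticks at the default 1 THz tick rate) instead of the
    # 500ms value that caused the timeouts described above.
    root.sim_quantum = int(1e9)

The --sim-quantum option in fs_bigLITTLE.py, shown in the command lines elsewhere
in this thread, does the equivalent conversion from a latency string.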

I'm going to do some more testing to make sure there isn't some other
problem that pops up, and also to characterize the performance impact which
I'm hopeful won't be too bad.

Also, I was thinking it would be nice if KVM CPUs could set up their event
queues in some more automatic, less error prone way. Before I knew that
they needed their own event queue (which I think is just institutional
knowledge that isn't documented/warned about/etc.?), I had no idea what was
going wrong when just dropping in some KVM CPUs in place of regular CPUs. I
don't have a fully fleshed out plan for how to do that, but it doesn't
*seem* like something that should be that hard to do.

Gabe

On Mon, Mar 26, 2018 at 7:06 PM, Gabe Black  wrote:

> I looked into this a little further, and I see the same problem happen
> with one CPU but with the CPU and the devices in different event queues. I
> haven't figured out exactly where things go wrong, but it looks like a
> write DMA is set up but doesn't happen for some reason. I'm not sure if the
> DMA starts but then gets stuck, or if it never starts at all. It could also
> be that the DMA happens, but the completion event (which is what doesn't
> seem to happen) is mishandled because of the additional event queue.
>
> I turned on the DMA debug flag, but that produced so much debug output
> that my tools are crashing. I'll have to see what I can do to narrow things
> down a bit.
>
> Gabe
>
> On Thu, Mar 22, 2018 at 11:28 AM, Gabe Black  wrote:
>
>> Ok, thanks. We're deciding internally what approach to use to tackle this.
>>
>> Gabe
>>
>> On Wed, Mar 21, 2018 at 3:01 AM, Andreas Sandberg <
>> andreas.sandb...@arm.com> wrote:
>>
>>> Hi Gabe,
>>>
>>> There are issues with the IDE model that prevent it from working with
>>> in-kernel GIC emulation. I believe the model doesn't clear interrupts
>>> correctly, which confuses the host kernel. I tried to debug this at some
>>> point, but wasn't able to make much immediate progress and decided it wasn't
>>> worth the effort. The VirtIO block device doesn't suffer from this problem.
>>>
>>> Using the VirtIO device by default seems like a good idea to me. It
>>> doesn't simulate any timing, but that might not be a huge deal since the
>>> IDE device doesn't provide realistic timing anyway. It would be really
>>> awesome if we had a modern storage controller (e.g., NVMe or AHCI) and
>>> proper storage timing models.
>>>
>>> Cheers,
>>> Andreas
>>>
>>> On 20/03/2018 23:38, Gabe Black wrote:
>>>
>>> My next question is about disks. I see that the fs_bigLITTLE.py script
>>> uses PciVirtIO to set up its disks, where I'm using IDE which I inherited
>>> from the fs.py scripts I used as reference. The problem I'm seeing is that
>>> the IDE controllers seem to be mangling commands and dropping interrupts,
>>> so this difference looks particularly suspicious. Is there a KVM related
>>> reason you're using PciVirtIO? Is this something that *should* work with
>>> IDE but doesn't, or do I have to use PciVirtIO for things to work properly?
>>> I'm not familiar with PciVirtIO beyond briefly skimming the source for it
>>> in gem5. Is this something we should consider using globally as a
>>> replacement for IDE, even in simulations where we're trying to be really
>>> realistic?
>>>
>>> Thanks again for all the help.
>>>
>>> Gabe
>>>
>>> On Tue, Mar 20, 2018 at 3:14 PM, Gabe Black 
>>> wrote:
>>>
 Ok, that (multiple event queues) made things way better. There are
 still some glitches to figure out, but at least it makes good forward
 progress at a reasonable speed. Thanks!

 Gabe

 On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black 
 wrote:

> This is on an chromebook based on the RK3399 with only ~4GB of RAM
> which is not ideal, although we have a bigger machine in the works for the
> future. I agree with your reasoning and don't think option 1 is a problem.
> We're using static DTBs so I don't think that's an issue either. In my
> script, I'm not doing anything smart with the event queues, so that's
> likely at least part of the problem. When I tried using fs_bigLITTLE.py I
> ran into what looked like a similar issue so that might not be the whole
> story, but it's definitely 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-26 Thread Gabe Black
I looked into this a little further, and I see the same problem happen with
one CPU but with the CPU and the devices in different event queues. I
haven't figured out exactly where things go wrong, but it looks like a
write DMA is set up but doesn't happen for some reason. I'm not sure if the
DMA starts but then gets stuck, or if it never starts at all. It could also
be that the DMA happens, but the completion event (which is what doesn't
seem to happen) is mishandled because of the additional event queue.

I turned on the DMA debug flag, but that produced so much debug output that
my tools are crashing. I'll have to see what I can do to narrow things down
a bit.
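
One way to keep the volume down, assuming a reasonably recent gem5 build, is to
only start the trace near the point of interest and send it to a file instead of
stdout, along these lines (the start tick and output file name are placeholders):

    build/ARM/gem5.debug --debug-flags=DMA --debug-start=50000000000 \
        --debug-file=dma.trace gem5/google/configs/kvm.py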

Gabe

On Thu, Mar 22, 2018 at 11:28 AM, Gabe Black  wrote:

> Ok, thanks. We're deciding internally what approach to use to tackle this.
>
> Gabe
>
> On Wed, Mar 21, 2018 at 3:01 AM, Andreas Sandberg <
> andreas.sandb...@arm.com> wrote:
>
>> Hi Gabe,
>>
>> There are issues with the IDE model that prevent it from working with
>> in-kernel GIC emulation. I believe the model doesn't clear interrupts
>> correctly, which confuses the host kernel. I tried to debug this at some
>> point, but wasn't able to make much immediate progress and decided it wasn't
>> worth the effort. The VirtIO block device doesn't suffer from this problem.
>>
>> Using the VirtIO device by default seems like a good idea to me. It
>> doesn't simulate any timing, but that might not be a huge deal since the
>> IDE device doesn't provide realistic timing anyway. It would be really
>> awesome if we had a modern storage controller (e.g., NVMe or AHCI) and
>> proper storage timing models.
>>
>> Cheers,
>> Andreas
>>
>> On 20/03/2018 23:38, Gabe Black wrote:
>>
>> My next question is about disks. I see that the fs_bigLITTLE.py script
>> uses PciVirtIO to set up its disks, where I'm using IDE which I inherited
>> from the fs.py scripts I used as reference. The problem I'm seeing is that
>> the IDE controllers seem to be mangling commands and dropping interrupts,
>> so this difference looks particularly suspicious. Is there a KVM related
>> reason you're using PciVirtIO? Is this something that *should* work with
>> IDE but doesn't, or do I have to use PciVirtIO for things to work properly?
>> I'm not familiar with PciVirtIO beyond briefly skimming the source for it
>> in gem5. Is this something we should consider using globally as a
>> replacement for IDE, even in simulations where we're trying to be really
>> realistic?
>>
>> Thanks again for all the help.
>>
>> Gabe
>>
>> On Tue, Mar 20, 2018 at 3:14 PM, Gabe Black  wrote:
>>
>>> Ok, that (multiple event queues) made things way better. There are still
>>> some glitches to figure out, but at least it makes good forward progress at
>>> a reasonable speed. Thanks!
>>>
>>> Gabe
>>>
>>> On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black 
>>> wrote:
>>>
 This is on a chromebook based on the RK3399 with only ~4GB of RAM,
 which is not ideal, although we have a bigger machine in the works for the
 future. I agree with your reasoning and don't think option 1 is a problem.
 We're using static DTBs so I don't think that's an issue either. In my
 script, I'm not doing anything smart with the event queues, so that's
 likely at least part of the problem. When I tried using fs_bigLITTLE.py I
 ran into what looked like a similar issue so that might not be the whole
 story, but it's definitely something I should fix up. I'll let you know how
 that goes!

 Gabe

 On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg <
 andreas.sandb...@arm.com> wrote:

> Hmm, OK, this is very strange.
>
> What type of hardware are you running on? Is it an A57-based chip or
> something else? Also, what's your simulation quantum? I have been able to
> run with a 0.5ms quantum  (5e8 ticks).
> I think the following trace of two CPUs running in KVM should be
> roughly equivalent to the trace you shared earlier. It was generated on a
> commercially available 8xA57 (16GiB ram) using the following command (gem5
> rev 9dc44b417):
>
> gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun 
> configs/example/arm/fs_bigLITTLE.py \
> --sim-quantum '0.5ms' \
> --cpu-type kvm --big-cpus 0 --little-cpus 2 \
> --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb --kernel 
> vmlinux.aarch64.4.4-d318f95d0c
>
> Note that the tick counts are a bit weird since we have three
> different event queues at play (1 for devices and one per CPU).
>
>   0: system.littleCluster.cpus0: KVM: Executing for 5 ticks
>   0: system.littleCluster.cpus1: KVM: Executing for 5 ticks
>   0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 
> 176363 cycles (88181504 ticks, sim cycles: 176363).
> 88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
> 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-22 Thread Gabe Black
Ok, thanks. We're deciding internally what approach to use to tackle this.

Gabe

On Wed, Mar 21, 2018 at 3:01 AM, Andreas Sandberg 
wrote:

> Hi Gabe,
>
> There are issues with the IDE model that prevent it from working with
> in-kernel GIC emulation. I believe the model doesn't clear interrupts
> correctly, which confuses the host kernel. I tried to debug this at some
> point, but wasn't able to make much immediate progress and decided it wasn't
> worth the effort. The VirtIO block device doesn't suffer from this problem.
>
> Using the VirtIO device by default seems like a good idea to me. It
> doesn't simulate any timing, but that might not be a huge deal since the
> IDE device doesn't provide realistic timing anyway. It would be really
> awesome if we had a modern storage controller (e.g., NVMe or AHCI) and
> proper storage timing models.
>
> Cheers,
> Andreas
>
> On 20/03/2018 23:38, Gabe Black wrote:
>
> My next question is about disks. I see that the fs_bigLITTLE.py script
> uses PciVirtIO to set up its disks, where I'm using IDE which I inherited
> from the fs.py scripts I used as reference. The problem I'm seeing is that
> the IDE controllers seem to be mangling commands and dropping interrupts,
> so this difference looks particularly suspicious. Is there a KVM related
> reason you're using PciVirtIO? Is this something that *should* work with
> IDE but doesn't, or do I have to use PciVirtIO for things to work properly?
> I'm not familiar with PciVirtIO beyond briefly skimming the source for it
> in gem5. Is this something we should consider using globally as a
> replacement for IDE, even in simulations where we're trying to be really
> realistic?
>
> Thanks again for all the help.
>
> Gabe
>
> On Tue, Mar 20, 2018 at 3:14 PM, Gabe Black  wrote:
>
>> Ok, that (multiple event queues) made things way better. There are still
>> some glitches to figure out, but at least it makes good forward progress at
>> a reasonable speed. Thanks!
>>
>> Gabe
>>
>> On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black  wrote:
>>
>>> This is on a chromebook based on the RK3399 with only ~4GB of RAM, which
>>> is not ideal, although we have a bigger machine in the works for the
>>> future. I agree with your reasoning and don't think option 1 is a problem.
>>> We're using static DTBs so I don't think that's an issue either. In my
>>> script, I'm not doing anything smart with the event queues, so that's
>>> likely at least part of the problem. When I tried using fs_bigLITTLE.py I
>>> ran into what looked like a similar issue so that might not be the whole
>>> story, but it's definitely something I should fix up. I'll let you know how
>>> that goes!
>>>
>>> Gabe
>>>
>>> On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg <
>>> andreas.sandb...@arm.com> wrote:
>>>
 Hmm, OK, this is very strange.

 What type of hardware are you running on? Is it an A57-based chip or
 something else? Also, what's your simulation quantum? I have been able to
 run with a 0.5ms quantum  (5e8 ticks).
 I think the following trace of two CPUs running in KVM should be
 roughly equivalent to the trace you shared earlier. It was generated on a
 commercially available 8xA57 (16GiB ram) using the following command (gem5
 rev 9dc44b417):

 gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun 
 configs/example/arm/fs_bigLITTLE.py \
 --sim-quantum '0.5ms' \
 --cpu-type kvm --big-cpus 0 --little-cpus 2 \
 --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb --kernel 
 vmlinux.aarch64.4.4-d318f95d0c

 Note that the tick counts are a bit weird since we have three different
 event queues at play (1 for devices and one per CPU).

   0: system.littleCluster.cpus0: KVM: Executing for 5 ticks
   0: system.littleCluster.cpus1: KVM: Executing for 5 ticks
   0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 
 176363 cycles (88181504 ticks, sim cycles: 176363).
 88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
 88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
 0x1c090024, len: 4)
 88332000: system.littleCluster.cpus0: Entering KVM...
 88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
 88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 
 16854 cycles (8427000 ticks, sim cycles: 16854).
 96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
 96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
 0x1c090030, len: 4)
   0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 
 666400 cycles (33320 ticks, sim cycles: 666400).
 33320: system.littleCluster.cpus1: Entering KVM...
 33320: system.littleCluster.cpus1: KVM: Executing for 16680 ticks
 96909000: 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-21 Thread Andreas Sandberg

Hi Gabe,

There are issues with the IDE model that prevent it from working with in-kernel 
GIC emulation. I believe the model doesn't clear interrupts correctly, which 
confuses the host kernel. I tried to debug this at some point, but wasn't able 
to make much immediate progress and decided it wasn't worth the effort. The 
VirtIO block device doesn't suffer from this problem.

Using the VirtIO device by default seems like a good idea to me. It doesn't 
simulate any timing, but that might not be a huge deal since the IDE device 
doesn't provide realistic timing anyway. It would be really awesome if we had a 
modern storage controller (e.g., NVMe or AHCI) and proper storage timing models.
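
For reference, a rough sketch of what the VirtIO-based disk setup looks like on
the config side, assuming the PciVirtIO/VirtIOBlock SimObjects and an
attach_pci()-style helper like the one the Arm example scripts define (the image
path is a placeholder):

    from m5.objects import PciVirtIO, VirtIOBlock, CowDiskImage, RawDiskImage

    # Copy-on-write layer over a read-only backing image, exposed to the guest
    # as a VirtIO block device on the PCI bus.
    image = CowDiskImage(
        child=RawDiskImage(image_file='/path/to/disk.img', read_only=True),
        read_only=False)
    system.pci_vio_block = PciVirtIO(vio=VirtIOBlock(image=image))
    system.attach_pci(system.pci_vio_block)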

Cheers,
Andreas

On 20/03/2018 23:38, Gabe Black wrote:
My next question is about disks. I see that the fs_bigLITTLE.py script uses 
PciVirtIO to set up its disks, where I'm using IDE which I inherited from the 
fs.py scripts I used as reference. The problem I'm seeing is that the IDE 
controllers seem to be mangling commands and dropping interrupts, so this 
difference looks particularly suspicious. Is there a KVM related reason you're 
using PciVirtIO? Is this something that *should* work with IDE but doesn't, or 
do I have to use PciVirtIO for things to work properly? I'm not familiar with 
PciVirtIO beyond briefly skimming the source for it in gem5. Is this something 
we should consider using globally as a replacement for IDE, even in simulations 
where we're trying to be really realistic?

Thanks again for all the help.

Gabe

On Tue, Mar 20, 2018 at 3:14 PM, Gabe Black 
> wrote:
Ok, that (multiple event queues) made things way better. There are still some 
glitches to figure out, but at least it makes good forward progress at a 
reasonable speed. Thanks!

Gabe

On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black 
> wrote:
This is on a chromebook based on the RK3399 with only ~4GB of RAM, which is not 
ideal, although we have a bigger machine in the works for the future. I agree 
with your reasoning and don't think option 1 is a problem. We're using static 
DTBs so I don't think that's an issue either. In my script, I'm not doing 
anything smart with the event queues, so that's likely at least part of the 
problem. When I tried using fs_bigLITTLE.py I ran into what looked like a 
similar issue so that might not be the whole story, but it's definitely 
something I should fix up. I'll let you know how that goes!

Gabe

On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg 
> wrote:

Hmm, OK, this is very strange.

What type of hardware are you running on? Is it an A57-based chip or something 
else? Also, what's your simulation quantum? I have been able to run with a 
0.5ms quantum  (5e8 ticks).

I think the following trace of two CPUs running in KVM should be roughly 
equivalent to the trace you shared earlier. It was generated on a commercially 
available 8xA57 (16GiB ram) using the following command (gem5 rev 9dc44b417):

gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun configs/example/arm/fs_bigLITTLE.py \
   --sim-quantum '0.5ms' \
   --cpu-type kvm --big-cpus 0 --little-cpus 2 \
   --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb --kernel 
vmlinux.aarch64.4.4-d318f95d0c

Note that the tick counts are a bit weird since we have three different event 
queues at play (1 for devices and one per CPU).

 0: system.littleCluster.cpus0: KVM: Executing for 5 ticks
 0: system.littleCluster.cpus1: KVM: Executing for 5 ticks
 0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 176363 
cycles (88181504 ticks, sim cycles: 176363).
88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
0x1c090024, len: 4)
88332000: system.littleCluster.cpus0: Entering KVM...
88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 16854 
cycles (8427000 ticks, sim cycles: 16854).
96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
0x1c090030, len: 4)
 0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 666400 
cycles (33320 ticks, sim cycles: 666400).
33320: system.littleCluster.cpus1: Entering KVM...
33320: system.littleCluster.cpus1: KVM: Executing for 16680 ticks
96909000: system.littleCluster.cpus0: Entering KVM...
96909000: system.littleCluster.cpus0: KVM: Executing for 403091000 ticks
96909000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 15257 
cycles (7628500 ticks, sim cycles: 15257).
104538000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
104538000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
0x1c0100a0, len: 4)

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-20 Thread Gabe Black
My next question is about disks. I see that the fs_bigLITTLE.py script uses
PciVirtIO to set up its disks, where I'm using IDE which I inherited from
the fs.py scripts I used as reference. The problem I'm seeing is that the
IDE controllers seem to be mangling commands and dropping interrupts, so
this difference looks particularly suspicious. Is there a KVM related
reason you're using PciVirtIO? Is this something that *should* work with
IDE but doesn't, or do I have to use PciVirtIO for things to work properly?
I'm not familiar with PciVirtIO beyond briefly skimming the source for it
in gem5. Is this something we should consider using globally as a
replacement for IDE, even in simulations where we're trying to be really
realistic?

Thanks again for all the help.

Gabe

On Tue, Mar 20, 2018 at 3:14 PM, Gabe Black  wrote:

> Ok, that (multiple event queues) made things way better. There are still
> some glitches to figure out, but at least it makes good forward progress at
> a reasonable speed. Thanks!
>
> Gabe
>
> On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black  wrote:
>
>> This is on a chromebook based on the RK3399 with only ~4GB of RAM, which
>> is not ideal, although we have a bigger machine in the works for the
>> future. I agree with your reasoning and don't think option 1 is a problem.
>> We're using static DTBs so I don't think that's an issue either. In my
>> script, I'm not doing anything smart with the event queues, so that's
>> likely at least part of the problem. When I tried using fs_bigLITTLE.py I
>> ran into what looked like a similar issue so that might not be the whole
>> story, but it's definitely something I should fix up. I'll let you know how
>> that goes!
>>
>> Gabe
>>
>> On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg <
>> andreas.sandb...@arm.com> wrote:
>>
>>> Hmm, OK, this is very strange.
>>>
>>> What type of hardware are you running on? Is it an A57-based chip or
>>> something else? Also, what's your simulation quantum? I have been able to
>>> run with a 0.5ms quantum  (5e8 ticks).
>>> I think the following trace of two CPUs running in KVM should be roughly
>>> equivalent to the trace you shared earlier. It was generated on a
>>> commercially available 8xA57 (16GiB ram) using the following command (gem5
>>> rev 9dc44b417):
>>>
>>> gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun 
>>> configs/example/arm/fs_bigLITTLE.py \
>>> --sim-quantum '0.5ms' \
>>> --cpu-type kvm --big-cpus 0 --little-cpus 2 \
>>> --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb --kernel 
>>> vmlinux.aarch64.4.4-d318f95d0c
>>>
>>> Note that the tick counts are a bit weird since we have three different
>>> event queues at play (1 for devices and one per CPU).
>>>
>>>   0: system.littleCluster.cpus0: KVM: Executing for 5 ticks
>>>   0: system.littleCluster.cpus1: KVM: Executing for 5 ticks
>>>   0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 
>>> 176363 cycles (88181504 ticks, sim cycles: 176363).
>>> 88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>> 88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
>>> 0x1c090024, len: 4)
>>> 88332000: system.littleCluster.cpus0: Entering KVM...
>>> 88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
>>> 88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 
>>> 16854 cycles (8427000 ticks, sim cycles: 16854).
>>> 96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>> 96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
>>> 0x1c090030, len: 4)
>>>   0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 
>>> 666400 cycles (33320 ticks, sim cycles: 666400).
>>> 33320: system.littleCluster.cpus1: Entering KVM...
>>> 33320: system.littleCluster.cpus1: KVM: Executing for 16680 ticks
>>> 96909000: system.littleCluster.cpus0: Entering KVM...
>>> 96909000: system.littleCluster.cpus0: KVM: Executing for 403091000 ticks
>>> 96909000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 
>>> 15257 cycles (7628500 ticks, sim cycles: 15257).
>>> 104538000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>>> 104538000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
>>> 0x1c0100a0, len: 4)
>>> 33320: system.littleCluster.cpus1: KVM: Executed 47544 instructions in 
>>> 200820 cycles (10041 ticks, sim cycles: 200820).
>>> 43361: system.littleCluster.cpus1: Entering KVM...
>>> 43361: system.littleCluster.cpus1: KVM: Executing for 6639 ticks
>>> 104688000: system.littleCluster.cpus0: Entering KVM...
>>> 104688000: system.littleCluster.cpus0: KVM: Executing for 395312000 ticks
>>> 104688000: system.littleCluster.cpus0: KVM: Executed 4382 instructions in 
>>> 14942 cycles (7471000 ticks, sim cycles: 14942).
>>>
>>> Comparing this trace to yours, I'd say that the frequent KVM exits
>>> look a bit 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-20 Thread Gabe Black
Ok, that (multiple event queues) made things way better. There are still
some glitches to figure out, but at least it makes good forward progress at
a reasonable speed. Thanks!

Gabe

On Mon, Mar 19, 2018 at 5:12 PM, Gabe Black  wrote:

> This is on a chromebook based on the RK3399 with only ~4GB of RAM, which
> is not ideal, although we have a bigger machine in the works for the
> future. I agree with your reasoning and don't think option 1 is a problem.
> We're using static DTBs so I don't think that's an issue either. In my
> script, I'm not doing anything smart with the event queues, so that's
> likely at least part of the problem. When I tried using fs_bigLITTLE.py I
> ran into what looked like a similar issue so that might not be the whole
> story, but it's definitely something I should fix up. I'll let you know how
> that goes!
>
> Gabe
>
> On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg <
> andreas.sandb...@arm.com> wrote:
>
>> Hmm, OK, this is very strange.
>>
>> What type of hardware are you running on? Is it an A57-based chip or
>> something else? Also, what's your simulation quantum? I have been able to
>> run with a 0.5ms quantum  (5e8 ticks).
>> I think the following trace of two CPUs running in KVM should be roughly
>> equivalent to the trace you shared earlier. It was generated on a
>> commercially available 8xA57 (16GiB ram) using the following command (gem5
>> rev 9dc44b417):
>>
>> gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun 
>> configs/example/arm/fs_bigLITTLE.py \
>> --sim-quantum '0.5ms' \
>> --cpu-type kvm --big-cpus 0 --little-cpus 2 \
>> --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb --kernel 
>> vmlinux.aarch64.4.4-d318f95d0c
>>
>> Note that the tick counts are a bit weird since we have three different
>> event queues at play (1 for devices and one per CPU).
>>
>>   0: system.littleCluster.cpus0: KVM: Executing for 5 ticks
>>   0: system.littleCluster.cpus1: KVM: Executing for 5 ticks
>>   0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 
>> 176363 cycles (88181504 ticks, sim cycles: 176363).
>> 88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>> 88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
>> 0x1c090024, len: 4)
>> 88332000: system.littleCluster.cpus0: Entering KVM...
>> 88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
>> 88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 
>> 16854 cycles (8427000 ticks, sim cycles: 16854).
>> 96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>> 96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
>> 0x1c090030, len: 4)
>>   0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 
>> 666400 cycles (33320 ticks, sim cycles: 666400).
>> 33320: system.littleCluster.cpus1: Entering KVM...
>> 33320: system.littleCluster.cpus1: KVM: Executing for 16680 ticks
>> 96909000: system.littleCluster.cpus0: Entering KVM...
>> 96909000: system.littleCluster.cpus0: KVM: Executing for 403091000 ticks
>> 96909000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 
>> 15257 cycles (7628500 ticks, sim cycles: 15257).
>> 104538000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
>> 104538000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
>> 0x1c0100a0, len: 4)
>> 33320: system.littleCluster.cpus1: KVM: Executed 47544 instructions in 
>> 200820 cycles (10041 ticks, sim cycles: 200820).
>> 43361: system.littleCluster.cpus1: Entering KVM...
>> 43361: system.littleCluster.cpus1: KVM: Executing for 6639 ticks
>> 104688000: system.littleCluster.cpus0: Entering KVM...
>> 104688000: system.littleCluster.cpus0: KVM: Executing for 395312000 ticks
>> 104688000: system.littleCluster.cpus0: KVM: Executed 4382 instructions in 
>> 14942 cycles (7471000 ticks, sim cycles: 14942).
>>
>> Comparing this trace to yours, I'd say that the frequent KVM exits
>> look a bit suspicious. I would expect secondary CPUs to make very little
>> progress while the main CPU initializes the system and starts the early boot
>> code.
>>
>> There are a couple of possibilities that might be causing issues:
>>
>> 1) There is some CPU ID weirdness that confuses the boot code and puts
>> both CPUs in the holding pen. This seems unlikely since there are some
>> writes to the UART.
>>
>> 2) Some device is incorrectly mapped to the CPU event queues and causes
>> frequent KVM exits. Have a look at _build_kvm in fs_bigLITTLE.py, it
>> doesn't use configs/common, so no need to tear your eyes out. ;) Do you map
>> event queues in the same way? It's mapping all simulated devices to one
>> event queue and the CPUs to private event queues. It's important to remap
>> CPU child devices to the device queue instead of the CPU queue. Failing to
>> do this will cause chaos, madness, and quite possibly result in Armageddon.
>>
>> 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-19 Thread Gabe Black
This is on a chromebook based on the RK3399 with only ~4GB of RAM, which is
not ideal, although we have a bigger machine in the works for the future. I
agree with your reasoning and don't think option 1 is a problem. We're
using static DTBs so I don't think that's an issue either. In my script,
I'm not doing anything smart with the event queues, so that's likely at
least part of the problem. When I tried using fs_bigLITTLE.py I ran into
what looked like a similar issue so that might not be the whole story, but
it's definitely something I should fix up. I'll let you know how that goes!

Gabe

On Mon, Mar 19, 2018 at 4:30 AM, Andreas Sandberg 
wrote:

> Hmm, OK, this is very strange.
>
> What type of hardware are you running on? Is it an A57-based chip or
> something else? Also, what's your simulation quantum? I have been able to
> run with a 0.5ms quantum  (5e8 ticks).
> I think the following trace of two CPUs running in KVM should be roughly
> equivalent to the trace you shared earlier. It was generated on a
> commercially available 8xA57 (16GiB ram) using the following command (gem5
> rev 9dc44b417):
>
> gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun 
> configs/example/arm/fs_bigLITTLE.py \
> --sim-quantum '0.5ms' \
> --cpu-type kvm --big-cpus 0 --little-cpus 2 \
> --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb --kernel 
> vmlinux.aarch64.4.4-d318f95d0c
>
> Note that the tick counts are a bit weird since we have three different
> event queues at play (1 for devices and one per CPU).
>
>   0: system.littleCluster.cpus0: KVM: Executing for 5 ticks
>   0: system.littleCluster.cpus1: KVM: Executing for 5 ticks
>   0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 
> 176363 cycles (88181504 ticks, sim cycles: 176363).
> 88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
> 88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
> 0x1c090024, len: 4)
> 88332000: system.littleCluster.cpus0: Entering KVM...
> 88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
> 88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 
> 16854 cycles (8427000 ticks, sim cycles: 16854).
> 96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
> 96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
> 0x1c090030, len: 4)
>   0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 
> 666400 cycles (33320 ticks, sim cycles: 666400).
> 33320: system.littleCluster.cpus1: Entering KVM...
> 33320: system.littleCluster.cpus1: KVM: Executing for 16680 ticks
> 96909000: system.littleCluster.cpus0: Entering KVM...
> 96909000: system.littleCluster.cpus0: KVM: Executing for 403091000 ticks
> 96909000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 
> 15257 cycles (7628500 ticks, sim cycles: 15257).
> 104538000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
> 104538000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
> 0x1c0100a0, len: 4)
> 33320: system.littleCluster.cpus1: KVM: Executed 47544 instructions in 
> 200820 cycles (10041 ticks, sim cycles: 200820).
> 43361: system.littleCluster.cpus1: Entering KVM...
> 43361: system.littleCluster.cpus1: KVM: Executing for 6639 ticks
> 104688000: system.littleCluster.cpus0: Entering KVM...
> 104688000: system.littleCluster.cpus0: KVM: Executing for 395312000 ticks
> 104688000: system.littleCluster.cpus0: KVM: Executed 4382 instructions in 
> 14942 cycles (7471000 ticks, sim cycles: 14942).
>
> Comparing this trace to yours, I'd say that the frequent KVM exits
> look a bit suspicious. I would expect secondary CPUs to make very little
> progress while the main CPU initializes the system and starts the early boot
> code.
>
> There are a couple of possibilities that might be causing issues:
>
> 1) There is some CPU ID weirdness that confuses the boot code and puts
> both CPUs in the holding pen. This seems unlikely since there are some
> writes to the UART.
>
> 2) Some device is incorrectly mapped to the CPU event queues and causes
> frequent KVM exits. Have a look at _build_kvm in fs_bigLITTLE.py, it
> doesn't use configs/common, so no need to tear your eyes out. ;) Do you map
> event queues in the same way? It's mapping all simulated devices to one
> event queue and the CPUs to private event queues. It's important to remap
> CPU child devices to the device queue instead of the CPU queue. Failing to
> do this will cause chaos, madness, and quite possibly result in Armageddon.
>
> 3) You're using DTB autogeneration. This doesn't work for KVM guests due
> to issues with the timer interrupt specification. We have a patch for the
> timer that we are testing internally. Sorry. :(
>
> Regards,
> Andreas
> On 16/03/2018 23:20, Gabe Black wrote:
>
> Ok, diving into this a little deeper, it looks like execution is
> progressing but is making 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-19 Thread Andreas Sandberg

Hmm, OK, this is very strange.

What type of hardware are you running on? Is it an A57-based chip or something 
else? Also, what's your simulation quantum? I have been able to run with a 
0.5ms quantum  (5e8 ticks).

I think the following trace of two CPUs running in KVM should be roughly 
equivalent to the trace you shared earlier. It was generated on a commercially 
available 8xA57 (16GiB ram) using the following command (gem5 rev 9dc44b417):

gem5.opt -r --debug-flags Kvm,KvmIO,KvmRun configs/example/arm/fs_bigLITTLE.py \
   --sim-quantum '0.5ms' \
   --cpu-type kvm --big-cpus 0 --little-cpus 2 \
   --dtb system/arm/dt/armv8_gem5_v1_2cpu.dtb --kernel 
vmlinux.aarch64.4.4-d318f95d0c

Note that the tick counts are a bit weird since we have three different event 
queues at play (1 for devices and one per CPU).

 0: system.littleCluster.cpus0: KVM: Executing for 5 ticks
 0: system.littleCluster.cpus1: KVM: Executing for 5 ticks
 0: system.littleCluster.cpus0: KVM: Executed 79170 instructions in 176363 
cycles (88181504 ticks, sim cycles: 176363).
88182000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
88182000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
0x1c090024, len: 4)
88332000: system.littleCluster.cpus0: Entering KVM...
88332000: system.littleCluster.cpus0: KVM: Executing for 411668000 ticks
88332000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 16854 
cycles (8427000 ticks, sim cycles: 16854).
96759000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
96759000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
0x1c090030, len: 4)
 0: system.littleCluster.cpus1: KVM: Executed 409368 instructions in 666400 
cycles (33320 ticks, sim cycles: 666400).
33320: system.littleCluster.cpus1: Entering KVM...
33320: system.littleCluster.cpus1: KVM: Executing for 16680 ticks
96909000: system.littleCluster.cpus0: Entering KVM...
96909000: system.littleCluster.cpus0: KVM: Executing for 403091000 ticks
96909000: system.littleCluster.cpus0: KVM: Executed 4384 instructions in 15257 
cycles (7628500 ticks, sim cycles: 15257).
104538000: system.littleCluster.cpus0: handleKvmExit (exit_reason: 6)
104538000: system.littleCluster.cpus0: KVM: Handling MMIO (w: 1, addr: 
0x1c0100a0, len: 4)
33320: system.littleCluster.cpus1: KVM: Executed 47544 instructions in 
200820 cycles (10041 ticks, sim cycles: 200820).
43361: system.littleCluster.cpus1: Entering KVM...
43361: system.littleCluster.cpus1: KVM: Executing for 6639 ticks
104688000: system.littleCluster.cpus0: Entering KVM...
104688000: system.littleCluster.cpus0: KVM: Executing for 395312000 ticks
104688000: system.littleCluster.cpus0: KVM: Executed 4382 instructions in 14942 
cycles (7471000 ticks, sim cycles: 14942).


Comparing this trace to yours, I'd say that the frequent KVM exits look a bit 
suspicious. I would expect secondary CPUs to make very little progress while 
the main CPU initializes the system and starts the early boot code.

There are a couple of possibilities that might be causing issues:

1) There is some CPU ID weirdness that confuses the boot code and puts both 
CPUs in the holding pen. This seems unlikely since there are some writes to the 
UART.

2) Some device is incorrectly mapped to the CPU event queues and causes 
frequent KVM exits. Have a look at _build_kvm in fs_bigLITTLE.py; it doesn't 
use configs/common, so no need to tear your eyes out. ;) Do you map event 
queues in the same way? It maps all simulated devices to one event queue and 
the CPUs to private event queues. It's important to remap CPU child devices to 
the device queue instead of the CPU queue (see the sketch after this list). 
Failing to do this will cause chaos, madness, and quite possibly result in 
Armageddon.

3) You're using DTB autogeneration. This doesn't work for KVM guests due to 
issues with the timer interrupt specification. We have a patch for the timer 
that we are testing internally. Sorry. :(
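
Regarding point 2, here is a rough sketch of the kind of event-queue mapping
_build_kvm does. The function name below is mine, and eventq_index and
descendants() should be cross-checked against the fs_bigLITTLE.py in your tree,
but the shape of the mapping is the important part:

from m5.objects import KvmVM

def map_kvm_event_queues(system, cpus):
    # Devices share event queue 0; each KVM CPU gets a private queue.
    # This has to happen before m5.instantiate(), since eventq_index is
    # a configuration parameter consumed at instantiation time.
    system.kvm_vm = KvmVM()

    device_eq = 0
    first_cpu_eq = 1

    for idx, cpu in enumerate(cpus):
        # Child objects (e.g. the CPU's interrupt and ISA objects)
        # inherit the parent's event queue. Push them back onto the
        # device queue so they are not serviced on the CPU thread.
        for obj in cpu.descendants():
            obj.eventq_index = device_eq
        cpu.eventq_index = first_cpu_eq + idx

Leaving CPU child devices on the CPU queues means every one of their events is
handled on the CPU thread, which is one way to end up with the frequent exits
in a trace like the one above.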

Regards,
Andreas

On 16/03/2018 23:20, Gabe Black wrote:
Ok, diving into this a little deeper, it looks like execution is progressing but is 
making very slow progress for some reason. I added a call to "dump()" before 
each ioctl invocation which enters the VM and looked at the PC to get an idea of what it 
was up to. I made sure to put that before the timers to avoid taking up VM time with 
printing debug stuff. In any case, I see that neither CPU gets off of PC 0 for about 2ms 
simulated time (~500Hz), and that's EXTREMELY slow for a CPU which is supposed to be 
running in the ballpark of 2GHz. It's not clear to me why it's making such slow progress, 
but that would explain why I'm getting very little out on the simulated console. It's 
just taking forever to make it that far.

Any idea why it's going so slow, or how to debug further?

Gabe

On Wed, Mar 14, 2018 at 7:42 PM, Gabe Black wrote:
Some output 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-16 Thread Gabe Black
Ok, diving into this a little deeper, it looks like execution is
progressing but is making very slow progress for some reason. I added a
call to "dump()" before each ioctl invocation which enters the VM and
looked at the PC to get an idea of what it was up to. I made sure to put
that before the timers to avoid taking up VM time with printing debug
stuff. In any case, I see that neither CPU gets off of PC 0 for about 2ms
simulated time (~500Hz), and that's EXTREMELY slow for a CPU which is
supposed to be running in the ballpark of 2GHz. It's not clear to me why
it's making such slow progress, but that would explain why I'm getting very
little out on the simulated console. It's just taking forever to make it
that far.

Any idea why it's going so slow, or how to debug further?

Gabe

On Wed, Mar 14, 2018 at 7:42 PM, Gabe Black  wrote:

> Some output which I think is suspicious:
>
> 55462000: system.cpus0: Entering KVM...
> 55462000: system.cpus0: KVM: Executing for 1506000 ticks
> 55462000: system.cpus0: KVM: Executed 5159 instructions in 13646 cycles
> (6823000 ticks, sim cycles: 13646).
> 56968000: system.cpus1: Entering KVM...
> 56968000: system.cpus1: KVM: Executing for 5317000 ticks
> 56968000: system.cpus1: KVM: Executed 7229 instructions in 14379 cycles
> (7189500 ticks, sim cycles: 14379).
> 62285000: system.cpus0: Entering KVM...
> 62285000: system.cpus0: KVM: Executing for 1872500 ticks
> 62285000: system.cpus0: KVM: Executed 5159 instructions in 13496 cycles
> (6748000 ticks, sim cycles: 13496).
> 64157500: system.cpus1: Entering KVM...
> 64157500: system.cpus1: KVM: Executing for 4875500 ticks
> 64157500: system.cpus1: KVM: Executed 6950 instructions in 13863 cycles
> (6931500 ticks, sim cycles: 13863).
> 69033000: system.cpus0: Entering KVM...
> 69033000: system.cpus0: KVM: Executing for 2056000 ticks
> 69033000: system.cpus0: KVM: Executed 5159 instructions in 13454 cycles
> (6727000 ticks, sim cycles: 13454).
> 71089000: system.cpus1: Entering KVM...
> 71089000: system.cpus1: KVM: Executing for 4671000 ticks
> 71089000: system.cpus1: KVM: Executed 6950 instructions in 13861 cycles
> (6930500 ticks, sim cycles: 13861).
> 7576: system.cpus0: Entering KVM...
> 7576: system.cpus0: KVM: Executing for 2259500 ticks
> 7576: system.cpus0: KVM: Executed 5159 instructions in 13688 cycles
> (6844000 ticks, sim cycles: 13688).
>
> [...]
>
> 126512000: system.cpus0: handleKvmExit (exit_reason: 6)
> 126512000: system.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090024, len:
> 4)
> 126512000: system.cpus0: In updateThreadContext():
>
> [...]
>
> 126512000: system.cpus0:   PC := 0xd8 (t: 0, a64: 1)
>
> On Wed, Mar 14, 2018 at 7:37 PM, Gabe Black  wrote:
>
>> I tried it just now, and I still don't see anything on the console. I
>> switched back to using my own script since it's a bit simpler (it doesn't
>> use all the configs/common stuff), and started looking at the KVM debug
>> output. I see that both cpus claim to execute instructions, although cpu1
>> didn't take an exit in the output I was looking at. cpu0 took four exits,
>> two which touched some UART registers, and two which touched RealView
>> registers, the V2M_SYS_CFGDATA and V2M_SYS_CFGCTRL registers judging by the
>> comments in the bootloader assembly file.
>>
>> After that they claim to be doing stuff, although I see no further
>> console output or KVM exits. The accesses themselves and their PCs are from
>> the bootloader blob, and so I'm pretty confident that it's starting that
>> and executing some of those instructions. One thing that looks very odd now
>> that I think about it, is that the KVM messages about entering and
>> executing instructions (like those below) seem to say that cpu0 has
>> executed thousands of instructions, but the exits I see seem to correspond
>> to the first maybe 50 instructions it should be seeing in the bootloader
>> blob. Are those values bogus for some reason? Is there some existing debug
>> output which would let me see where KVM thinks it is periodically to see if
>> it's in the kernel or if it went bananas and is executing random memory
>> somewhere? Or if it just got stuck waiting for some event that's not going
>> to show up?
>>
>> Are there any important CLs which haven't made their way into upstream
>> somehow?
>>
>> Gabe
>>
>> On Wed, Mar 14, 2018 at 4:28 AM, Andreas Sandberg <
>> andreas.sandb...@arm.com> wrote:
>>
>>> Have you tried using the fs_bigLITTLE script in configs/examples/arm?
>>> That's the script I have been using for testing.
>>>
>>> I just tested the script with 8 little CPUs and 0 big CPUs and it seems
>>> to work. Timing is a bit temperamental though, so you might need to
>>> override the simulation quantum. The default is 1ms, you might need to
>>> decrease it to something slightly smaller (I'm currently using 0.5ms).
>>> Another caveat is that there seem to be some issues related to dtb
>>> auto-generation that affect KVM guests. 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-14 Thread Gabe Black
Some output which I think is suspicious:

55462000: system.cpus0: Entering KVM...
55462000: system.cpus0: KVM: Executing for 1506000 ticks
55462000: system.cpus0: KVM: Executed 5159 instructions in 13646 cycles
(6823000 ticks, sim cycles: 13646).
56968000: system.cpus1: Entering KVM...
56968000: system.cpus1: KVM: Executing for 5317000 ticks
56968000: system.cpus1: KVM: Executed 7229 instructions in 14379 cycles
(7189500 ticks, sim cycles: 14379).
62285000: system.cpus0: Entering KVM...
62285000: system.cpus0: KVM: Executing for 1872500 ticks
62285000: system.cpus0: KVM: Executed 5159 instructions in 13496 cycles
(6748000 ticks, sim cycles: 13496).
64157500: system.cpus1: Entering KVM...
64157500: system.cpus1: KVM: Executing for 4875500 ticks
64157500: system.cpus1: KVM: Executed 6950 instructions in 13863 cycles
(6931500 ticks, sim cycles: 13863).
69033000: system.cpus0: Entering KVM...
69033000: system.cpus0: KVM: Executing for 2056000 ticks
69033000: system.cpus0: KVM: Executed 5159 instructions in 13454 cycles
(6727000 ticks, sim cycles: 13454).
71089000: system.cpus1: Entering KVM...
71089000: system.cpus1: KVM: Executing for 4671000 ticks
71089000: system.cpus1: KVM: Executed 6950 instructions in 13861 cycles
(6930500 ticks, sim cycles: 13861).
7576: system.cpus0: Entering KVM...
7576: system.cpus0: KVM: Executing for 2259500 ticks
7576: system.cpus0: KVM: Executed 5159 instructions in 13688 cycles
(6844000 ticks, sim cycles: 13688).

[...]

126512000: system.cpus0: handleKvmExit (exit_reason: 6)
126512000: system.cpus0: KVM: Handling MMIO (w: 1, addr: 0x1c090024, len:
4)
126512000: system.cpus0: In updateThreadContext():

[...]

126512000: system.cpus0:   PC := 0xd8 (t: 0, a64: 1)

On Wed, Mar 14, 2018 at 7:37 PM, Gabe Black  wrote:

> I tried it just now, and I still don't see anything on the console. I
> switched back to using my own script since it's a bit simpler (it doesn't
> use all the configs/common stuff), and started looking at the KVM debug
> output. I see that both cpus claim to execute instructions, although cpu1
> didn't take an exit in the output I was looking at. cpu0 took four exits,
> two which touched some UART registers, and two which touched RealView
> registers, the V2M_SYS_CFGDATA and V2M_SYS_CFGCTRL registers judging by the
> comments in the bootloader assembly file.
>
> After that they claim to be doing stuff, although I see no further console
> output or KVM exits. The accesses themselves and their PCs are from the
> bootloader blob, and so I'm pretty confident that it's starting that and
> executing some of those instructions. One thing that looks very odd now
> that I think about it, is that the KVM messages about entering and
> executing instructions (like those below) seem to say that cpu0 has
> executed thousands of instructions, but the exits I see seem to correspond
> to the first maybe 50 instructions it should be seeing in the bootloader
> blob. Are those values bogus for some reason? Is there some existing debug
> output which would let me see where KVM thinks it is periodically to see if
> it's in the kernel or if it went bananas and is executing random memory
> somewhere? Or if it just got stuck waiting for some event that's not going
> to show up?
>
> Are there any important CLs which haven't made their way into upstream
> somehow?
>
> Gabe
>
> On Wed, Mar 14, 2018 at 4:28 AM, Andreas Sandberg <
> andreas.sandb...@arm.com> wrote:
>
>> Have you tried using the fs_bigLITTLE script in configs/examples/arm?
>> That's the script I have been using for testing.
>>
>> I just tested the script with 8 little CPUs and 0 big CPUs and it seems
>> to work. Timing is a bit temperamental though, so you might need to
>> override the simulation quantum. The default is 1ms, you might need to
>> decrease it to something slightly smaller (I'm currently using 0.5ms).
>> Another caveat is that there seem to be some issues related to dtb
>> auto-generation that affect KVM guests. We are currently testing a
>> solution for this issue.
>>
>> Cheers,
>> Andreas
>>
>>
>>
>> On 12/03/2018 22:26, Gabe Black wrote:
>>
>>> I'm trying to run in FS mode, to boot android/linux.
>>>
>>> Gabe
>>>
>>> On Mon, Mar 12, 2018 at 3:26 PM, Dutu, Alexandru wrote:
>>>
>>> Hi Gabe,

 Are you running SE or FS mode?

 Thanks,
 Alex

 -Original Message-
 From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Gabe
 Black
 Sent: Friday, March 9, 2018 5:46 PM
 To: gem5 Developer List 
 Subject: [gem5-dev] Multicore ARM v8 KVM based simulation

 Hi folks. I have a config script set up where I can run a KVM based ARM
 v8
 simulation just fine when I have a single CPU in it, but when I try
 running
 with more than one CPU, it just seems to get lost and not do anything.
 Is
 this a configuration that's supported? If so, are there any 

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-14 Thread Gabe Black
I tried it just now, and I still don't see anything on the console. I
switched back to using my own script since it's a bit simpler (it doesn't
use all the configs/common stuff), and started looking at the KVM debug
output. I see that both cpus claim to execute instructions, although cpu1
didn't take an exit in the output I was looking at. cpu0 took four exits,
two which touched some UART registers, and two which touched RealView
registers, the V2M_SYS_CFGDATA and V2M_SYS_CFGCTRL registers judging by the
comments in the bootloader assembly file.

After that they claim to be doing stuff, although I see no further console
output or KVM exits. The accesses themselves and their PCs are from the
bootloader blob, and so I'm pretty confident that it's starting that and
executing some of those instructions. One thing that looks very odd now
that I think about it, is that the KVM messages about entering and
executing instructions (like those below) seem to say that cpu0 has
executed thousands of instructions, but the exits I see seem to correspond
to the first maybe 50 instructions it should be seeing in the bootloader
blob. Are those values bogus for some reason? Is there some existing debug
output which would let me see where KVM thinks it is periodically to see if
it's in the kernel or if it went bananas and is executing random memory
somewhere? Or if it just got stuck waiting for some event that's not going
to show up?

Are there any important CLs which haven't made their way into upstream
somehow?

Gabe

On Wed, Mar 14, 2018 at 4:28 AM, Andreas Sandberg 
wrote:

> Have you tried using the fs_bigLITTLE script in configs/examples/arm?
> That's the script I have been using for testing.
>
> I just tested the script with 8 little CPUs and 0 big CPUs and it seems
> to work. Timing is a bit temperamental though, so you might need to
> override the simulation quantum. The default is 1ms, you might need to
> decrease it to something slightly smaller (I'm currently using 0.5ms).
> Another caveat is that there seem to be some issues related to dtb
> auto-generation that affect KVM guests. We are currently testing a
> solution for this issue.
>
> Cheers,
> Andreas
>
>
>
> On 12/03/2018 22:26, Gabe Black wrote:
>
>> I'm trying to run in FS mode, to boot android/linux.
>>
>> Gabe
>>
>> On Mon, Mar 12, 2018 at 3:26 PM, Dutu, Alexandru 
>> wrote:
>>
>> Hi Gabe,
>>>
>>> Are you running SE or FS mode?
>>>
>>> Thanks,
>>> Alex
>>>
>>> -Original Message-
>>> From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Gabe
>>> Black
>>> Sent: Friday, March 9, 2018 5:46 PM
>>> To: gem5 Developer List 
>>> Subject: [gem5-dev] Multicore ARM v8 KVM based simulation
>>>
>>> Hi folks. I have a config script set up where I can run a KVM based ARM
>>> v8
>>> simulation just fine when I have a single CPU in it, but when I try
>>> running
>>> with more than one CPU, it just seems to get lost and not do anything. Is
>>> this a configuration that's supported? If so, are there any caveats to
>>> how
>>> it's set up? I may be missing something simple, but it's not apparent to
>>> me
>>> at the moment.
>>>
>>> Gabe

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-14 Thread Andreas Sandberg

Have you tried using the fs_bigLITTLE script in configs/examples/arm?
That's the script I have been using for testing.

I just tested the script with 8 little CPUs and 0 big CPUs and it seems
to work. Timing is a bit temperamental though, so you might need to
override the simulation quantum. The default is 1ms, you might need to
decrease it to something slightly smaller (I'm currently using 0.5ms).
Another caveat is that there seem to be some issues related to dtb
auto-generation that affect KVM guests. We are currently testing a
solution for this issue.

Cheers,
Andreas


On 12/03/2018 22:26, Gabe Black wrote:

I'm trying to run in FS mode, to boot android/linux.

Gabe

On Mon, Mar 12, 2018 at 3:26 PM, Dutu, Alexandru 
wrote:


Hi Gabe,

Are you running SE or FS mode?

Thanks,
Alex

-Original Message-
From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Gabe Black
Sent: Friday, March 9, 2018 5:46 PM
To: gem5 Developer List 
Subject: [gem5-dev] Multicore ARM v8 KVM based simulation

Hi folks. I have a config script set up where I can run a KVM based ARM v8
simulation just fine when I have a single CPU in it, but when I try running
with more than one CPU, it just seems to get lost and not do anything. Is
this a configuration that's supported? If so, are there any caveats to how
it's set up? I may be missing something simple, but it's not apparent to me
at the moment.

Gabe

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-12 Thread Gabe Black
I'm trying to run in FS mode, to boot android/linux.

Gabe

On Mon, Mar 12, 2018 at 3:26 PM, Dutu, Alexandru 
wrote:

> Hi Gabe,
>
> Are you running SE or FS mode?
>
> Thanks,
> Alex
>
> -Original Message-
> From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Gabe Black
> Sent: Friday, March 9, 2018 5:46 PM
> To: gem5 Developer List 
> Subject: [gem5-dev] Multicore ARM v8 KVM based simulation
>
> Hi folks. I have a config script set up where I can run a KVM based ARM v8
> simulation just fine when I have a single CPU in it, but when I try running
> with more than one CPU, it just seems to get lost and not do anything. Is
> this a configuration that's supported? If so, are there any caveats to how
> it's set up? I may be missing something simple, but it's not apparent to me
> at the moment.
>
> Gabe

Re: [gem5-dev] Multicore ARM v8 KVM based simulation

2018-03-12 Thread Dutu, Alexandru
Hi Gabe,

Are you running SE or FS mode?

Thanks,
Alex

-Original Message-
From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Gabe Black
Sent: Friday, March 9, 2018 5:46 PM
To: gem5 Developer List 
Subject: [gem5-dev] Multicore ARM v8 KVM based simulation

Hi folks. I have a config script set up where I can run a KVM based ARM v8 
simulation just fine when I have a single CPU in it, but when I try running 
with more than one CPU, it just seems to get lost and not do anything. Is this 
a configuration that's supported? If so, are there any caveats to how it's set 
up? I may be missing something simple, but it's not apparent to me at the 
moment.

Gabe