> On April 20, 2013, 8:25 a.m., Ali Saidi wrote:
> > Cool! I wonder if this is what causes the long SE regressions to change a
> > little bit every so often. I agree with Andreas' name suggestion, but
> > otherwise thanks! How did you figure it out?
> >
One of my checkpoints worked on the cluster but failed on my laptop. The only
difference was the build environment and it failed fairly quickly, so I did a
quick run with valgrind. I really hit the first bug, but noticed the second bug
while fixing it.

Here's what happened:

1. Checkpoint is restored. BTB holds zeros, page table cache holds zeros as
   well.

2. Program starts running, a branch redirects fetch to address zero (indirect
   unconditional)

4678654008000: system.cpu.fetch: [tid:0]: Instruction is: uopReg_uop.w pc, r35
4678654008000: system.cpu.fetch: [tid:0]: [sn:85]: Branch predicted to be taken to (0x4=>0x8).(0=>1).

3. Zero hits in the pagetable cache, so no page fault is signaled

4678654008000: system.cpu.workload: Translating: 0->0
4678654008000: system.cpu.itb: Translation returning delay=0 fault=0

4. Depending on the value of the invalid TLB entry either the warning at
   src/cpu/o3/fetch_impl.hh:639 is issued and fetch stalls forever (this really
   should be a panic, not a warning) or the frontend continues accessing and
   creating invalid instructions (accessing uninitialized data) until the
   original branch properly resolves and squashes everything.

- Mitch


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1830/#review4265
-----------------------------------------------------------


On April 20, 2013, 1:03 a.m., Mitch Hayenga wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1830/
> -----------------------------------------------------------
>
> (Updated April 20, 2013, 1:03 a.m.)
>
>
> Review request for Default.
>
>
> Description
> -------
>
> Fixes two bugs relating to software caching of PageTable entries.
>
> The existing implementation can read uninitialized data or stale information
> from the cached PageTable entries.
>
> 1) Add a valid bit for the cache entries. Simply using zero for the virtual
> address to signify invalid entries is not sufficient. Speculative,
> wrong-path accesses frequently access page zero. The current implementation
> would return an uninitialized TLB entry when address zero was accessed and the
> PageTable cache entry was invalid.
>
> 2) When unmapping/mapping/remapping a page, invalidate the corresponding
> PageTable cache entry if one already exists.
>
>
> Diffs
> -----
>
>   src/mem/page_table.hh 745e42ffcc80
>   src/mem/page_table.cc 745e42ffcc80
>
> Diff: http://reviews.gem5.org/r/1830/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Mitch Hayenga
>
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
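
For readers who want to see the two fixes from the description in isolation,
here is a minimal standalone sketch. It is not the code from the diff: the
class name CachedPageTable, the Entry struct, the 4 KiB page size and the
three-slot cache are all assumptions made for illustration. It only shows the
idea of (1) an explicit valid bit, so that a lookup of page zero cannot match
an empty slot, and (2) invalidating a cached slot whenever its page is mapped,
remapped or unmapped.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

using Addr = std::uint64_t;

// Assumed page size for this sketch only.
constexpr Addr PageBytes = 4096;

// Hypothetical stand-in for a translation entry.
struct Entry {
    Addr paddr = 0;
};

// Hypothetical page table with a tiny software cache in front of it.
class CachedPageTable {
  public:
    // Map or remap a page. Any cached copy is dropped first so a later
    // lookup cannot return stale information.
    void map(Addr vpage, Addr ppage) {
        assert(vpage % PageBytes == 0 && ppage % PageBytes == 0);
        invalidate(vpage);
        Entry e;
        e.paddr = ppage;
        table[vpage] = e;
    }

    // Unmap a page and drop any cached copy of it.
    void unmap(Addr vpage) {
        assert(vpage % PageBytes == 0);
        invalidate(vpage);
        table.erase(vpage);
    }

    // Returns true and fills 'out' on success. A false return is where a
    // real simulator would signal a page fault.
    bool lookup(Addr vaddr, Entry &out) {
        const Addr vpage = vaddr - (vaddr % PageBytes);

        // A slot matches only if it is valid. Comparing vpage alone would
        // let address zero hit an empty, zero-filled slot.
        for (const Slot &slot : cache) {
            if (slot.valid && slot.vpage == vpage) {
                out = slot.entry;
                return true;
            }
        }

        auto it = table.find(vpage);
        if (it == table.end())
            return false;

        // Fill the next slot round-robin and mark it valid.
        Slot &victim = cache[next];
        victim.valid = true;
        victim.vpage = vpage;
        victim.entry = it->second;
        next = (next + 1) % cache.size();

        out = it->second;
        return true;
    }

  private:
    struct Slot {
        bool valid = false;
        Addr vpage = 0;
        Entry entry;
    };

    // Clear the valid bit of every slot caching this page.
    void invalidate(Addr vpage) {
        for (Slot &slot : cache) {
            if (slot.valid && slot.vpage == vpage)
                slot.valid = false;
        }
    }

    std::unordered_map<Addr, Entry> table;
    std::array<Slot, 3> cache{};
    std::size_t next = 0;
};
```

With this sketch, a lookup of address zero on a freshly constructed table
returns false instead of matching an empty slot, and a lookup after a remap
returns the new physical page rather than the cached old one.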
