I was imagining that we flush the caches during the serialization/checkpointing
process, not before it.  I'm thinking the cache trace creation is the first
step of the Ruby serialize function, then we flush the caches, and finally we
take the memory checkpoint.  Is there a reason we can't do it in that order?
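
To make that ordering concrete, here is a rough sketch in Python rather than the real C++.  ToyRubySystem, CacheLine, and every method name below are made up for illustration; this is not the actual Ruby serialize code, just the three steps in the order I have in mind.

```python
from dataclasses import dataclass, field


@dataclass
class CacheLine:
    data: int
    dirty: bool = False


@dataclass
class ToyRubySystem:
    """Hypothetical stand-in for the Ruby system; not gem5's real classes."""
    memory: dict = field(default_factory=dict)  # address -> data
    cache: dict = field(default_factory=dict)   # address -> CacheLine

    def record_cache_trace(self):
        # Step 1: record only which addresses are cached. Dirty state is
        # deliberately left out so the trace stays protocol agnostic.
        return sorted(self.cache)

    def flush_caches(self):
        # Step 2: write dirty lines back to memory, then invalidate all lines.
        for addr, line in self.cache.items():
            if line.dirty:
                self.memory[addr] = line.data
        self.cache.clear()

    def serialize(self):
        trace = self.record_cache_trace()  # taken while lines are still cached
        self.flush_caches()                # memory now holds up-to-date data
        memory_image = dict(self.memory)   # Step 3: checkpoint the memory
        return {"cache_trace": trace, "memory_image": memory_image}


def demo():
    system = ToyRubySystem(memory={0x40: 1, 0x80: 2})
    system.cache[0x40] = CacheLine(data=99, dirty=True)  # stale in memory
    system.cache[0x80] = CacheLine(data=2)               # clean copy
    return system.serialize()
```

Because the trace is recorded before the flush, it still lists every cached address even though the caches are empty by the time the memory image is written.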

Brad


> -----Original Message-----
> From: Nilay Vaish [mailto:[email protected]]
> Sent: Thursday, December 08, 2011 3:58 PM
> To: Beckmann, Brad
> Cc: Steve Reinhardt; Ali Saidi; Gabe Black; Nathan Binkert; Default
> Subject: RE: Review Request: Ruby: Resurrect Cache Warmup Capability
> 
> Brad, you are right. Now that I think of it, it really does not make much
> sense to take periodic checkpoints when the simulation is in timing mode
> (and not in atomic mode), as checkpointing interferes with the timing.
> 
> I was thinking about checkpointing the memory image. I have not been able
> to convince myself of a reasonably correct way of doing this. We need to
> flush the caches before we can take a checkpoint. It appears this can only
> happen while the system is draining. My understanding of cache flushing is
> that it writes the data back to memory and invalidates the cache line.
> Since the cache no longer has the line, we cannot have that line in the
> cache trace. It seems only the lines that have access permission
> Read_Only can be part of the cache trace. Is my understanding correct?
> 
> Thanks
> Nilay
> 
> On Thu, 8 Dec 2011, Beckmann, Brad wrote:
> 
> > I'm curious to know why you want to support periodic checkpointing
> > with Ruby.  Certainly periodic checkpointing with the Classic memory
> > system is desired, especially in atomic mode.  It makes sense to use
> > Classic+atomic w/ periodic checkpointing to find the interesting parts
> > of a workload and then run from those interesting checkpoints using
> > more detailed simulation (Ruby, O3, etc.).  However, due to the
> > slowdown of Ruby, it is not clear to me why one would want to use
> > periodic checkpointing with Ruby.  Furthermore, as you know, taking a
> > Ruby checkpoint perturbs the system.  Ruby requires that all
> > outstanding requests be completed before checkpointing the memory and
> > cache state.
> > I would like to avoid having to take a Ruby checkpoint unless
> > absolutely necessary.  One may argue that we should checkpoint all the
> > outstanding state in Ruby so that checkpointing doesn't perturb the
> > system, but I strongly believe that it is important to make Ruby
> > checkpoints protocol and configuration agnostic.  Tuning workloads is
> > a tough job and once one creates a good set of checkpoints, you want
> > to leverage that work as much as possible.
> >
> > Brad
> >
> >
> >
> >> -----Original Message-----
> >> From: Nilay Vaish [mailto:[email protected]]
> >> Sent: Thursday, December 08, 2011 6:56 AM
> >> To: Beckmann, Brad
> >> Cc: Steve Reinhardt; Ali Saidi; Gabe Black; Nathan Binkert; Default
> >> Subject: RE: Review Request: Ruby: Resurrect Cache Warmup Capability
> >>
> >> Brad, but flushing the caches would mean that we cannot support
> >> periodic checkpointing.
> >>
> >> --
> >> Nilay
> >>
> >> On Wed, 7 Dec 2011, Beckmann, Brad wrote:
> >>
> >>> Switching to email.
> >>>
> >>> The thing to remember is the cache trace doesn’t keep track of
> >>> whether shared data is dirty or not.  It simply marks that address
> >>> for a load request.  We don’t want to store dirty state in the cache
> >>> trace since we want to make these traces protocol agnostic and each
> >>> protocol can potentially manage dirty data differently.  That is why
> >>> the current patch breaks those checks.
> >>>
> >>> Brad
> >>>
> >>>
> >>>
> >>> Brad, thanks for the review. I can take care of all of the things you
> >>> have pointed out. I'll add functions for serializing and unserializing
> >>> the memory image.
> >>>
> >>> But I have other questions. Is flushing the cache necessary? If we are
> >>> correctly restoring the data in the caches, I think that we can
> >>> checkpoint the memory image even with stale data. Secondly, why were
> >>> those checks breaking earlier? I picked those lines directly from the
> >>> patch you had provided to Somayeh.
> >>>
> >>> - Nilay
> >>>
> >
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
