Dan:

The trace buffer itself doesn't allow you to make the counters wider.
It gives you a place to store the counts periodically.  This is used to
create a histogram of how the counts accumulated.  As you said, you
could then sum the stored values from the trace buffer in a 64-bit
variable to present a count that is wider than 16 or 32 bits.  I suspect
the overhead of doing this would make virtual 64-bit counter
support expensive.  Also, it would not be possible to generate
interrupts when the counters are full, which is needed for sampling.  I
haven't thought this through completely, but I think trying to do virtual
counters with the trace buffer would not be very clean or efficient.
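
To make the summing idea concrete, here is a minimal sketch in C.  It
is only an illustration under stated assumptions, not code from perfmon2
or the Cell support: it assumes the interval counts have already been
copied out of the trace buffer into an array of 32-bit values, one count
per entry, and the names TRACE_ENTRIES and sum_trace_counts are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trace array depth mentioned in the thread (1024 entries). */
#define TRACE_ENTRIES 1024

/*
 * Hypothetical helper: accumulate per-interval counts, already copied
 * out of the trace buffer, into one 64-bit total.  The layout (one
 * 32-bit count per entry) is an assumption made for illustration.
 */
static uint64_t sum_trace_counts(const uint32_t *counts, size_t n)
{
    uint64_t total = 0;

    assert(counts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += counts[i];   /* widened to 64 bits before the add */

    return total;
}
```

Even if all 1024 entries held the maximum 32-bit value, the total would
be 1024 * (2^32 - 1), which fits in 42 bits.  That is the extra 10 bits
of range Dan describes, paid for by one pass over the array each time
the trace buffer fills.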

The debugger they talk about is a hardware-level debug facility.
Basically, the debug bus, which is used to route the performance counter
signals from the islands to the performance counters, can also be used to
route internal hardware signals to a debug port where you then connect
up a hardware debugger (logic analyzer).  This debugger is independent
of your software debuggers such as GDB.  You can enable either the
performance counters or the hardware debug at a given time, but not
both.

             Carl Love


On Tue, 2007-03-27 at 15:36 -0400, Dan Terpstra wrote:
> Sorry for being so late to this conversation, and for being naïve about Cell
> implementation. My reading of the May 2006 BE Handbook suggested that
> counter values are (or can be) automatically stored to the 1024 entry trace
> array on interval timer timeout; and that an interrupt can be generated on
> trace array full. Could this feature be used to increase the effective width
> of the counters by 10 (2^10 = 1024) bits? This could reduce interrupt
> handling significantly, but would require summing the values across the
> trace array.
> Also, there are repeated warnings that the counter logic and the debug logic
> share the same hardware. Does this imply that the debugger dies if the
> counters are in use? Or that the debugger stomps on Perfmon? Will it be
> possible to use debuggers and Perfmon simultaneously?
> Curiosity killed the cat...
> - d
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:perfmon-
> > [EMAIL PROTECTED] On Behalf Of Carl Love
> > Sent: Monday, March 26, 2007 8:33 PM
> > To: [EMAIL PROTECTED]
> > Cc: [EMAIL PROTECTED]; William Cohen; Carl Love; Kevin Corry;
> > Philip Mucci
> > Subject: Re: [perfmon] Cell port for Perfmon
> > 
> > Stephen:
> > 
> > Right, sorry I put the wrong variable in my message.  I think the key
> > thing is that the register on Cell that holds the mask of overflowed
> > counters is read and stored in set->povfl_pmds.  The hardware
> > automatically clears the bits in the Cell register as a side effect of
> > reading the register.  Then povfl_pmds is used in the overflow handler
> > to process all of the registers.
> > 
> > If we had to read the Cell register to get the pmd overflow each time in
> > a loop to see if register i had overflowed, we would have a problem in
> > that the overflow bits would have been cleared when the first pmd
> > register was processed.  So I think the architecture of the code will
> > work given the underlying hardware design.  On Cell, we will not have to
> > do anything to clear the interrupt mask.
> > 
> >          Carl Love
> > 
> > 
> > On Mon, 2007-03-26 at 15:19 -0800, Stephane Eranian wrote:
> > > Carl,
> > >
> > > On Mon, Mar 26, 2007 at 04:05:15PM -0800, Carl Love wrote:
> > > > If I read the overflow code correctly, the mask of the registers that
> > > > overflowed is stored in set->reset_pmds before the overflow handler is
> > > > called.  Then the overflow handler does all of the registers in a loop.
> > > > It then determines if there were any 64-bit counter overflows or if the
> > > > overflow was simply an overflow of the smaller HW counter register.
> > > > From what I see so far, it seems like the Cell interrupt enable/overflow
> > > > reporting should work ok within the perfmon2 code structure.
> > > >
> > > Not quite. Upon entering the interrupt handler, the PMU is frozen
> > > and a bitmask of overflowed counters is constructed in arch-specific
> > > fashion. On IA-64 (like Cell), it's just a matter of reading a control
> > > register. On i386, there is no overflow mask; you need to inspect
> > > all used counters and check their values. The collected information is
> > > in set->povfl_pmds and set->npend_ovfls. The worst processor is the P4
> > > because to freeze the PMU, you need to clear the control register, which
> > > also holds the overflow bit (OVF).
> > >
> > > The interrupt handler then scans povfl_pmds to update the 64-bit
> > > software-maintained counter values. If it detects a 64-bit overflow,
> > > it records a sample and/or notifies user level. Otherwise
> > > execution resumes.
> > >
> > > Upon leaving the interrupt handler, the PMU is unfrozen unless
> > > the sampling buffer became full in which case (default format)
> > > monitoring remains stopped.
> > >
> > > > On Mon, 2007-03-26 at 17:45 -0500, Kevin Corry wrote:
> > > > > Hi Stephane,
> > > > >
> > > > > On Mon March 26 2007 5:30 pm, Stephane Eranian wrote:
> > > > > > > > > I think it should not be too much work to put the field within the
> > > > > > > > > description table. With a flag, high level perfmon can just skip
> > > > > > > > > consulting this field and go with a default.
> > > > > > >
> > > > > > > > Yeah, I had similar thoughts about how to support multiple counter sizes.
> > > > > > > > It should be relatively easy to add a counter_size field to the pfm_pmd
> > > > > > > > structure and consult that in the overflow handling code.
> > > > > >
> > > > > > > Yes, that is one place where the mask is used. But it is also used
> > > > > > > when we write and read PMD registers (counters). I don't know how this
> > > > > > > works on Cell, but on x86, you need to set the upper bits of a counter
> > > > > > > for it to trigger the PMU interrupt on overflow. For that you also need
> > > > > > > to apply the counter width mask. The mask may also be used to determine
> > > > > > > which counter overflowed, unless Cell provides a bitmask for that already.
> > > > >
> > > > > Interesting, I didn't realize that. I had only worked with the
> > Pentium4
> > > > > previously, and it has the counter-overflow bit and the interrupt-
> > enable bit
> > > > > in the per-counter control reigsters (CCCRs).
> > > > >
> > > > > On Cell, there is one global pm_status control register that is used
> > to enable
> > > > > interrupts for each counter and to determine which counters
> > overflowed (along
> > > > > with some status bits related to the hardware sampling feature).
> > > > >
> > > > > Hmmm....now that I take another glance at the Cell PMU docs, I see
> > that
> > > > > reading the pm_status register clears all the status bits and resets
> > the
> > > > > pending interrupts. This means that the overflow handler may have to
> > handle
> > > > > the overflow of multiple counters in one run (in addition to dealing
> > with
> > > > > hardware-sampling interrupts). I haven't gone through Perfmon2's
> > overflow
> > > > > interrupt handling enough to know if this will cause any problems.
> > Any
> > > > > thoughts?
> > > > >
> > > > > Thanks,
> > > >
> > > > _______________________________________________
> > > > perfmon mailing list
> > > > [email protected]
> > > > http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
> > >
> > 
> 
> 

