On Fri, 11 Jan 2008, dean gaudet wrote:
> On Fri, 11 Jan 2008, Ingo Molnar wrote:
>
> > * Andi Kleen <[EMAIL PROTECTED]> wrote:
> >
> > > Cached requires the cache line to be read first before you can write
> > > it.
> >
> > nonsense, and you should know it. It is perfectly possible to constru
On Fri, 11 Jan 2008 09:02:46 -0800 (PST)
dean gaudet <[EMAIL PROTECTED]> wrote:
> > Bulk ops (string ops, etc.) will do full cacheline writes too,
> > without filling in the cacheline.
>
> on intel with fast strings enabled yes. mind you intel gives hints in
> the documentation these operations
On Fri, 11 Jan 2008, Ingo Molnar wrote:
> * Andi Kleen <[EMAIL PROTECTED]> wrote:
>
> > Cached requires the cache line to be read first before you can write
> > it.
>
> nonsense, and you should know it. It is perfectly possible to construct
> fully written cachelines, without reading the cache
> It is perfectly possible to construct
> fully written cachelines, without reading the cacheline first. MOVDQ is
If you write an aligned full 64 (or 128) byte area and even then you can
have occasional reads which can be either painfully slow or even incorrect.
> but that's totally besides th
> Write-Combining can be very useful for devices that are behind a slow or
> a high-latency transport, such as PCI, and which are mapped UnCached
That is what I wrote! If you meant the same we must have been
spectacularly miscommunicating.
-Andi
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > > I think you have it fundamentally backwards: the best for
> > > performance is WB + cflush. What would WC offer for performance
> > > that cflush cannot do?
> >
> > Cached requires the cache line to be read first before you can write
> > it.
>
>
* Andi Kleen <[EMAIL PROTECTED]> wrote:
> > > > but that's not too smart: why dont they use WB plus cflush
> > > > instead?
> > >
> > > Because they need to access it WC for performance.
> >
> > I think you have it fundamentally backwards: the best for
> > performance is WB + cflush. What wou
On Thu, Jan 10, 2008 at 01:22:04PM +0100, Ingo Molnar wrote:
>
> * Andi Kleen <[EMAIL PROTECTED]> wrote:
>
> > > What is very real though are the hard limitations of MTRRs. So i'd
> > > rather first like to see a clean PAT approach (which all other
> > > modern OSs have already migrated to in t
* Andi Kleen <[EMAIL PROTECTED]> wrote:
> > What is very real though are the hard limitations of MTRRs. So i'd
> > rather first like to see a clean PAT approach (which all other
> > modern OSs have already migrated to in the past 10 years)
>
> That's mostly orthogonal. Don't know why you bring
On Thu, Jan 10, 2008 at 11:57:26AM +0100, Ingo Molnar wrote:
>
> > > > > WBINVD isnt particular fast (takes a few msecs), but why is
> > > > > that a problem? Drivers dont do high-frequency ioremap-ing.
> > > > > It's typically only done at driver/device startup and that's
> > > > > it.
other modern
That's mostly orthogonal. Don't know why you bring it up now?
Anyways more efficient c_p_a() makes PAT usage easier.
> structural cleanups and bugfixes you did as well, which would allow us
> to phase out MTRR use (of the DRM drivers, etc.), and _then_ layer an
* Andi Kleen <[EMAIL PROTECTED]> wrote:
> > > > WBINVD isnt particular fast (takes a few msecs), but why is
> > > > that a problem? Drivers dont do high-frequency ioremap-ing.
> > > > It's typically only done at driver/device startup and that's
> > > > it.
> > >
> > > Actually graphic
On Thu, Jan 10, 2008 at 08:20:26PM +1000, Dave Airlie wrote:
> This is only possible as long as we know all the parts involved, for
> example on AMD we have problems with that
> over-eager prefetching so for drivers on AMD chipsets we have to do
> something else more than likely using change_page_a
which all other modern
OSs have already migrated to in the past 10 years), with all the
structural cleanups and bugfixes you did as well, which would allow us
to phase out MTRR use (of the DRM drivers, etc.), and _then_ layer an
(optional) cflush approach basically as the final step. Right now cflus
On Jan 10, 2008 7:55 PM, Andi Kleen <[EMAIL PROTECTED]> wrote:
> On Thu, Jan 10, 2008 at 07:44:03PM +1000, Dave Airlie wrote:
> > >
> > > finally managed to get the time to review your CPA patchset, and i
> > > fundamentally agree with most of the detail chan
On Thu, Jan 10, 2008 at 11:04:43AM +0100, Ingo Molnar wrote:
>
> * Andi Kleen <[EMAIL PROTECTED]> wrote:
>
> > > WBINVD isnt particular fast (takes a few msecs), but why is that a
> > > problem? Drivers dont do high-frequency ioremap-ing. It's
> > > typically only done at driver/device st
* Andi Kleen <[EMAIL PROTECTED]> wrote:
> > WBINVD isnt particular fast (takes a few msecs), but why is that a
> > problem? Drivers dont do high-frequency ioremap-ing. It's
> > typically only done at driver/device startup and that's it.
>
> Actually graphics drivers can do higher frequen
* Dave Airlie <[EMAIL PROTECTED]> wrote:
> > - firstly, there's no rationale given. So we'll change ioremap()/etc.
> > from doing a cflush-range instruction instead of a WBINVD. But why?
> > WBINVD isnt particular fast (takes a few msecs), but why is that a
> > problem? Drivers dont do high
On Thu, Jan 10, 2008 at 07:44:03PM +1000, Dave Airlie wrote:
> >
> > finally managed to get the time to review your CPA patchset, and i
> > fundamentally agree with most of the detail changes done in it. But here
> > are a few structural high-level observations:
>
On Thu, Jan 10, 2008 at 10:31:26AM +0100, Ingo Molnar wrote:
>
> Andi,
>
> finally managed to get the time to review your CPA patchset, and i
> fundamentally agree with most of the detail changes done in it. But here
> are a few structural high-level observations:
I have
>
> finally managed to get the time to review your CPA patchset, and i
> fundamentally agree with most of the detail changes done in it. But here
> are a few structural high-level observations:
>
> - firstly, there's no rationale given. So we'll change ioremap()/etc.
Andi,
finally managed to get the time to review your CPA patchset, and i
fundamentally agree with most of the detail changes done in it. But here
are a few structural high-level observations:
- firstly, there's no rationale given. So we'll change ioremap()/etc.
from doing a cf