Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-09 Thread Ivan Kokshaysky

On Mon, Apr 09, 2001 at 12:02:54PM +0200, Maciej W. Rozycki wrote:
>  I think you need an mb here.  To force synchronization with other CPUs.
> Unless you know you are UP or there is no possibility another CPU may
> access the relevant device.

Yes - in most cases you need synchronization at a higher level.
For instance, you don't want other CPUs accessing the device while
you are sending command sequences to it.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-09 Thread Maciej W. Rozycki

On Sun, 8 Apr 2001, Ivan Kokshaysky wrote:

> Of course. I meant that if you are reading, for example, some status register
> in a loop waiting for "ready bit" set, the memory barrier won't help you
> to notice this event any faster. Actually you'll notice that *later*, as
> "mb" is expensive.

 I think you need an mb here.  To force synchronization with other CPUs.
Unless you know you are UP or there is no possibility another CPU may
access the relevant device.

 Of course mbs hit performance, but it's a trade-off for coherency.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-08 Thread Ivan Kokshaysky

On Fri, Apr 06, 2001 at 07:13:21PM +0200, Maciej W. Rozycki wrote:
>  You do.  PCI-space registers are volatile and they may change depending
> on what was written (or read) previously.  A memory barrier before a PCI
> read will ensure you get a value that is relevant to previous code
> actions.  Without a barrier you may get pretty much anything, depending on
> which of previous writes managed to complete before. 

Of course. I meant that if you are reading, for example, some status register
in a loop waiting for "ready bit" set, the memory barrier won't help you
to notice this event any faster. Actually you'll notice that *later*, as
"mb" is expensive.

Well, here is some info on ev6 IO write buffers - they are a bit different
than ev4/ev5 ones.
Merging rules:
 - byte/word stores aren't allowed to merge into a write buffer;
 - different size stores (32- and 64-bit) aren't allowed to merge;
 - addresses must be in ascending order and non-overlapping,
   but not necessarily consecutive.
The I/O register merge window closes (i.e. the write buffer is flushed)
 - after mb and wmb instructions;
 - after an IO-space load instruction (!);
 - after 1024 cycles if there were no IO-space stores.
Store requests are sent offchip in program order (!).

All this explains, in particular, why XFree86-4.0 worked on ev6 without
memory barriers of any kind, while it crashed badly on ev4/ev5.
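
The ev6 merging rules listed above can be sketched as a small predicate.
This is only an illustration, not code from any kernel or manual: the
names io_store, can_merge and WB_ENTRY_BYTES are made up, and the rule
that both stores must land in the same aligned 64-byte entry is an
assumption based on the "4x64 on ev6" buffer size quoted elsewhere in
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte write-buffer entries ("4x64 on ev6"). */
#define WB_ENTRY_BYTES 64u

/* Hypothetical description of one I/O-space store. */
struct io_store {
    uint64_t addr;   /* byte address in I/O space */
    unsigned size;   /* store width in bytes: 1, 2, 4 or 8 */
};

/* Return true if `next` may merge into the entry whose last store was `prev`. */
static bool can_merge(struct io_store prev, struct io_store next)
{
    /* Byte and word (16-bit) stores never merge. */
    if (prev.size < 4 || next.size < 4)
        return false;
    /* 32-bit and 64-bit stores do not mix in one entry. */
    if (prev.size != next.size)
        return false;
    /* Addresses must ascend and must not overlap; gaps are allowed. */
    if (next.addr < prev.addr + prev.size)
        return false;
    /* Assumption: both stores fall within the same aligned entry. */
    return prev.addr / WB_ENTRY_BYTES == next.addr / WB_ENTRY_BYTES;
}
```

Under this model two ascending 32-bit stores with a gap between them
still merge, while a descending pair or a 32-bit/64-bit mix starts a new
entry.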

Ivan.



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-06 Thread Maciej W. Rozycki

On Fri, 6 Apr 2001, Andrea Arcangeli wrote:

> ev6 works the way you described AFAIK (to flush the write buffer you can use

 Thanks for the clarification -- you made me calm down.

> wmb(), note that wmb() semantics doesn't require the cpu to really "flush" but
> just to keep writes ordered across other mb or wmb, but it's basically the
> same from a software point of view and flushing the write buffer synchronously
> obviously provides that semantics).  I didn't follow very closely the

 Of course -- you only want to do mb (and not wmb) if you need to meet
hw's specific timing or you want to perform a read from a volatile
register of a peripheral device. 

> previous part of the thread so I'm not sure what is the issue.

 Someone complained of Alpha not having Intel-style MTRRs to set write
combining for fb memory...

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-06 Thread Andrea Arcangeli

On Fri, Apr 06, 2001 at 07:27:24PM +0200, Maciej W. Rozycki wrote:
> [..] You normally have
> non-cached locations buffered (since you don't always need peripheral
> device accesses to be posted immediately) and can force a writeback with a
> memory barrier. [..]

ev6 works the way you described AFAIK (to flush the write buffer you can use
wmb(); note that wmb() semantics don't require the cpu to really "flush" but
just to keep writes ordered across other mb or wmb, but it's basically the
same from a software point of view, and flushing the write buffer synchronously
obviously provides those semantics).  I didn't follow the previous part of
the thread very closely, so I'm not sure what the issue is.

Andrea



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-06 Thread Maciej W. Rozycki

On 6 Apr 2001, Eric W. Biederman wrote:

> I recall on the ev6 all memory accesses to locations with bit 40 set
> are always to IO space, are never cached, and are never write buffered.

 If that is the case then EV6 is seriously flawed.  You normally have
non-cached locations buffered (since you don't always need peripheral
device accesses to be posted immediately) and can force a writeback with a
memory barrier.  I don't have my 21264 handbook handy, so I can't check
EV6 details at the moment, especially why it is different.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-06 Thread Maciej W. Rozycki

On Fri, 6 Apr 2001, Ivan Kokshaysky wrote:

> >  Memory barriers are a separate issue.  On the alpha the
> > natural way to implement it would be in the page table fill code.
> > Memory barriers are o.k. but they really don't help the case when what
> > you want to do is read the latest value out of a pci register.  
> 
> You don't need memory barrier for that. "Write memory barriers" are
> used to ensure correct write order, and "memory barriers" are used
> to ensure that all pending reads/writes will complete before next read
> or write.

 You do.  PCI-space registers are volatile and they may change depending
on what was written (or read) previously.  A memory barrier before a PCI
read will ensure you get a value that is relevant to previous code
actions.  Without a barrier you may get pretty much anything, depending on
which of previous writes managed to complete before. 
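
A minimal sketch of the pattern under discussion, assuming a made-up
device with a command and a status register modelled as ordinary memory.
One barrier sits between the command write and the first status read;
the polling loop itself carries no barrier, since a barrier there would
not make the ready bit appear any sooner.  The names fake_dev, mb_sketch
and dev_command_and_wait are hypothetical, and the C11 fence is only a
userspace stand-in for the kernel's mb(); it does not drain a real I/O
write buffer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_READY 0x1u

/* Hypothetical device: two registers modelled as plain memory. */
struct fake_dev {
    volatile uint32_t cmd;
    volatile uint32_t status;
};

/* Userspace stand-in for mb(); an assumption, not the kernel macro. */
static inline void mb_sketch(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Issue a command, then poll the ready bit at most max_polls times. */
static bool dev_command_and_wait(struct fake_dev *dev, uint32_t command,
                                 unsigned max_polls)
{
    assert(dev != NULL);

    dev->cmd = command;
    mb_sketch();   /* the status read must not run ahead of the write */

    for (unsigned i = 0; i < max_polls; i++) {
        /* Volatile read each iteration; no barrier inside the loop. */
        if (dev->status & STATUS_READY)
            return true;
    }
    return false;  /* device never reported ready */
}
```

Whether other CPUs may touch the device during this sequence is a
separate question that needs a lock at a higher level.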

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-06 Thread Maciej W. Rozycki

On 5 Apr 2001, Eric W. Biederman wrote:

> The point is on the Alpha all ram is always cached, and i/o space is
> completely uncached.  You cannot do write-combining for video card

 You don't want to cache fb memory, do you?  All you want is write
combining and you achieve it with memory barriers.  You write to fb memory
space whatever you need to and write buffers actually deliver data to fb
memory whenever the bus is idle or they get filled up.  When you finally
decide you wrote all data and you want to ensure it actually reaches the fb
memory before you perform an operation (say you send a command to fb's
support circuitry) you issue a write memory barrier.  Or a memory barrier,
if you want to ensure the data reaches the fb memory ASAP.

 In other words, you have write-combining by default and request
write-through explicitly.
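
The sequence described above (fill the frame buffer, barrier, then send
the command) might look roughly like this.  It is a userspace sketch
only: fake_fb, wmb_sketch and fb_fill_and_kick are hypothetical names,
the "device" is plain memory, and the C11 release fence merely stands in
for the kernel's wmb() without touching any real I/O write buffer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define FB_WORDS 16

/* Hypothetical device model: plain memory standing in for MMIO. */
struct fake_fb {
    volatile uint32_t mem[FB_WORDS];  /* frame buffer aperture */
    volatile uint32_t cmd;            /* command register */
};

/* Stand-in for wmb(); an assumption, not the kernel macro. */
static inline void wmb_sketch(void)
{
    atomic_thread_fence(memory_order_release);
}

/* Fill the frame buffer, then issue a command that must see the data. */
static void fb_fill_and_kick(struct fake_fb *fb, uint32_t pixel,
                             uint32_t command)
{
    assert(fb != NULL);

    /* These stores may sit in write buffers and be merged. */
    for (size_t i = 0; i < FB_WORDS; i++)
        fb->mem[i] = pixel;

    wmb_sketch();        /* order the pixel stores before the command */
    fb->cmd = command;
}
```

No barrier is issued per pixel; the single barrier before the command
write is what requests the "write-through".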

> memory.  Memory barriers are a separate issue.  On the alpha the
> natural way to implement it would be in the page table fill code.

 Please forgive me -- I can't see how this is related to write combining.

> Memory barriers are o.k. but they really don't help the case when what
> you want to do is read the latest value out of a pci register.  

 They do -- you issue an mb and you are sure all pending writes reached
the involved PCI hw. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-06 Thread Eric W. Biederman

Ivan Kokshaysky <[EMAIL PROTECTED]> writes:

> On Thu, Apr 05, 2001 at 12:20:22PM -0600, Eric W. Biederman wrote:
> > The point is on the Alpha all ram is always cached, and i/o space is
> > completely uncached.  You cannot do write-combining for video card
> > memory.
> 
> Incorrect. Alphas have write buffers - 6x32 bytes on ev5 and
> 4x64 on ev6, IIRC. So alphas do write up to 32 or 64 bytes
> in a single pci transaction.

Sorry, I was thinking of the current alpha, the ev6.  So what I'm saying
doesn't apply to the alpha architecture in general, just its current
specific implementation.

Yes, on the ev6 you have write buffers, but you can't say "just use the
write buffers" on an arbitrary area of memory.
 
> >  Memory barriers are a separate issue.  On the alpha the
> > natural way to implement it would be in the page table fill code.
> > Memory barriers are o.k. but they really don't help the case when what
> > you want to do is read the latest value out of a pci register.  
> 
> You don't need memory barrier for that. "Write memory barriers" are
> used to ensure correct write order, and "memory barriers" are used
> to ensure that all pending reads/writes will complete before next read
> or write.

100% Agreed. That is what I was saying.  What the ev6 doesn't have
is the ability to say this: I am using this area of the memory address
space in a particular way: don't cache it but do write combining on it.

Theoretically you could use memory barrier instructions for this but
it would require an I/O bus that supported a cache coherency
protocol.  At which point the problem moves down to your PCI bus
controller.

I recall on the ev6 all memory accesses to locations with bit 40 set
are always to IO space, are never cached, and are never write buffered.
Accesses to memory locations with bit 40 clear are always to RAM, are
always cached, and are always write buffered.

With the high I/O bus speeds, unless you are trying to push things to
the absolute limit, you are unlikely to see the IO accesses being the
bottleneck in or out of a PCI device.  And at that point DMA probably
already compensates for most devices.

IIRC For PCI card IO regions where you need maximum IO speed through
the memory address space (like frame buffers) the ev6 falls down.

I really like the alpha, which is why this galls me so much about the
ev6.  I hope they have it fixed for the ev7 or ev8, if those chips
ever actually arrive.  But as the ev7 is just supposed to be the ev6
core with an on-chip cache, I don't have much hope.

Eric



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-06 Thread Ivan Kokshaysky

On Thu, Apr 05, 2001 at 12:20:22PM -0600, Eric W. Biederman wrote:
> The point is on the Alpha all ram is always cached, and i/o space is
> completely uncached.  You cannot do write-combining for video card
> memory.

Incorrect. Alphas have write buffers - 6x32 bytes on ev5 and
4x64 on ev6, IIRC. So alphas do write up to 32 or 64 bytes
in a single pci transaction.
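
As a rough illustration of what those buffer sizes buy, here is an
idealized count of bus transactions for a run of sequential, aligned
stores.  The function name bus_transactions is made up, and the model
assumes every transaction is filled completely, which real hardware will
not always manage.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bus transactions needed to push `bytes` of sequential,
 * aligned stores when up to `buf_bytes` can be merged per transaction.
 * Idealized: assumes every transaction is filled completely. */
static size_t bus_transactions(size_t bytes, size_t buf_bytes)
{
    assert(buf_bytes > 0);
    return (bytes + buf_bytes - 1) / buf_bytes;  /* ceiling division */
}
```

For a 1024-byte scanline this gives 256 transactions with unmerged
32-bit stores (buf_bytes = 4), 32 with 32-byte buffers, and 16 with
64-byte buffers.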

>  Memory barriers are a separate issue.  On the alpha the
> natural way to implement it would be in the page table fill code.
> Memory barriers are o.k. but they really don't help the case when what
> you want to do is read the latest value out of a pci register.  

You don't need a memory barrier for that. "Write memory barriers" are
used to ensure correct write order, and "memory barriers" are used
to ensure that all pending reads/writes will complete before the next read
or write.

Ivan.



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-05 Thread Eric W. Biederman

"Maciej W. Rozycki" <[EMAIL PROTECTED]> writes:

> On Thu, 5 Apr 2001, Geert Uytterhoeven wrote:
> 
> > > 32bit writes on a bus with a word size of 64 or more bits.  By the way
> > > does anyone know who didn't implement MTRR's or the equivalent on
> > > alpha so we can shoot them?
> > 
> > > People never get shot in Open Source projects. Not when they write buggy code,
> > > not when they don't implement some features.
> 
>  Was DEC Alpha an Open Source project? ;-)
> 
>  Memory barriers are more RISC-styled and more flexible anyway (e.g. you
> can't run out of them ;-) ), though they require a greater care when
> writing code.  MTRRs are the Intel style of complicating designs.  Still
> they are probably a reasonable solution to preserve DOS compatibility. 

The point is on the Alpha all ram is always cached, and i/o space is
completely uncached.  You cannot do write-combining for video card
memory.  Memory barriers are a separate issue.  On the alpha the
natural way to implement it would be in the page table fill code.
Memory barriers are o.k. but they really don't help the case when what
you want to do is read the latest value out of a pci register.

Eric






Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-05 Thread Maciej W. Rozycki

On Thu, 5 Apr 2001, Geert Uytterhoeven wrote:

> > 32bit writes on a bus with a word size of 64 or more bits.  By the way
> > does anyone know who didn't implement MTRR's or the equivalent on
> > alpha so we can shoot them?
> 
> People never get shot in Open Source projects. Not when they write buggy code,
> not when they don't implement some features.

 Was DEC Alpha an Open Source project? ;-)

 Memory barriers are more RISC-styled and more flexible anyway (e.g. you
can't run out of them ;-) ), though they require greater care when
writing code.  MTRRs are the Intel style of complicating designs.  Still
they are probably a reasonable solution to preserve DOS compatibility. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-05 Thread Geert Uytterhoeven

On 5 Apr 2001, Eric W. Biederman wrote:
> 32bit writes on a bus with a word size of 64 or more bits.  By the way
> does anyone know who didn't implement MTRR's or the equivalent on
> alpha so we can shoot them?

People never get shot in Open Source projects. Not when they write buggy code,
not when they don't implement some features.

Gr{oetje,eeting}s,

Geert

P.S. Perhaps ESR tends to disagree? ;-)
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-05 Thread Eric W. Biederman

James Simmons <[EMAIL PROTECTED]> writes:

> >>> As long as you are copying in real memory. So the PCI bus or the host bridge
> >>> implementation may be the actual limit.
> >>
> >> The CyrixIII sits on the same host bridges as the intel processors
> >
> >I don't know if it applies to this case but one thing I have seen make
> >a noticeable difference is whether or not write-combining is enabled.
> >If we have only been enabling MTRR's for intel this could account
> >for it.
> 
> I think what Geert was trying to point out is: does MTRR perform as well
> with normal memory over bus to video memory transfers as compared to
> normal memory to normal memory transfers? MTRRs might not be optimized for
> these kinds of transfers. I honestly can't say since I haven't tried it. I
> brought the MMX book home from work so I'm going to be experimenting
> with it this weekend to find out. I'd really like to compare the MMX
> performance to the word aligned transfers over the bus I have going. I had
> a bug in my soft accel code that prevented word alignment. Once I fixed
> that bug I saw a 10 fold improvement in rendering on the framebuffer.
> I'm not kidding about that improvement either :-)
> 
> MTRRs enabled always makes a difference. I'd like to try it with and
> without. I will do some benchmarks.

While I'm thinking about it, what we really should be using is the PAT
extension and not MTRRs.  The PAT extension allows you to set the
attributes per page so you don't have the resource contention you do
with MTRRs.  I can just imagine the performance challenges right now
if you try to do a multi-head setup where the number of heads > the
number of free MTRRs.
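
The contention can be pictured with a toy model. This is only a sketch, not
kernel code: `struct mtrr_sim`, `mtrr_sim_add` and `NUM_VAR_MTRR` are
hypothetical names, and the figure of 8 variable-range registers is an
assumption (it matches P6-class Intel CPUs). The model captures just two
constraints: each write-combining region uses up one register, and a
variable range must be a power-of-two size with a base aligned to that size.

```c
#include <stdint.h>

#define NUM_VAR_MTRR 8   /* assumption: 8 variable ranges, as on P6-class CPUs */

struct mtrr_sim {
    uint64_t base[NUM_VAR_MTRR];
    uint64_t size[NUM_VAR_MTRR];   /* 0 means the slot is free */
};

/* Returns 1 if size is a non-zero power of two. */
int is_pow2(uint64_t size)
{
    return size != 0 && (size & (size - 1)) == 0;
}

/*
 * Try to claim one register for [base, base + size).
 * Returns the slot index, or -1 if the range is malformed or
 * every register is already in use.
 */
int mtrr_sim_add(struct mtrr_sim *m, uint64_t base, uint64_t size)
{
    int i;

    if (!is_pow2(size) || (base & (size - 1)) != 0)
        return -1;
    for (i = 0; i < NUM_VAR_MTRR; i++) {
        if (m->size[i] == 0) {
            m->base[i] = base;
            m->size[i] = size;
            return i;
        }
    }
    return -1;
}
```

With every slot taken, one more head simply gets no write-combining region in
this model; a per-page attribute scheme such as PAT has no such fixed pool.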

What happens when write-combining is active is that closely adjacent
writes are batched together.  Without write-combining you tend to get
32-bit writes on a bus with a word size of 64 or more bits.  By the way
does anyone know who didn't implement MTRRs or the equivalent on
Alpha so we can shoot them?
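
The batching effect can be sketched with a small counting model. It is a
simplification under stated assumptions: the bus word is taken to be 64 bits
as in the paragraph above, only consecutive stores to the same aligned bus
word are merged, and the function names are hypothetical. Real
write-combining buffers are larger and have their own flush rules.

```c
#include <stddef.h>
#include <stdint.h>

#define BUS_WORD_BYTES 8u   /* assumption: a 64-bit bus, as in the text */

/* Without write-combining every 32-bit store is its own bus transaction. */
size_t bus_transactions_uncombined(const uint32_t *addrs, size_t n)
{
    (void)addrs;
    return n;
}

/*
 * With write-combining, consecutive stores that land in the same aligned
 * bus word are merged into a single transaction.
 */
size_t bus_transactions_combined(const uint32_t *addrs, size_t n)
{
    size_t i, count = 0;
    uint32_t open_word = 0;
    int have_open = 0;

    for (i = 0; i < n; i++) {
        uint32_t word = addrs[i] / BUS_WORD_BYTES;

        if (!have_open || word != open_word) {
            count++;            /* flush the old word, start a new one */
            open_word = word;
            have_open = 1;
        }
    }
    return count;
}
```

For a linear run of 32-bit stores the combined count is half the uncombined
one in this model; scattered stores gain nothing.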

Eric



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-05 Thread Geert Uytterhoeven

On 5 Apr 2001, Eric W. Biederman wrote:
> 32bit writes on a bus with a word size of 64 or more bits.  By the way
> does anyone know who didn't implement MTRR's or the equivalent on
> alpha so we can shoot them?

People never get shot in Open Source projects. Not when they write buggy code,
not when they don't implement some features.

Gr{oetje,eeting}s,

Geert

P.S. Perhaps ESR tends to disagree? ;-)
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-05 Thread Maciej W. Rozycki

On Thu, 5 Apr 2001, Geert Uytterhoeven wrote:

> > 32bit writes on a bus with a word size of 64 or more bits.  By the way
> > does anyone know who didn't implement MTRR's or the equivalent on
> > alpha so we can shoot them?
> 
> People never get shot in Open Source projects. Not when they write buggy code,
> not when they don't implement some features.

 Was DEC Alpha an Open Source project? ;-)

 Memory barriers are more RISC-styled and more flexible anyway (e.g. you
can't run out of them ;-) ), though they require greater care when
writing code.  MTRRs are the Intel style of complicating designs.  Still
they are probably a reasonable solution to preserve DOS compatibility. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-05 Thread Eric W. Biederman

"Maciej W. Rozycki" [EMAIL PROTECTED] writes:

> On Thu, 5 Apr 2001, Geert Uytterhoeven wrote:
> 
> > > 32bit writes on a bus with a word size of 64 or more bits.  By the way
> > > does anyone know who didn't implement MTRR's or the equivalent on
> > > alpha so we can shoot them?
> > 
> > People never get shot in Open Source projects. Not when they write buggy code,
> > not when they don't implement some features.
> 
>  Was DEC Alpha an Open Source project? ;-)
> 
>  Memory barriers are more RISC-styled and more flexible anyway (e.g. you
> can't run out of them ;-) ), though they require a greater care when
> writing code.  MTRRs are the Intel style of complicating designs.  Still
> they are probably a reasonable solution to preserve DOS compatibility. 

The point is that on the Alpha all RAM is always cached, and I/O space is
completely uncached.  You cannot do write-combining for video card
memory.  Memory barriers are a separate issue.  On the Alpha the
natural way to implement it would be in the page table fill code.
Memory barriers are o.k. but they really don't help the case when what
you want to do is read the latest value out of a PCI register.  
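
To make the last point concrete, here is a minimal polling sketch. It is an
illustration only: `STATUS_READY` and `poll_ready` are hypothetical names, and
the register is modelled as a plain `volatile` word. Kernel code would go
through an accessor such as `readl()` on an `ioremap()`ed address rather than
a raw pointer. What matters in the sketch is that each iteration performs a
fresh load; a barrier orders accesses but does not by itself make the device's
new value appear any sooner.

```c
#include <stdint.h>

#define STATUS_READY 0x1u   /* hypothetical "ready" bit */

/*
 * Poll a device status register until the ready bit is set.
 * The volatile qualifier forces a fresh load on every iteration; on real
 * hardware the mapping must also be uncached for that load to reach the
 * device. Returns the number of reads performed, or -1 on timeout.
 */
int poll_ready(const volatile uint32_t *status, int max_reads)
{
    int reads;

    for (reads = 1; reads <= max_reads; reads++) {
        if (*status & STATUS_READY)
            return reads;
    }
    return -1;
}
```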

Eric






Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-04 Thread James Simmons


>>> As long as you are copying in real memory. So the PCI bus or the host
>>> bridge implementation may be the actual limit.
>>
>> The CyrixIII sits on the same host bridges as the intel processors
>
>I don't know if it applies to this case but one thing I have seen make
>a noticeable difference is whether or not write-combining is enabled.
>If we have only been enabling MTRRs for Intel, this could account
>for it.

I think what Geert was trying to point out is whether MTRRs perform as well
for transfers from normal memory over the bus to video memory as for
normal memory to normal memory transfers. MTRRs might not be optimized for
these kinds of transfers. I honestly can't say since I haven't tried it. I
brought the MMX book home from work so I'm going to be experimenting
with it this weekend to find out. I'd really like to compare the MMX
performance to the word-aligned transfers over the bus I have going. I had
a bug in my soft accel code that prevented word alignment. Once I fixed
that bug I saw a 10-fold improvement in rendering on the framebuffer.
I'm not kidding about that improvement either :-)

Enabling MTRRs always makes a difference. I'd like to try it with and
without. I will do some benchmarking.
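
For readers unfamiliar with the word-alignment issue, the idea can be
sketched roughly as follows. This is not the soft accel code referred to
above, only an illustration with a hypothetical function name: copy single
bytes until the destination is 4-byte aligned, then issue one aligned 32-bit
store per word, then finish the tail bytewise.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy n bytes to dst, issuing 32-bit stores once dst is 4-byte aligned.
 * Leading and trailing bytes that do not fill a whole word are copied
 * one at a time. Returns the number of 32-bit stores issued.
 */
size_t copy_word_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = 0;

    /* Head: byte stores until the destination is word aligned. */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }
    /* Body: one 32-bit store per iteration to an aligned address. */
    while (n >= 4) {
        uint32_t w;

        memcpy(&w, s, sizeof w);    /* the source may be unaligned */
        memcpy(d, &w, sizeof w);    /* compilers emit a single store here */
        d += 4;
        s += 4;
        n -= 4;
        words++;
    }
    /* Tail: remaining bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return words;
}
```

A real framebuffer path would write through the I/O accessors instead of
`memcpy`; the structure of head, aligned body and tail is the point here.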

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-04 Thread Jamie Lokier

Eric W. Biederman wrote:
> I don't know if it applies to this case but one thing I have seen make
> a noticeable difference is whether or not write-combining is enabled.
> If we have only been enabling MTRRs for Intel, this could account
> for it.

And on some laptops, even on Intel, MTRRs are not enabled for 2.5M
framebuffers.

-- Jamie



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-04 Thread Eric W. Biederman

Alan Cox <[EMAIL PROTECTED]> writes:

> > > The MMX memcpy for CyrixIII and Athlon boxes is something like twice the
> > > speed of rep movs. On most pentium II/III boxes the fast paths for rep movs
> > > and for MMX are the same speed
> > 
> > As long as you are copying in real memory. So the PCI bus or the host bridge
> > implementation may be the actual limit.
> 
> The CyrixIII sits on the same host bridges as the intel processors

I don't know if it applies to this case but one thing I have seen make
a noticeable difference is whether or not write-combining is enabled.
If we have only been enabling MTRRs for Intel, this could account
for it.

Eric



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-03 Thread Alan Cox

> > The MMX memcpy for CyrixIII and Athlon boxes is something like twice the
> > speed of rep movs. On most pentium II/III boxes the fast paths for rep movs
> > and for MMX are the same speed
> 
> As long as you are copying in real memory. So the PCI bus or the host bridge
> implementation may be the actual limit.

The CyrixIII sits on the same host bridges as the intel processors



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-03 Thread Geert Uytterhoeven

On Mon, 2 Apr 2001, Alan Cox wrote:
> > Yes, this is a problem. See my response to Paul about this. The only reason
> > I'm using MMX for the vesa framebuffer is that it has no acceleration. MMX
> > gives a big boost on genuine Intel chips. Other types of MMX are fast but
> > they don't seem to be optimized for memory transfers like MMX on Intel
> > chips. I also have regular code that does all kinds of tricks to optimize
> 
> Then you are doing something badly wrong.
> 
> The MMX memcpy for CyrixIII and Athlon boxes is something like twice the
> speed of rep movs. On most pentium II/III boxes the fast paths for rep movs
> and for MMX are the same speed

As long as you are copying in real memory. So the PCI bus or the host bridge
implementation may be the actual limit.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-02 Thread James Simmons


>Is it possible that "jump scroll" would provide more performance benefit
>than an accelerated driver anyway?

I wouldn't rule it out. If someone wants to whip up some code I would have
no problem testing it to see if it is worth it.

>Seeing as you bring up this topic of writing a 9525 driver.  It seems to
>me rather wasteful that you (collectively linux framebuffer authors),
>XFree86 and Berlin are all writing drivers for the same, hugely diverse
>class of hardware, to support more or less the same ops on the hardware.
>
>Isn't it possible to pool the development effort of video drivers?  Doesn't
>X require basically the same set of operations as the kernel?  I.e.,
>initialise the card and video mode (usually the very complex part); do
>some rendering ops (usually fairly simple).  Sure, X provides a few more
>kinds of rendering op, but that part of the code is usually much simpler
>and smaller than the initialisation code.

Well, the goals of each are very different. Fbcon was developed to deal with
the fact that most modern video hardware doesn't support text modes but
graphical modes instead. VGA text is slowly going away. Since our goal is to
emulate a text console, we only have to provide basic support for
just this. We need to

1) Draw basic text -> Glyph operations.

2) scrolling -> hardware panning or a copy area operation.

3) scroll a region of the screen -> copy area operation.

4) Clear the display or region of display -> fillrect

5) Set color palette.

6) Manage a hardware cursor.

7) Manage the current resolution for VC switching or a mode change via
   VT_RESIZE or TIOCSWINSZ.
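
As a rough illustration of items 2 to 4, here is what unaccelerated
fillrect and copy area operations might look like on a linear 8 bpp
framebuffer held in ordinary memory. All names (`struct soft_fb`,
`soft_fillrect`, `soft_copyarea`, `soft_scroll_up`) are hypothetical and this
is not the fbcon code; there is no clipping or bounds checking, so callers
must pass rectangles that fit on screen.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8 bits-per-pixel linear framebuffer in ordinary memory. */
struct soft_fb {
    uint8_t *mem;
    size_t width;    /* pixels per row */
    size_t height;   /* rows */
};

/* Item 4: fill a w x h rectangle at (x, y) with one color. */
void soft_fillrect(struct soft_fb *fb, size_t x, size_t y,
                   size_t w, size_t h, uint8_t color)
{
    size_t row;

    for (row = 0; row < h; row++)
        memset(fb->mem + (y + row) * fb->width + x, color, w);
}

/* Items 2 and 3: copy a w x h area from (sx, sy) to (dx, dy). */
void soft_copyarea(struct soft_fb *fb, size_t dx, size_t dy,
                   size_t sx, size_t sy, size_t w, size_t h)
{
    size_t i;

    for (i = 0; i < h; i++) {
        /* Walk bottom-up when moving down so overlapping rows survive. */
        size_t row = (dy > sy) ? h - 1 - i : i;

        memmove(fb->mem + (dy + row) * fb->width + dx,
                fb->mem + (sy + row) * fb->width + sx, w);
    }
}

/* Scroll the whole screen up by `rows` pixel rows and clear the gap. */
void soft_scroll_up(struct soft_fb *fb, size_t rows, uint8_t bg)
{
    if (rows > fb->height)
        rows = fb->height;
    soft_copyarea(fb, 0, 0, 0, rows, fb->width, fb->height - rows);
    soft_fillrect(fb, 0, fb->height - rows, fb->width, rows, bg);
}
```

Every scrolled line costs a copy of nearly the whole screen in this software
path, which is why hardware panning or an accelerated copy area matters.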

So fbcon exists out of necessity. Now by X you mean XFree86, which is really
an OS in itself. Its goal is to do everything itself so it can run everywhere
known to mankind. As for Berlin, I don't know the code so I can't say.
As people are finding out, XFree86 doing everything itself is having
issues. A good example is the classic problem of X dying and you having to
reboot the machine. Also, when under heavy load and you exit X to the
console, you don't get the text mode back. Well, right now it's tough luck
and you just reboot your machine. An M$ solution, but people have been doing
it so long they don't mind it. I hope to fix those problems for 2.5.X.
As you can see, I think the OS should handle the transfer from console mode
to text mode and vice versa. Now for programming the accel engine to do
graphics in userland: well, there is nothing wrong with each doing their own
thing. What does matter is that there is a GUI-independent kernel manager of
the graphics engine state. DRI attempts to handle this.

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-02 Thread Alan Cox

> Yes, this is a problem. See my response to Paul about this. The only reason
> I'm using MMX for the vesa framebuffer is that it has no acceleration. MMX
> gives a big boost on genuine Intel chips. Other types of MMX are fast but
> they don't seem to be optimized for memory transfers like MMX on Intel
> chips. I also have regular code that does all kinds of tricks to optimize

Then you are doing something badly wrong.

The MMX memcpy for CyrixIII and Athlon boxes is something like twice the
speed of rep movs. On most pentium II/III boxes the fast paths for rep movs
and for MMX are the same speed

Alan




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-01 Thread Jamie Lokier

James Simmons wrote:
> >No, it's the Trident Cyber9525
> 
> Sorry. I only have an early driver for trident 9750 and 9850. There is a
> group working on trident framebuffers.

Is it possible that "jump scroll" would provide more performance benefit
than an accelerated driver anyway?
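
A back-of-the-envelope sketch of why jump scroll could help, assuming the
console scrolls by copying the screen contents: instead of one copy per new
line, a burst of pending lines is scrolled in one step of up to `max_jump`
lines. The function names and the `max_jump` parameter are hypothetical, and
the model counts copy operations only.

```c
#include <stddef.h>

/* Smooth scrolling: one full-screen copy for every new line of output. */
size_t scroll_ops_per_line(size_t pending_lines)
{
    return pending_lines;
}

/*
 * Jump scrolling: when output arrives in bursts, scroll by up to
 * max_jump lines with a single copy. Returns the copies needed.
 */
size_t scroll_ops_jump(size_t pending_lines, size_t max_jump)
{
    if (max_jump == 0)
        return pending_lines;       /* jump scroll disabled */
    return (pending_lines + max_jump - 1) / max_jump;
}
```

In this model a 1000-line burst with a jump of 25 lines needs 40 copies
rather than 1000, independent of whether each copy is accelerated.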

Seeing as you bring up this topic of writing a 9525 driver.  It seems to
me rather wasteful that you (collectively linux framebuffer authors),
XFree86 and Berlin are all writing drivers for the same, hugely diverse
class of hardware, to support more or less the same ops on the hardware.

Isn't it possible to pool the development effort of video drivers?  Doesn't
X require basically the same set of operations as the kernel?  I.e.,
initialise the card and video mode (usually the very complex part); do
some rendering ops (usually fairly simple).  Sure, X provides a few more
kinds of rendering op, but that part of the code is usually much simpler
and smaller than the initialisation code.

Sorry if this sounds insulting -- it isn't intended that way.  I don't
really know what is involved in writing video drivers.  All I am seeing
is an _apparent_ reinventing of a rather complex wheel, when it's hard
enough as it is to keep up with all the different cards.

thanks,
-- Jamie



Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-01 Thread James Simmons


>No, it's the Trident Cyber9525

Sorry. I only have an early driver for trident 9750 and 9850. There is a
group working on trident framebuffers.

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-03-31 Thread Jamie Lokier

James Simmons wrote:
> > > You have same toshiba satellite as me, right?
> >
> > Yes
> 
> Is this the NeoMagic chipset?

No, it's the Trident Cyber9525

-- Jamie



[Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-03-30 Thread James Simmons


>I took to using X, with a single screen size xterm to present the
>illusion of console mode.

Cute trick. I have seen some slow text mode cards. As time goes on it will
get worse, since text mode support is not the prime goal anymore, especially
now that you see graphical BIOS interfaces. Some graphics card manufacturers
have dropped VGA text mode support altogether. In the next 5 years you
will see the elimination of VGA text mode.

>Probably the lack of hardware area copies has something to do with
>this.

Yes, this is a problem. See my response to Paul about this. The only reason
I'm using MMX for the vesa framebuffer is that it has no acceleration. MMX
gives a big boost on genuine Intel chips. Other implementations of MMX are
fast, but they don't seem to be optimized for memory transfers like MMX on
Intel chips. I also have regular code that does all kinds of tricks to
optimize data transfers over the bus. It needs more testing, but in my
comparison between my Voodoo 3 accel engine and this code, it ran nearly as
fast as the accelerator at all depths :-)

Another idea for 2.5.X is to implement a font cache in video memory. Even
with AGP it is just too slow to constantly transfer font data over the bus.
Of course this requires a bit of work, since we only have so much video
memory, but it is worth it for the performance improvement.
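
To illustrate the font-cache idea, here is a minimal sketch in C. It is
only a sketch under stated assumptions, not fbcon code: the names
(font_cache, font_cache_get), the 8x16 monochrome glyph size, the 64-slot
direct-mapped layout and the use of plain memcpy() are all hypothetical.
A real driver would write to video memory through its I/O accessors and
would probably want a smarter replacement policy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GLYPH_BYTES 16  /* assumption: 8x16 monochrome glyph, 1 bit per pixel */
#define CACHE_SLOTS 64  /* assumption: slots reserved in offscreen video memory */

struct font_cache {
    uint8_t *vram;              /* start of the offscreen cache area (hypothetical) */
    int owner[CACHE_SLOTS];     /* character held in each slot, -1 if empty */
    unsigned long uploads;      /* number of glyphs copied over the bus */
};

static void font_cache_init(struct font_cache *fc, uint8_t *vram)
{
    fc->vram = vram;
    fc->uploads = 0;
    for (int i = 0; i < CACHE_SLOTS; i++)
        fc->owner[i] = -1;
}

/*
 * Return the offset of the glyph for `ch` inside the cache area.
 * The glyph is copied from the system-memory font only on a miss,
 * so repeated characters cost no further bus transfers.
 */
static size_t font_cache_get(struct font_cache *fc, const uint8_t *font,
                             unsigned ch)
{
    unsigned slot = ch % CACHE_SLOTS;   /* direct-mapped: one candidate slot */
    size_t off = (size_t)slot * GLYPH_BYTES;

    if (fc->owner[slot] != (int)ch) {
        /* Miss: evict whatever was in the slot and upload the glyph. */
        memcpy(fc->vram + off, font + (size_t)ch * GLYPH_BYTES, GLYPH_BYTES);
        fc->owner[slot] = (int)ch;
        fc->uploads++;
    }
    return off;
}
```

With the glyph already in video memory, drawing a character becomes a
copy within video memory (ideally done by the card), and the uploads
counter shows how rarely the bus is touched once the working set of
characters is cached.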

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




[Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-03-30 Thread James Simmons


>The console driver does not actually use 2.5MB.  Does it make sense to
>use an MTRR for the smaller power-of-two region?

If we implement a font cache in the future, it could. Also, that extra
memory is used to allow scrollback. We could break the region up into
power-of-two pieces: a*2^n + b*2^(n-1) + c*2^(n-2) + ... = 2.5 MB. Isn't
math grand :-)
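
The decomposition is just the binary expansion of the size, so each
coefficient is 0 or 1. A minimal sketch in C, assuming a hypothetical
helper named split_pow2 (it is not part of any kernel API):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Split `size` bytes into power-of-two chunks, largest first.
 * Stores at most `max` chunk sizes in `out` and returns the number of
 * chunks the full split needs, which may be larger than `max`.
 */
static size_t split_pow2(uint64_t size, uint64_t *out, size_t max)
{
    size_t n = 0;

    for (int bit = 63; bit >= 0; bit--) {
        uint64_t chunk = (uint64_t)1 << bit;

        if (size & chunk) {         /* this power of two is part of the sum */
            if (n < max)
                out[n] = chunk;
            n++;
        }
    }
    return n;
}
```

For 2.5 MB (2621440 bytes) this yields two chunks, 2 MB and 512 KB, so
the region would need two MTRRs. The sketch only splits the size; it
does not check that each chunk's base address is aligned to its size,
which variable-range MTRRs also require.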

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-03-30 Thread James Simmons


> > You have same toshiba satellite as me, right?
>
> Yes

Is this the NeoMagic chipset?

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net



