Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-02-25 Thread Pavel Machek
On Thu 2015-01-29 14:11:25, Dave Airlie wrote:
 These two copy to/from VGA memory; however, on the Silicon
 Motion SMI750 VGA card on a 64-bit system they cause console corruption.
 
 This is due to the hw being buggy and not handling a 64-bit transaction
 correctly.
 
 We could try and create a 32-bit version of these routines,
 but I'm not sure the optimisation is worth much today.
 
 Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1132826
 
 Tested-by: Huawei engineering.
 Signed-off-by: Dave Airlie airl...@redhat.com

Actually... are you sure this is the right fix?

IOW can gcc do the optimization behind your back and still break the
buggy card?
Pavel

 diff --git a/include/linux/vt_buffer.h b/include/linux/vt_buffer.h
 index 057db7d..f38c10b 100644
 --- a/include/linux/vt_buffer.h
 +++ b/include/linux/vt_buffer.h
 @@ -21,10 +21,6 @@
  #ifndef VT_BUF_HAVE_RW
  #define scr_writew(val, addr) (*(addr) = (val))
  #define scr_readw(addr) (*(addr))
 -#define scr_memcpyw(d, s, c) memcpy(d, s, c)
 -#define scr_memmovew(d, s, c) memmove(d, s, c)
 -#define VT_BUF_HAVE_MEMCPYW
 -#define VT_BUF_HAVE_MEMMOVEW
  #endif
  
  #ifndef VT_BUF_HAVE_MEMSETW
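
On the question of whether the compiler could reintroduce wide accesses: a minimal sketch, assuming user-space C rather than kernel code. copy_plain() is an ordinary loop that an optimising compiler is allowed to turn into a memcpy() call or into wider loads and stores. copy_volatile() routes every cell through volatile-qualified accessors, which compilers such as GCC perform in practice as individual accesses of the declared width. The names copy_plain(), copy_volatile(), cell_read() and cell_write() are hypothetical, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain loop: nothing stops the optimiser from recognising a copy idiom
 * here and emitting memcpy() or wider loads/stores instead. */
static inline void copy_plain(uint16_t *dst, const uint16_t *src, size_t cells)
{
    while (cells--)
        *dst++ = *src++;
}

/* Hypothetical accessors: the volatile qualifier makes each call one
 * separate 16-bit access that the compiler may not merge or drop. */
static inline uint16_t cell_read(const volatile uint16_t *addr)
{
    return *addr;
}

static inline void cell_write(uint16_t val, volatile uint16_t *addr)
{
    *addr = val;
}

/* Same copy, but every cell goes through the volatile accessors. */
static inline void copy_volatile(uint16_t *dst, const uint16_t *src,
                                 size_t cells)
{
    assert(dst != NULL && src != NULL);
    while (cells--)
        cell_write(cell_read(src++), dst++);
}
```

Both functions produce the same buffer contents; the difference is only in which machine-level accesses the compiler is permitted to generate, which is what matters for a device that mishandles 64-bit transactions.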

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

--
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the 
conversation now. http://goparallel.sourceforge.net/
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-02-09 Thread Geert Uytterhoeven
On Mon, Feb 9, 2015 at 11:35 AM, Daniel Stone dan...@fooishbar.org wrote:
 On 5 February 2015 at 11:35, One Thousand Gnomes
 gno...@lxorguk.ukuu.org.uk wrote:
 If I'm not mistaken, that would be as simple as adding

 #define VT_BUF_HAVE_RW
 #define scr_writew(val, addr) (*(addr) = (val))
 #define scr_readw(addr) (*(addr))

 to arch/x86/include/asm/vga.h.

 and stick an

 #if defined (CONFIG_SUPPORT_SHITE_VGA_ADAPTERS)

 #endif

 around that and it's sorted as an option everyone can leave off but the
 afflicted.

 Well, given all the distros will enable that, might as well be #if
 !defined(CONFIG_BREAK_SOME_HARDWARE_BUT_VGA_SCROLLING_WILL_BE_IMMEASURABLY_FASTER).

All distros on 1 out of 29 architectures?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-02-09 Thread One Thousand Gnomes
On Mon, 9 Feb 2015 11:00:55 +
Daniel Stone dan...@fooishbar.org wrote:

 On 9 February 2015 at 10:49, Geert Uytterhoeven ge...@linux-m68k.org wrote:
  On Mon, Feb 9, 2015 at 11:35 AM, Daniel Stone dan...@fooishbar.org wrote:
  On 5 February 2015 at 11:35, One Thousand Gnomes
  gno...@lxorguk.ukuu.org.uk wrote:
  #if defined (CONFIG_SUPPORT_SHITE_VGA_ADAPTERS)
 
  #endif
 
  around that and it's sorted as an option everyone can leave off but the
  afflicted.
 
  Well, given all the distros will enable that, might as well be #if
  !defined(CONFIG_BREAK_SOME_HARDWARE_BUT_VGA_SCROLLING_WILL_BE_IMMEASURABLY_FASTER).
 
  All distros on 1 out of 29 architectures?
 
 It's a fairly popular architecture.

I imagine most distros wouldn't enable it even on x86. It's an incredibly
obscure setup from the evidence of how long it took to get reported.

Most distributions don't support non-PAE processors and other far more
common things 8)

Alan




Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-02-09 Thread Daniel Stone
On 9 February 2015 at 10:49, Geert Uytterhoeven ge...@linux-m68k.org wrote:
 On Mon, Feb 9, 2015 at 11:35 AM, Daniel Stone dan...@fooishbar.org wrote:
 On 5 February 2015 at 11:35, One Thousand Gnomes
 gno...@lxorguk.ukuu.org.uk wrote:
 #if defined (CONFIG_SUPPORT_SHITE_VGA_ADAPTERS)

 #endif

 around that and it's sorted as an option everyone can leave off but the
 afflicted.

 Well, given all the distros will enable that, might as well be #if
 !defined(CONFIG_BREAK_SOME_HARDWARE_BUT_VGA_SCROLLING_WILL_BE_IMMEASURABLY_FASTER).

 All distros on 1 out of 29 architectures?

It's a fairly popular architecture.



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-02-09 Thread Daniel Stone
On 5 February 2015 at 11:35, One Thousand Gnomes
gno...@lxorguk.ukuu.org.uk wrote:
 If I'm not mistaken, that would be as simple as adding

 #define VT_BUF_HAVE_RW
 #define scr_writew(val, addr) (*(addr) = (val))
 #define scr_readw(addr) (*(addr))

 to arch/x86/include/asm/vga.h.

 and stick an

 #if defined (CONFIG_SUPPORT_SHITE_VGA_ADAPTERS)

 #endif

 around that and it's sorted as an option everyone can leave off but the
 afflicted.

Well, given all the distros will enable that, might as well be #if
!defined(CONFIG_BREAK_SOME_HARDWARE_BUT_VGA_SCROLLING_WILL_BE_IMMEASURABLY_FASTER).



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-02-05 Thread Geert Uytterhoeven
On Tue, Feb 3, 2015 at 4:54 PM, One Thousand Gnomes
gno...@lxorguk.ukuu.org.uk wrote:
 On Thu, 29 Jan 2015 15:40:33 -0800
 Linus Torvalds torva...@linux-foundation.org wrote:

 On Wed, Jan 28, 2015 at 8:11 PM, Dave Airlie airl...@redhat.com wrote:
 
  Linus, this came up a while back; I finally got some confirmation
  that it fixes those servers.

 I'm certainly ok with this. Which way should it go in? The users are:

  - drivers/tty/vt/vt.c (Greg KH, tty layer)

  - drivers/video/console/* (fbcon people: Tomi Valkeinen and friends)

 and it might make sense to have *some* indication of how much worse
 this makes fbcon performance in particular..

 For devices that have no hardware scrolling it used to be a double-digit
 percentage difference between 32-bit and 64-bit when reading from the fb
 because the reads are not posted and the latency killed you. Writes are not
 so big a deal, and the bridge should combine them anyway. I imagine
 16-bit reads would be unprintably bad.

Fbcon uses scr_mem{cpy,move}w() for the VT buffer (characters + attributes)
only, not for the frame buffer data.
So the performance degradation should be minimal.
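
To put "VT buffer only" into numbers: a rough sketch, assuming the classic VGA text layout (character code in the low byte, attribute in the high byte of each 16-bit cell), an 80x25 console, and a 1024x768 frame buffer at 4 bytes per pixel. All function names are hypothetical and the sizes are illustrative, not measurements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One text cell: attribute in the high byte, character in the low byte
 * (assumed classic VGA text-mode layout). */
static inline uint16_t make_cell(uint8_t ch, uint8_t attr)
{
    return (uint16_t)(((uint16_t)attr << 8) | ch);
}

static inline uint8_t cell_char(uint16_t cell)
{
    return (uint8_t)(cell & 0xff);
}

static inline uint8_t cell_attr(uint16_t cell)
{
    return (uint8_t)(cell >> 8);
}

/* Bytes occupied by a cols x rows buffer of 16-bit cells. */
static inline size_t vt_buffer_bytes(unsigned int cols, unsigned int rows)
{
    assert(cols > 0 && rows > 0);
    return (size_t)cols * rows * sizeof(uint16_t);
}

/* Bytes occupied by a width x height frame buffer. */
static inline size_t frame_buffer_bytes(unsigned int width,
                                        unsigned int height,
                                        unsigned int bytes_per_pixel)
{
    return (size_t)width * height * bytes_per_pixel;
}
```

Under these assumptions vt_buffer_bytes(80, 25) is 4000 bytes, while frame_buffer_bytes(1024, 768, 4) is 3145728 bytes, so the copies touched by the patch move a small fraction of what a frame buffer scroll moves.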

However, as this affects real VGA on x86 only, perhaps it can be fixed
in arch/x86/include/asm/vga.h instead of include/linux/vt_buffer.h, so
platforms not having VGA are not affected? We have these VT_BUF_*
and scr_*() abstractions for a reason...

If I'm not mistaken, that would be as simple as adding

#define VT_BUF_HAVE_RW
#define scr_writew(val, addr) (*(addr) = (val))
#define scr_readw(addr) (*(addr))

to arch/x86/include/asm/vga.h.

If someone wants to put one of the bad VGA cards in a non-x86 PCI slot,
perhaps a few more architecture-specific asm/vga.h have to be updated:

$ git grep -w VT_BUF_HAVE_RW -- arch
arch/alpha/include/asm/vga.h:#define VT_BUF_HAVE_RW
arch/mips/include/asm/vga.h:#define VT_BUF_HAVE_RW
arch/powerpc/include/asm/vga.h:#define VT_BUF_HAVE_RW
arch/sparc/include/asm/vga.h:#define VT_BUF_HAVE_RW
arch/tile/include/asm/vga.h:#define VT_BUF_HAVE_RW
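
The override mechanism described above can be sketched in one file, assuming a simplified stand-in for the two headers. The top part plays the role of an arch header such as asm/vga.h; the rest models the generic header, whose memcpy()-based definitions are skipped once VT_BUF_HAVE_RW is defined. The word-loop fallback, the assert() and the word_accesses counter are simplifications added for illustration and are not copied from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* --- stand-in for an arch header such as asm/vga.h --- */
#define VT_BUF_HAVE_RW
#define scr_writew(val, addr) (*(addr) = (val))
#define scr_readw(addr) (*(addr))

/* --- stand-in for the generic header --- */
#ifndef VT_BUF_HAVE_RW
/* Not reached here: the arch part above already defined VT_BUF_HAVE_RW. */
#define scr_writew(val, addr) (*(addr) = (val))
#define scr_readw(addr) (*(addr))
#define scr_memcpyw(d, s, c) memcpy(d, s, c)
#define VT_BUF_HAVE_MEMCPYW
#endif

#ifndef VT_BUF_HAVE_MEMCPYW
/* Illustration only: counts how many 16-bit cell copies were made. */
static unsigned long word_accesses;

/* Fallback: copy 'count' bytes one 16-bit cell at a time. */
static inline void scr_memcpyw(uint16_t *d, const uint16_t *s,
                               unsigned int count)
{
    assert((count & 1) == 0); /* sketch-only check: whole cells */
    count /= 2;
    while (count--) {
        scr_writew(scr_readw(s++), d++);
        word_accesses++;
    }
}
#endif
```

Copying four cells with scr_memcpyw() in this sketch goes through the word loop four times instead of calling memcpy(), which is the effect the arch-level define is meant to have.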

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-02-05 Thread One Thousand Gnomes
 If I'm not mistaken, that would be as simple as adding
 
 #define VT_BUF_HAVE_RW
 #define scr_writew(val, addr) (*(addr) = (val))
 #define scr_readw(addr) (*(addr))
 
 to arch/x86/include/asm/vga.h.

and stick an

#if defined (CONFIG_SUPPORT_SHITE_VGA_ADAPTERS)

#endif

around that and it's sorted as an option everyone can leave off but the
afflicted.

Alan



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-02-03 Thread One Thousand Gnomes
On Thu, 29 Jan 2015 15:40:33 -0800
Linus Torvalds torva...@linux-foundation.org wrote:

 On Wed, Jan 28, 2015 at 8:11 PM, Dave Airlie airl...@redhat.com wrote:
 
  Linus, this came up a while back; I finally got some confirmation
  that it fixes those servers.
 
 I'm certainly ok with this. Which way should it go in? The users are:
 
  - drivers/tty/vt/vt.c (Greg KH, tty layer)
 
  - drivers/video/console/* (fbcon people: Tomi Valkeinen and friends)
 
 and it might make sense to have *some* indication of how much worse
 this makes fbcon performance in particular..

For devices that have no hardware scrolling it used to be a double-digit
percentage difference between 32-bit and 64-bit when reading from the fb
because the reads are not posted and the latency killed you. Writes are not
so big a deal, and the bridge should combine them anyway. I imagine
16-bit reads would be unprintably bad.

Is it reads or writes that kill the card?

Also note that switching to lots of small writes may break the 3Dfx
driver for the early 3Dfx PCI cards - they are really quite touchy about
how they are fed.

Unfortunately fbcon still matters for dumb EFI framebuffer fallbacks.

For vgacon it doesn't matter (if it was too slow you could make vgacon as
fast as you want by only updating the off-screen characters once per
vertical blank). For fbcon that is a bit harder as you are allowed to
scribble on the display as well. You can't even check open/mmapped as you
can open, scribble and close.

Alan



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-01-29 Thread Linus Torvalds
On Wed, Jan 28, 2015 at 8:11 PM, Dave Airlie airl...@redhat.com wrote:

 Linus, this came up a while back; I finally got some confirmation
 that it fixes those servers.

I'm certainly ok with this. Which way should it go in? The users are:

 - drivers/tty/vt/vt.c (Greg KH, tty layer)

 - drivers/video/console/* (fbcon people: Tomi Valkeinen and friends)

and it might make sense to have *some* indication of how much worse
this makes fbcon performance in particular..

Greg/Tomi - the patch is removing this:

  #define scr_memcpyw(d, s, c) memcpy(d, s, c)
  #define scr_memmovew(d, s, c) memmove(d, s, c)
  #define VT_BUF_HAVE_MEMCPYW
  #define VT_BUF_HAVE_MEMMOVEW

from linux/vt_buffer.h, because some stupid graphics cards
apparently cannot handle 64-bit accesses of regular memcpy/memmove.

And on other setups, this will be the reverse: 8-bit accesses due to
using rep movsb, which is the fast way to move/clear memory on
modern Intel CPUs, but is really wrong for MMIO where it will be slow
as hell.

So just getting rid of the memcpy/memmove is likely the right thing in
general, since the fallbacks go the traditional 16-bit-at-a-time
way. And getting rid of the memcpy _may_ speed things up.

But if it slows things down, we might have to try something else. Like
saying all cards we've ever seen have been ok with aligned 32-bit
accesses, and extend the open-coded scr_memcpy/memmove functions to
do that.

Hmm?

   Linus
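
A rough sketch of the two alternatives mentioned above, assuming user-space C and non-overlapping buffers (so it models the memcpy case only, not memmove). copy_cells16() is the 16-bit-at-a-time loop; copy_cells32() is a hypothetical variant that uses aligned 32-bit accesses when source and destination share the same alignment and falls back to 16-bit accesses otherwise. The function names and the cell_pair_t typedef are assumptions, and the may_alias attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: a 32-bit type allowed to alias the 16-bit cells. */
typedef uint32_t __attribute__((may_alias)) cell_pair_t;

/* 16-bit-at-a-time copy; nbytes is a byte count, as in memcpy(). */
static inline void copy_cells16(volatile uint16_t *dst,
                                const volatile uint16_t *src, size_t nbytes)
{
    size_t cells = nbytes / 2;

    while (cells--)
        *dst++ = *src++;
}

/* Hypothetical variant: aligned 32-bit accesses where possible. */
static inline void copy_cells32(volatile uint16_t *dst,
                                const volatile uint16_t *src, size_t nbytes)
{
    size_t cells = nbytes / 2;

    /* Both pointers must at least be 2-byte aligned. */
    assert((((uintptr_t)dst | (uintptr_t)src) & 1) == 0);

    /* Different alignment modulo 4: the two sides can never both sit
     * on a 4-byte boundary, so stay with 16-bit accesses. */
    if ((((uintptr_t)dst ^ (uintptr_t)src) & 2) != 0) {
        copy_cells16(dst, src, nbytes);
        return;
    }

    /* A leading odd cell brings both pointers to a 4-byte boundary. */
    if (((uintptr_t)dst & 2) != 0 && cells > 0) {
        *dst++ = *src++;
        cells--;
    }

    /* Main loop: two cells per aligned 32-bit access. */
    volatile cell_pair_t *d32 = (volatile cell_pair_t *)dst;
    const volatile cell_pair_t *s32 = (const volatile cell_pair_t *)src;
    size_t pairs = cells / 2;

    while (pairs--)
        *d32++ = *s32++;

    /* Trailing odd cell, if any. */
    if (cells & 1)
        *(volatile uint16_t *)d32 = *(const volatile uint16_t *)s32;
}
```

Whether the 32-bit variant is actually safe depends on the claim that every affected card handles aligned 32-bit accesses, which this sketch cannot verify.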



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-01-29 Thread Greg Kroah-Hartman
On Thu, Jan 29, 2015 at 03:40:33PM -0800, Linus Torvalds wrote:
 On Wed, Jan 28, 2015 at 8:11 PM, Dave Airlie airl...@redhat.com wrote:
 
  Linus, this came up a while back; I finally got some confirmation
  that it fixes those servers.
 
 I'm certainly ok with this. Which way should it go in? The users are:
 
  - drivers/tty/vt/vt.c (Greg KH, tty layer)
 
  - drivers/video/console/* (fbcon people: Tomi Valkeinen and friends)
 
 and it might make sense to have *some* indication of how much worse
 this makes fbcon performance in particular..
 
 Greg/Tomi - the patch is removing this:
 
   #define scr_memcpyw(d, s, c) memcpy(d, s, c)
   #define scr_memmovew(d, s, c) memmove(d, s, c)
   #define VT_BUF_HAVE_MEMCPYW
   #define VT_BUF_HAVE_MEMMOVEW
 
 from linux/vt_buffer.h, because some stupid graphics cards
 apparently cannot handle 64-bit accesses of regular memcpy/memmove.
 
 And on other setups, this will be the reverse: 8-bit accesses due to
 using rep movsb, which is the fast way to move/clear memory on
 modern Intel CPUs, but is really wrong for MMIO where it will be slow
 as hell.
 
 So just getting rid of the memcpy/memmove is likely the right thing in
 general, since the fallbacks go the traditional 16-bit-at-a-time
 way. And getting rid of the memcpy _may_ speed things up.
 
 But if it slows things down, we might have to try something else. Like
 saying all cards we've ever seen have been ok with aligned 32-bit
 accesses, and extend the open-coded scr_memcpy/memmove functions to
 do that.
 
 Hmm?

I can take this through the tty tree, but can I put it in linux-next and
wait for the 3.20 merge window to give people who might notice a
slow-down a chance to object?

thanks,

greg k-h



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-01-29 Thread Dave Airlie
On 30 January 2015 at 10:03, Linus Torvalds
torva...@linux-foundation.org wrote:
 On Thu, Jan 29, 2015 at 3:57 PM, Greg Kroah-Hartman
 gre...@linuxfoundation.org wrote:

 I can take this through the tty tree, but can I put it in linux-next and
 wait for the 3.20 merge window to give people who might notice a
 slow-down a chance to object?

 Yes. The problem only affects one (or a couple of) truly outrageously
 bad graphics cards that are only used in servers (because they are
 such crap that they wouldn't be acceptable anywhere else anyway), and
 they have afaik never worked with 64-bit kernels, so it's not even a
 regression.

 So it's worth fixing because it's a real - albeit very rare - problem
 (especially since the enhanced rep instruction model of memcpy could
 easily be *worse* than the 16-bit-at-a-time manual version), but I
 wouldn't consider it anywhere near high priority.

Totally not a priority, it just finally got tested for RHEL so I wanted to
make sure I posted it upstream before I forgot about it for months.

I also filed:
https://bugzilla.kernel.org/show_bug.cgi?id=92311

since the RH bug is private and full of crap; that bug contains
a screenshot of the remote console showing what sort of crap it produces.

Dave.



Re: [PATCH] vt_buffer: drop console buffer copying optimisations

2015-01-29 Thread Linus Torvalds
On Thu, Jan 29, 2015 at 3:57 PM, Greg Kroah-Hartman
gre...@linuxfoundation.org wrote:

 I can take this through the tty tree, but can I put it in linux-next and
 wait for the 3.20 merge window to give people who might notice a
 slow-down a chance to object?

Yes. The problem only affects one (or a couple of) truly outrageously
bad graphics cards that are only used in servers (because they are
such crap that they wouldn't be acceptable anywhere else anyway), and
they have afaik never worked with 64-bit kernels, so it's not even a
regression.

So it's worth fixing because it's a real - albeit very rare - problem
(especially since the enhanced rep instruction model of memcpy could
easily be *worse* than the 16-bit-at-a-time manual version), but I
wouldn't consider it anywhere near high priority.

 Linus
