Re: Question Regarding ERMS memcpy

2017-03-06 Thread hpa
On March 6, 2017 9:12:41 AM PST, Logan Gunthorpe wrote:
>
>On 06/03/17 12:28 AM, H. Peter Anvin wrote:
>> On 03/05/17 23:01, Logan Gunthorpe wrote:
>>>
>>> On 05/03/17 12:54 PM, Borislav Petkov wrote:
>>>> Logan, wanna give that a try, see if it takes care of your issue?
>>>
>>> Well honestly

Re: Question Regarding ERMS memcpy

2017-03-06 Thread Logan Gunthorpe
On 06/03/17 12:28 AM, H. Peter Anvin wrote:
> On 03/05/17 23:01, Logan Gunthorpe wrote:
>>
>> On 05/03/17 12:54 PM, Borislav Petkov wrote:
>>> Logan, wanna give that a try, see if it takes care of your issue?
>>
>> Well honestly my issue was solved by fixing my kernel config. I have no
>> idea

Re: Question Regarding ERMS memcpy

2017-03-06 Thread Borislav Petkov
On Mon, Mar 06, 2017 at 05:41:22AM -0800, h...@zytor.com wrote:
> It isn't really that straightforward IMO.
>
> For UC memory transaction size really needs to be specified explicitly
> at all times and should be part of the API, rather than implicit.
>
> For WC/WT/WB device memory, the ordinary

Re: Question Regarding ERMS memcpy

2017-03-06 Thread hpa
On March 6, 2017 5:33:28 AM PST, Borislav Petkov wrote:
>On Mon, Mar 06, 2017 at 12:01:10AM -0700, Logan Gunthorpe wrote:
>> Well honestly my issue was solved by fixing my kernel config. I have no
>> idea why I had optimize for size in there in the first place.
>
>I still think that we should

Re: Question Regarding ERMS memcpy

2017-03-06 Thread Borislav Petkov
On Mon, Mar 06, 2017 at 12:01:10AM -0700, Logan Gunthorpe wrote:
> Well honestly my issue was solved by fixing my kernel config. I have no
> idea why I had optimize for size in there in the first place.

I still think that we should address the iomem memcpy Linus mentioned.

So how about this

Re: Question Regarding ERMS memcpy

2017-03-05 Thread H. Peter Anvin
On 03/05/17 23:01, Logan Gunthorpe wrote:
>
> On 05/03/17 12:54 PM, Borislav Petkov wrote:
>> Logan, wanna give that a try, see if it takes care of your issue?
>
> Well honestly my issue was solved by fixing my kernel config. I have no
> idea why I had optimize for size in there in the first

Re: Question Regarding ERMS memcpy

2017-03-05 Thread Logan Gunthorpe
On Sun, Mar 05, 2017 at 11:19:42AM -0800, Linus Torvalds wrote:
>> But it is *not* the right thing to use on IO memory, because the CPU
>> only does the magic cacheline access optimizations on cacheable
>> memory!

Yes, and actually this is where I started. I thought my memcpy was using byte

Re: Question Regarding ERMS memcpy

2017-03-05 Thread Linus Torvalds
On Sun, Mar 5, 2017 at 11:54 AM, Borislav Petkov wrote:
>>
>> We seem to have broken this *really* long ago, though.
>
> I wonder why nothing blew up or failed strangely by now...

The hardware that cared was pretty broken to begin with, and I think it was mainly some really odd graphics cards.

Re: Question Regarding ERMS memcpy

2017-03-05 Thread Borislav Petkov
On Sun, Mar 05, 2017 at 11:19:42AM -0800, Linus Torvalds wrote:
> Actually, the "fromio/toio" code should never use regular memcpy().
> There used to be devices that literally broke on 64-bit accesses due
> to broken PCI crud.
>
> We seem to have broken this *really* long ago, though.

I wonder
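
[Illustration, not part of the original message.] To make the point about access width concrete, here is a minimal userspace sketch of a copy that only ever issues 32-bit loads from the source, plus byte loads for any tail. The name copy_from_io32 is hypothetical and this is not the kernel's memcpy_fromio(); it assumes the source is 4-byte aligned and relies on volatile to keep the compiler from merging or widening the loads. Real driver code would presumably go through the kernel's own I/O accessors instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): copy len bytes from a
 * device-style source using explicit 32-bit loads, so the access width
 * is chosen by the caller rather than by whatever memcpy() the compiler
 * or the alternatives mechanism happens to pick.
 *
 * Assumption: src is 4-byte aligned.
 */
static void copy_from_io32(void *dst, const volatile void *src, size_t len)
{
    unsigned char *d = dst;
    const volatile uint32_t *s32 = src;

    /* Bulk of the copy: one 32-bit load per iteration. */
    while (len >= sizeof(uint32_t)) {
        uint32_t v = *s32++;

        /* The destination is ordinary memory, so memcpy() is fine here. */
        memcpy(d, &v, sizeof(v));
        d += sizeof(v);
        len -= sizeof(v);
    }

    /* Tail: fewer than four bytes left, read them one byte at a time. */
    const volatile unsigned char *s8 = (const volatile unsigned char *)s32;
    while (len--)
        *d++ = *s8++;
}
```

A caller with a mapped 4-byte-aligned region would write something like copy_from_io32(buf, regs, 32). Whether 32-bit loads are the right width depends on the device, which is why the width is explicit in the helper's name rather than hidden.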

Re: Question Regarding ERMS memcpy

2017-03-05 Thread Linus Torvalds
On Sun, Mar 5, 2017 at 1:50 AM, Borislav Petkov wrote:
>
> gcc can't possibly know on what targets is that kernel going to be
> booted on. So it probably does some universally optimal things, like in
> the dmi_scan_machine() case:
>
>         memcpy_fromio(buf, p, 32);
>
> turns into:
>

Re: Question Regarding ERMS memcpy

2017-03-05 Thread Borislav Petkov
On Sun, Mar 05, 2017 at 12:18:23PM +0100, Borislav Petkov wrote:
> Also, I need to check what vmlinuz size bloat we're talking: with the
> diff below, we do add padding which looks like this:

Yeah, even a tailored config adds ~67K:

   text    data     bss     dec     hex filename
7567290 4040894

Re: Question Regarding ERMS memcpy

2017-03-05 Thread Borislav Petkov
On Sun, Mar 05, 2017 at 10:50:59AM +0100, Borislav Petkov wrote:
> On Sat, Mar 04, 2017 at 04:56:38PM -0800, h...@zytor.com wrote:
> > That's what the -march= and -mtune= option do!
>
> How does that even help with a distro kernel built with -mtune=generic ?
>
> gcc can't possibly know on what

Re: Question Regarding ERMS memcpy

2017-03-05 Thread Borislav Petkov
On Sat, Mar 04, 2017 at 09:58:14PM -0700, Logan Gunthorpe wrote:
> So, I've found that my kernel config had the OPTIMIZE_FOR_SIZE selected
> instead of OPTIMIZE_FOR_PERFORMANCE. I'm not sure why that is but
> switching to the latter option fixes my problem. A memcpy call is used
> instead of the

Re: Question Regarding ERMS memcpy

2017-03-05 Thread Borislav Petkov
On Sat, Mar 04, 2017 at 04:56:38PM -0800, h...@zytor.com wrote:
> That's what the -march= and -mtune= option do!

How does that even help with a distro kernel built with -mtune=generic ?

gcc can't possibly know on what targets is that kernel going to be booted on. So it probably does some

Re: Question Regarding ERMS memcpy

2017-03-04 Thread Logan Gunthorpe
Hey,

On 04/03/17 05:33 PM, Borislav Petkov wrote:
> On Sat, Mar 04, 2017 at 04:23:17PM -0800, h...@zytor.com wrote:
>> What are the compilation flags? It may be that gcc still does TRT
>> depending on this call site. I'd check what gcc6 or 7 generates,
>> though.
> Hmm, I wish we were able to

Re: Question Regarding ERMS memcpy

2017-03-04 Thread hpa
On March 4, 2017 4:33:49 PM PST, Borislav Petkov wrote:
>On Sat, Mar 04, 2017 at 04:23:17PM -0800, h...@zytor.com wrote:
>> What are the compilation flags? It may be that gcc still does TRT
>> depending on this call site. I'd check what gcc6 or 7 generates,
>> though.
>
>Well, I don't think that

Re: Question Regarding ERMS memcpy

2017-03-04 Thread Borislav Petkov
On Sat, Mar 04, 2017 at 04:23:17PM -0800, h...@zytor.com wrote:
> What are the compilation flags? It may be that gcc still does TRT
> depending on this call site. I'd check what gcc6 or 7 generates,
> though.

Well, I don't think that matters: if you're building a kernel on one machine to boot on

Re: Question Regarding ERMS memcpy

2017-03-04 Thread hpa
On March 4, 2017 4:14:47 PM PST, Borislav Petkov wrote:
>On Sat, Mar 04, 2017 at 03:55:27PM -0800, h...@zytor.com wrote:
>> For newer processors, as determined by -mtune=, it is actually the
>> best option for an arbitrary copy.
>
>So his doesn't have ERMS - it is a SNB - so if for SNB REP_GOOD

Re: Question Regarding ERMS memcpy

2017-03-04 Thread Borislav Petkov
On Sat, Mar 04, 2017 at 03:55:27PM -0800, h...@zytor.com wrote:
> For newer processors, as determined by -mtune=, it is actually the
> best option for an arbitrary copy.

So his doesn't have ERMS - it is a SNB - so if for SNB REP_GOOD is the best option for memcpy, then we should probably build

Re: Question Regarding ERMS memcpy

2017-03-04 Thread hpa
On March 4, 2017 3:46:44 PM PST, Logan Gunthorpe wrote:
>Hi Borislav,
>
>Thanks for the help.
>
>On 04/03/17 03:43 PM, Borislav Petkov wrote:
>> You can boot with "debug-alternative" and look for those strings where
>
>Here's the symbols for memcpy and the corresponding apply_alternatives

Re: Question Regarding ERMS memcpy

2017-03-04 Thread Borislav Petkov
On Sat, Mar 04, 2017 at 01:08:15PM -0700, Logan Gunthorpe wrote:
> So my question is: how do I find out what version of memcpy my actual
> machine is using and fix it if it is wrong?

You can boot with "debug-alternative" and look for those strings where it says "feat:"

[0.261386]

Question Regarding ERMS memcpy

2017-03-04 Thread Logan Gunthorpe
Hi,

I'm trying to chase down a performance issue with a driver I'm working on that does a repeated memcpy_fromio of about 1KB from a PCI device. I made a small change from a fixed size copy to a variable size only to be surprised with a performance decrease of about 1/3. I've looked through the
