Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-12-13 Thread Peter Geoghegan
On Mon, Nov 30, 2015 at 1:04 PM, Peter Geoghegan wrote: > Perhaps we can consider more selectively applying prefetching in the > context of writing out tuples. It may still be worth selectively applying these techniques to writing out tuples, per my previous remarks (and the

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-11-30 Thread Peter Geoghegan
On Sun, Nov 29, 2015 at 10:14 PM, Peter Geoghegan wrote: > I'm currently running some benchmarks on my external sorting patch on > the POWER7 machine that Robert Haas and a few other people have been > using for some time now [1]. So far, the benchmarks look very good > across a

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-11-30 Thread Peter Geoghegan
On Mon, Nov 30, 2015 at 1:04 PM, Peter Geoghegan wrote: > For example, the 50 million tuple test has over 8% of its runtime > shaved off. This seems to be a consistent pattern. I included the nitty-gritty details of this case in something attached to a mail I just sent, over in

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-11-29 Thread Peter Geoghegan
On Sun, Nov 29, 2015 at 8:52 PM, David Rowley wrote: > You're right, gcc did not include the prefetch instructions. > I've tested again on the same machine but with clang 3.7 instead of gcc > 4.8.3 Thanks for going to the trouble of investigating this. These

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-10-10 Thread Peter Geoghegan
On Thu, Sep 3, 2015 at 5:35 PM, David Rowley wrote: > My test cases are: Note that my text caching and unsigned integer comparison patches have moved the baseline down quite noticeably. I think that my mobile processor out-performs the Xeon you used for this, which

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-03 Thread David Rowley
On 3 September 2015 at 16:50, Peter Geoghegan wrote: > On Wed, Sep 2, 2015 at 9:43 PM, David Rowley > wrote: > > Peter, would you be able to share the test case which you saw the speedup > > on. So far I've been unable to see much of an

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Peter Geoghegan
On Wed, Sep 2, 2015 at 7:12 AM, Andres Freund wrote: > On 2015-07-19 16:34:52 -0700, Peter Geoghegan wrote: > Hm. Does a compiler test actually test anything reliably here? Won't this > just throw a warning during compile time about an unknown function? I'll need to look into

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Andres Freund
On 2015-09-02 16:23:13 -0700, Peter Geoghegan wrote: > On Wed, Sep 2, 2015 at 4:12 PM, Andres Freund wrote: > >> Well, still needs to work for tuplestore, which does not have a SortTuple. > > > > Isn't it even more trivial there? It's just an array of void*'s? So > >

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Peter Geoghegan
On Wed, Sep 2, 2015 at 4:26 PM, Andres Freund wrote: > I'm not following? Just write pg_read_prefetch(state->memtuples + 3 + > readptr->current) and the corresponding version for tuplesort in the > callsites? Oh, I see. Maybe I'll do it that way when I pick this up in a
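
A rough sketch of the callsite form suggested above, for a tuplestore-style array of void pointers. Only pg_read_prefetch, memtuples, readptr->current and the distance of 3 come from the thread; DemoStore, DemoReadPointer, demo_store_next, memtupcount and PREFETCH_DISTANCE are hypothetical names, and the locality hint passed to __builtin_prefetch() is an assumption, not something the patch is known to use.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for the macro discussed in the thread. */
#if defined(__GNUC__) || defined(__clang__)
/* rw = 0 means "read"; locality = 3 is an assumed hint, not from the patch */
#define pg_read_prefetch(addr) __builtin_prefetch((addr), 0, 3)
#else
#define pg_read_prefetch(addr) ((void) 0)
#endif

#define PREFETCH_DISTANCE 3		/* "three ahead", as in the quoted expression */

/* Hypothetical, cut-down stand-ins for the tuplestore state. */
typedef struct DemoReadPointer
{
	size_t		current;		/* index of the next slot to return */
} DemoReadPointer;

typedef struct DemoStore
{
	void	  **memtuples;		/* array of tuple pointers */
	size_t		memtupcount;	/* number of valid slots */
} DemoStore;

/* Return the next tuple pointer, or NULL once the array is exhausted. */
static void *
demo_store_next(DemoStore *state, DemoReadPointer *readptr)
{
	assert(state != NULL && readptr != NULL);

	if (readptr->current >= state->memtupcount)
		return NULL;

	/*
	 * Same pointer arithmetic as the quoted expression.  The bounds check
	 * keeps the computed address inside the array.  Note this hints the
	 * array slot itself; prefetching the tuple it points to would mean
	 * dereferencing the slot first.
	 */
	if (readptr->current + PREFETCH_DISTANCE < state->memtupcount)
		pg_read_prefetch(state->memtuples + PREFETCH_DISTANCE + readptr->current);

	return state->memtuples[readptr->current++];
}
```

Whether a distance of three helps at all is exactly the platform-dependence question raised elsewhere in this thread, so the constant here is illustrative only.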

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Andres Freund
On 2015-09-02 17:14:12 -0700, Peter Geoghegan wrote: > On Wed, Sep 2, 2015 at 12:24 PM, Peter Geoghegan wrote: > > On Wed, Sep 2, 2015 at 7:12 AM, Andres Freund wrote: > >> On 2015-07-19 16:34:52 -0700, Peter Geoghegan wrote: > >> Hm. Does a compiler test

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Peter Geoghegan
On Wed, Sep 2, 2015 at 4:12 PM, Andres Freund wrote: >> Well, still needs to work for tuplestore, which does not have a SortTuple. > > Isn't it even more trivial there? It's just an array of void*'s? So > prefetch(state->memtuples + 3 + readptr->current)? All I meant is that

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Peter Geoghegan
On Wed, Sep 2, 2015 at 12:24 PM, Peter Geoghegan wrote: > On Wed, Sep 2, 2015 at 7:12 AM, Andres Freund wrote: >> On 2015-07-19 16:34:52 -0700, Peter Geoghegan wrote: >> Hm. Does a compiler test actually test anything reliably here? Won't this >> just throw a

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Peter Geoghegan
On Wed, Sep 2, 2015 at 9:43 PM, David Rowley wrote: > Peter, would you be able to share the test case which you saw the speedup > on. So far I've been unable to see much of an improvement. The case I tested was an internal sort CREATE INDEX. I don't recall the exact

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread David Rowley
On 3 September 2015 at 07:24, Peter Geoghegan wrote: > On Wed, Sep 2, 2015 at 7:12 AM, Andres Freund wrote: > > What worries me about adding explicit prefetching is that it's *highly* > > platform and even micro-architecture dependent. Why is looking three >

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Andres Freund
On 2015-09-02 12:24:33 -0700, Peter Geoghegan wrote: > "Read fetch". One argument passed to the intrinsic here specifies that > the variable will be read only. I did things this way because I > imagined that there would be very limited uses for the macro only. I > probably cover almost all

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Peter Geoghegan
On Wed, Sep 2, 2015 at 3:13 PM, Andres Freund wrote: > I'd be less brief in this case then, no need to be super short here. Okay. >> It started out that way, but Tom felt that it was better to have a >> USE_MEM_PREFETCH because of the branch below... > > That doesn't mean we

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Andres Freund
On 2015-09-02 16:02:00 -0700, Peter Geoghegan wrote: > On Wed, Sep 2, 2015 at 3:13 PM, Andres Freund wrote: > > That's just a question of how to formulate this though? > > > > pg_rfetch(((char *) state->memtuples ) + 3 * sizeof(SortTuple) + > > offsetof(SortTuple, tuple))? >

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-09-02 Thread Andres Freund
On 2015-07-19 16:34:52 -0700, Peter Geoghegan wrote: > +# PGAC_C_BUILTIN_PREFETCH > +# - > +# Check if the C compiler understands __builtin_prefetch(), > +# and define HAVE__BUILTIN_PREFETCH if so. > +AC_DEFUN([PGAC_C_BUILTIN_PREFETCH], > +[AC_CACHE_CHECK(for

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-07-19 Thread Peter Geoghegan
On Thu, Jul 16, 2015 at 8:49 PM, Tom Lane t...@sss.pgh.pa.us wrote: > Meh. I don't like the assumption that non-GCC compilers will be smart enough to optimize away the useless-to-them if() tests this adds. Please refactor that so that there is exactly 0 new code when the intrinsic doesn't

Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-07-16 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes: > Attached patch adds a portable Postgres wrapper on the GCC intrinsic. Meh. I don't like the assumption that non-GCC compilers will be smart enough to optimize away the useless-to-them if() tests this adds. Please refactor that so that there is exactly 0
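
One way to read the "exactly 0 new code" request is sketched below: both the prefetch and the if() test guarding it sit behind a preprocessor symbol, so a compiler without the intrinsic never sees either. HAVE__BUILTIN_PREFETCH, USE_MEM_PREFETCH and pg_read_prefetch are names that appear elsewhere in this thread; deriving HAVE__BUILTIN_PREFETCH from compiler macros instead of configure, the DemoTuple struct, demo_sum_keys, and the locality argument are all assumptions made for illustration, not the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * In the patch this symbol would come from the configure test; a plain
 * compiler check stands in for it here (an assumption for this sketch).
 */
#if !defined(HAVE__BUILTIN_PREFETCH) && (defined(__GNUC__) || defined(__clang__))
#define HAVE__BUILTIN_PREFETCH 1
#endif

#ifdef HAVE__BUILTIN_PREFETCH
#define USE_MEM_PREFETCH 1
/* rw = 0 means "read"; locality = 3 is an assumed hint */
#define pg_read_prefetch(addr) __builtin_prefetch((addr), 0, 3)
#endif

/* Hypothetical, cut-down stand-in for a SortTuple-like array element. */
typedef struct DemoTuple
{
	void	   *tuple;			/* out-of-line tuple, the prefetch target */
	long		key;			/* inline datum used by the loop */
} DemoTuple;

/* Walk the array sequentially, hinting the tuple three elements ahead. */
static long
demo_sum_keys(const DemoTuple *tuples, size_t n)
{
	long		sum = 0;

	assert(tuples != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
	{
#ifdef USE_MEM_PREFETCH
		/* The branch and the hint both vanish when the symbol is undefined. */
		if (i + 3 < n)
			pg_read_prefetch(tuples[i + 3].tuple);
#endif
		sum += tuples[i].key;
	}
	return sum;
}
```

Defining pg_read_prefetch() as an empty macro in the fallback case would remove the hint but still leave the if() test for the compiler to optimize away, which is the assumption objected to above.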

[HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-07-16 Thread Peter Geoghegan
I've always thought that GCC intrinsics for software-based memory prefetching are a very sharp tool. While it's really hard to use GCC's __builtin_prefetch() effectively (I've tried enough times to know), I always thought that there'd probably end up being a handful of cases where using it