On 04/14/2017 08:19 AM, Peter Geoghegan wrote:
BTW, I'm skeptical of Heikki's idea of killing polyphase
merge itself at this point. I think that keeping most tapes active per
pass is useful now that our memory accounting involves handing over an
even share to each maybe-active tape.
On Fri, Apr 14, 2017 at 5:57 AM, Robert Haas wrote:
> I don't think there's any one fixed answer, because increasing the
> number of tapes reduces I/O while adding CPU cost, and vice versa.
Sort of, but if you have to merge hundreds of runs (a situation that
should be quite
On Fri, Apr 14, 2017 at 1:19 AM, Peter Geoghegan wrote:
> Interestingly enough, I think that Knuth was pretty much spot on with
> his "sweet spot" of 7 tapes, even if you have modern hardware. Commit
> df700e6 (where the sweet spot of merge order 7 was no longer always
> used) was
On Thu, Apr 13, 2017 at 10:19 PM, Peter Geoghegan wrote:
> I actually think Heikki's work here would particularly help on
> spinning rust, especially when less memory is available. He
> specifically justified it on the basis of it resulting in a more
> sequential read pattern,
On Thu, Apr 13, 2017 at 9:51 PM, Tom Lane wrote:
> I'm fairly sure that the point was exactly what it said, ie improve
> locality of access within the temp file by sequentially reading as many
> tuples in a row as we could, rather than grabbing one here and one there.
>
> It
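To illustrate the locality point in isolation: here is a minimal, hypothetical sketch (not tuplesort.c code; all names and sizes are made up) of reading tuples through one large per-tape buffer, so that almost every fetch is served from memory and the underlying file only sees occasional big sequential reads.

#include <stdio.h>
#include <string.h>

typedef struct TapeReader
{
    FILE   *fp;                 /* hypothetical temp file backing the tape */
    char    buf[256 * 1024];    /* one large read buffer */
    size_t  nbytes;             /* valid bytes currently in buf */
    size_t  pos;                /* next unread byte within buf */
} TapeReader;

/*
 * Fetch 'len' bytes from the tape.  Most calls are a memcpy from the
 * buffer; only when it runs dry do we issue a single large sequential
 * read, instead of grabbing one tuple here and one there.
 */
static int
tape_read(TapeReader *t, void *dst, size_t len)
{
    if (t->nbytes - t->pos < len)
    {
        size_t  rest = t->nbytes - t->pos;

        /* keep any partial record, then refill with one big read */
        memmove(t->buf, t->buf + t->pos, rest);
        t->nbytes = rest + fread(t->buf + rest, 1, sizeof(t->buf) - rest, t->fp);
        t->pos = 0;
        if (t->nbytes < len)
            return 0;           /* tape exhausted */
    }
    memcpy(dst, t->buf + t->pos, len);
    t->pos += len;
    return 1;
}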
I wrote:
> Heikki Linnakangas writes:
>> I'm talking about the code that reads a bunch of tuples from each tape, loading
>> them into the memtuples array. That code was added by Tom Lane, back in
>> 1999:
>> So apparently there was a benefit back then, but is it still worthwhile?
>
Heikki Linnakangas writes:
> I'm talking about the code that reads a bunch of tuples from each tape, loading
> them into the memtuples array. That code was added by Tom Lane, back in
> 1999:
> commit cf627ab41ab9f6038a29ddd04dd0ff0ccdca714e
> Author: Tom Lane
>
On Mon, Oct 3, 2016 at 3:39 AM, Heikki Linnakangas wrote:
>> Can't you just use state->tapeRange, and remove the "continue"? I
>> recommend referring to "now-exhausted input tapes" here, too.
>
>
> Don't think so. result_tape == tapeRange only when the merge was done in a
>
On 09/30/2016 04:08 PM, Peter Geoghegan wrote:
On Thu, Sep 29, 2016 at 4:10 PM, Heikki Linnakangas wrote:
Bah, I fumbled the initSlabAllocator() function, attached is a fixed
version.
This looks much better. It's definitely getting close. Thanks for
being considerate of my more marginal concerns.
On Thu, Sep 29, 2016 at 4:10 PM, Heikki Linnakangas wrote:
> Bah, I fumbled the initSlabAllocator() function, attached is a fixed
> version.
This looks much better. It's definitely getting close. Thanks for
being considerate of my more marginal concerns. More feedback:
* Should
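For readers following along: the slab allocator under review hands out fixed-size slots from a single pre-allocated arena, so each merge step can free and reuse tuple memory without palloc/pfree churn. A simplified sketch of the idea (the real function is initSlabAllocator() in the patch; the slot size and all names here are illustrative only):

typedef union SlabSlot
{
    union SlabSlot *nextfree;       /* when free: link in the free list */
    char            buffer[1024];   /* when in use: tuple storage (size illustrative) */
} SlabSlot;

static SlabSlot *slabFreeHead;      /* head of the free-slot list */

/* carve 'arena' into numSlots fixed-size slots, all initially free */
static void
slab_init(char *arena, int numSlots)
{
    int     i;

    slabFreeHead = NULL;
    for (i = numSlots - 1; i >= 0; i--)
    {
        SlabSlot   *slot = (SlabSlot *) (arena + (size_t) i * sizeof(SlabSlot));

        slot->nextfree = slabFreeHead;
        slabFreeHead = slot;
    }
}

static void *
slab_alloc(void)
{
    SlabSlot   *slot = slabFreeHead;

    if (slot != NULL)
        slabFreeHead = slot->nextfree;
    return slot;                /* NULL: caller falls back to palloc() */
}

static void
slab_free(void *p)
{
    SlabSlot   *slot = (SlabSlot *) p;

    slot->nextfree = slabFreeHead;
    slabFreeHead = slot;
}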
On Thu, Sep 29, 2016 at 11:38 AM, Peter Geoghegan wrote:
> On Thu, Sep 29, 2016 at 2:59 PM, Robert Haas wrote:
>>> Maybe that was the wrong choice of words. What I mean is that it seems
>>> somewhat unprincipled to give over an equal share of memory to each
>>> active-at-least-once tape, ...
On Thu, Sep 29, 2016 at 2:59 PM, Robert Haas wrote:
>> Maybe that was the wrong choice of words. What I mean is that it seems
>> somewhat unprincipled to give over an equal share of memory to each
>> active-at-least-once tape, ...
>
> I don't get it. If the memory is being
On 09/29/2016 05:41 PM, Heikki Linnakangas wrote:
Here's a new patch version, addressing the points you made. Please have
a look!
Bah, I fumbled the initSlabAllocator() function, attached is a fixed
version.
- Heikki
From bd74cb9c32b3073637d6932f3b4552598fcdc92a Mon Sep 17 00:00:00 2001
On 09/29/2016 01:52 PM, Peter Geoghegan wrote:
* Variables like maxTapes have a meaning that is directly traceable
back to Knuth's description of polyphase merge. I don't think that you
should do anything to them, on general principle.
Ok. I still think that changing maxTapes would make sense,
On Thu, Sep 29, 2016 at 6:52 AM, Peter Geoghegan wrote:
>> How is it awkward?
>
> Maybe that was the wrong choice of words. What I mean is that it seems
> somewhat unprincipled to give over an equal share of memory to each
> active-at-least-once tape, ...
I don't get it. If the
On Thu, Sep 29, 2016 at 10:49 AM, Heikki Linnakangas wrote:
>> Do I have that right? If so, this seems rather awkward. Hmm.
>
>
> How is it awkward?
Maybe that was the wrong choice of words. What I mean is that it seems
somewhat unprincipled to give over an equal share of memory to each
active-at-least-once tape, ...
On 09/28/2016 07:20 PM, Peter Geoghegan wrote:
On Wed, Sep 28, 2016 at 5:11 PM, Peter Geoghegan wrote:
This is why I never pursued batch memory for non-final merges. Isn't
that what you're doing here? You're pretty much always setting
"state->batchUsed = true".
Wait. I guess
On 09/28/2016 07:11 PM, Peter Geoghegan wrote:
On Wed, Sep 28, 2016 at 5:04 PM, Heikki Linnakangas wrote:
Not sure that I understand. I agree that each merge pass tends to use
roughly the same number of tapes, but the distribution of real runs on
tapes is quite unbalanced in earlier merge passes (due to dummy runs).
On Wed, Sep 28, 2016 at 5:11 PM, Peter Geoghegan wrote:
> This is why I never pursued batch memory for non-final merges. Isn't
> that what you're doing here? You're pretty much always setting
> "state->batchUsed = true".
Wait. I guess you feel you have to, since it wouldn't be
On Wed, Sep 28, 2016 at 5:04 PM, Heikki Linnakangas wrote:
>> Not sure that I understand. I agree that each merge pass tends to use
>> roughly the same number of tapes, but the distribution of real runs on
>> tapes is quite unbalanced in earlier merge passes (due to dummy runs).
On 09/28/2016 06:05 PM, Peter Geoghegan wrote:
On Thu, Sep 15, 2016 at 9:51 PM, Heikki Linnakangas wrote:
I don't think it makes much difference in practice, because most merge
passes use all, or almost all, of the available tapes. BTW, I think the
polyphase algorithm prefers
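The dummy-run bookkeeping referred to here is Knuth's Algorithm 5.4.2D. Paraphrasing the selectnewtape() logic from the tuplesort.c of this era as a standalone sketch (the real fields live in Tuplesortstate; arrays are maxTapes = tapeRange + 1 entries long):

typedef struct
{
    int     tapeRange;      /* number of input tapes (maxTapes - 1) */
    int     Level;          /* Knuth's l: current merge level */
    int     destTape;       /* tape to receive the next initial run */
    int    *tp_fib;         /* target ("Fibonacci") run count per tape */
    int    *tp_dummy;       /* dummy runs still owed per tape */
} PolyphaseState;

/* pick the destination tape for the next initial run (Knuth steps D3/D4) */
static void
selectnewtape(PolyphaseState *state)
{
    int     j;
    int     a;

    /* Step D3: advance to the next tape if it has more dummy runs owed */
    if (state->tp_dummy[state->destTape] < state->tp_dummy[state->destTape + 1])
    {
        state->destTape++;
        return;
    }
    if (state->tp_dummy[state->destTape] != 0)
    {
        state->destTape = 0;
        return;
    }

    /* Step D4: all dummies used up; step up a level and recompute
     * the target distribution */
    state->Level++;
    a = state->tp_fib[0];
    for (j = 0; j < state->tapeRange; j++)
    {
        state->tp_dummy[j] = a + state->tp_fib[j + 1] - state->tp_fib[j];
        state->tp_fib[j] = a + state->tp_fib[j + 1];
    }
    state->destTape = 0;
}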
On Thu, Sep 15, 2016 at 9:51 PM, Heikki Linnakangas wrote:
>> I still don't get why you're doing all of this within mergeruns() (the
>> beginning of when we start merging -- we merge all quicksorted runs),
>> rather than within beginmerge() (the beginning of one particular merge
On Thu, Sep 22, 2016 at 4:17 AM, Heikki Linnakangas wrote:
> On 09/22/2016 03:40 AM, Claudio Freire wrote:
>>
>> On Tue, Sep 20, 2016 at 3:34 PM, Claudio Freire
>> wrote:
>>>
>>> The results seem all over the map. Some regressions seem significant
>>>
On 09/22/2016 03:40 AM, Claudio Freire wrote:
On Tue, Sep 20, 2016 at 3:34 PM, Claudio Freire wrote:
The results seem all over the map. Some regressions seem significant
(both in the amount of performance lost and their significance, since
all 4 runs show a similar
On Tue, Sep 20, 2016 at 3:34 PM, Claudio Freire wrote:
> On Fri, Sep 9, 2016 at 9:51 PM, Claudio Freire wrote:
>> On Fri, Sep 9, 2016 at 8:13 AM, Heikki Linnakangas wrote:
>>>
>>> Claudio, if you could also repeat the tests you ran on Peter's patch set on
>>> the other thread, with these patches, that'd be nice.
On Fri, Sep 9, 2016 at 9:51 PM, Claudio Freire wrote:
> On Fri, Sep 9, 2016 at 8:13 AM, Heikki Linnakangas wrote:
>>
>> Claudio, if you could also repeat the tests you ran on Peter's patch set on
>> the other thread, with these patches, that'd be nice.
On Sat, Sep 17, 2016 at 9:41 AM, Heikki Linnakangas wrote:
> Ok. I'll probably read through it myself once more some time next week, and
> also have a first look at your actual parallel sorting patch. Have a good
> trip!
Thanks! It will be good to get away for a while.
I'd be
On 09/17/2016 07:27 PM, Peter Geoghegan wrote:
On Wed, Sep 14, 2016 at 10:43 AM, Heikki Linnakangas wrote:
Addressed all your comments one way or another, new patch attached.
So, it's clear that this isn't ready today. As I mentioned, I'm going
away for a week now. I ask that you hold off on committing this until
I return,
On Wed, Sep 14, 2016 at 10:43 AM, Heikki Linnakangas wrote:
> Addressed all your comments one way or another, new patch attached.
So, it's clear that this isn't ready today. As I mentioned, I'm going
away for a week now. I ask that you hold off on committing this until
I return,
On Thu, Sep 15, 2016 at 1:51 PM, Heikki Linnakangas wrote:
> BTW, does a 1-way merge make any sense? I was surprised to see this in the
> log, even without this patch:
>
> LOG: finished 1-way merge step: CPU 0.62s/7.22u sec elapsed 8.43 sec
> STATEMENT: SELECT COUNT(*) FROM
On Thu, Sep 15, 2016 at 1:51 PM, Heikki Linnakangas wrote:
>> I still don't get why you're doing all of this within mergeruns() (the
>> beginning of when we start merging -- we merge all quicksorted runs),
>> rather than within beginmerge() (the beginning of one particular merge
On 09/15/2016 10:12 PM, Peter Geoghegan wrote:
On Wed, Sep 14, 2016 at 10:43 AM, Heikki Linnakangas wrote:
Spotted another issue with this code just now. Shouldn't it be based
on state->tapeRange? You don't want the destTape to get memory, since
you don't use batch memory for
On Wed, Sep 14, 2016 at 10:43 AM, Heikki Linnakangas wrote:
> Addressed all your comments one way or another, new patch attached. Comments
> on some specific points below:
Cool. My response here is written under time pressure, which is not
ideal. I think it's still useful,
Addressed all your comments one way or another, new patch attached.
Comments on some specific points below:
On 09/12/2016 01:13 AM, Peter Geoghegan wrote:
Other things I noticed:
* You should probably point out that typically, access to batch memory
will still be sequential, despite your
On Mon, Sep 12, 2016 at 12:07 PM, Heikki Linnakangas wrote:
> Here's a fixed version. I'll go through Peter's comments and address those,
> but I don't think there was anything there that should affect performance
> much, so I think you can proceed with your benchmarking with
On 09/12/2016 06:47 PM, Claudio Freire wrote:
On Mon, Sep 12, 2016 at 12:02 PM, Claudio Freire wrote:
On Sun, Sep 11, 2016 at 12:47 PM, Heikki Linnakangas wrote:
Here's a new version of these patches, rebased over current master. I
squashed the two
On Mon, Sep 12, 2016 at 8:47 AM, Claudio Freire wrote:
> I spoke too soon, git am had failed and I didn't notice.
I wrote the regression test that causes Postgres to crash with the
patch applied. It tests, among other things, that CLUSTER tuples are
fixed-up by a routine
On Sun, Sep 11, 2016 at 12:47 PM, Heikki Linnakangas wrote:
> Here's a new version of these patches, rebased over current master. I
> squashed the two patches into one, there's not much point to keep them
> separate.
I don't know what was up with the other ones, but this one
On 12/09/16 12:16, Gavin Flower wrote:
[...]
two blocks would be logically adjacent (which means they are likely
to be physically close together, but not guaranteed!).
[...]
Actual disk layouts are quite complicated; the above is an
oversimplification, but the message is still valid.
On 12/09/16 10:13, Peter Geoghegan wrote:
On Sun, Sep 11, 2016 at 8:47 AM, Heikki Linnakangas wrote:
[...]
I don't know what the difference is between accessing 10 pages
randomly, and accessing a random set of 10 single pages sequentially,
in close succession. As Tom would
On Sun, Sep 11, 2016 at 3:13 PM, Peter Geoghegan wrote:
>> + for (tapenum = 0; tapenum < maxTapes; tapenum++)
>> + {
>> + LogicalTapeAssignReadBufferSize(state->tapeset, tapenum,
>> + (per_tape + (tapenum < cutoff ? 1 : 0)) *
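Filled out, the quoted loop divides the available buffer space evenly across the tapes, with the first 'cutoff' tapes absorbing the remainder so nothing is left over. The shape is roughly this (the multiplier is truncated in the quote above, so BLCKSZ is my guess for it):

    int     per_tape = avail_blocks / maxTapes;
    int     cutoff = avail_blocks % maxTapes;
    int     tapenum;

    for (tapenum = 0; tapenum < maxTapes; tapenum++)
        LogicalTapeAssignReadBufferSize(state->tapeset, tapenum,
                                        (per_tape + (tapenum < cutoff ? 1 : 0)) * BLCKSZ);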
On Sun, Sep 11, 2016 at 3:13 PM, Peter Geoghegan wrote:
> * Doesn't this code need to call MemoryContextAllocHuge() rather than
> palloc()?:
>
>> @@ -709,18 +765,19 @@ LogicalTapeRewind(LogicalTapeSet *lts, int tapenum,
>> bool forWrite)
>> Assert(lt->frozen);
>>
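For context: plain palloc() rejects any request larger than MaxAllocSize (just under 1 GB), so a read buffer sized from a multi-gigabyte work_mem has to come from the huge variant. The fix presumably looks something like this (field names approximate, guessed from the patch context):

    lt->buffer = (char *) MemoryContextAllocHuge(CurrentMemoryContext,
                                                 (Size) lt->buffer_size);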
On Sun, Sep 11, 2016 at 3:13 PM, Peter Geoghegan wrote:
> * Please make this use the ".., + 1023" natural rounding trick that is
> used in the similar traces that are removed:
>
>> +#ifdef TRACE_SORT
>> + if (trace_sort)
>> + elog(LOG, "using %d kB of memory for read
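The trick in question rounds a byte count up to whole kilobytes instead of truncating down, so small buffers don't report as "0 kB". Applied to the quoted trace (message wording and variable name approximate):

    #ifdef TRACE_SORT
        if (trace_sort)
            elog(LOG, "using %d kB of memory for read buffers",
                 (int) ((readBufferSize + 1023) / 1024));
    #endif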
On Sun, Sep 11, 2016 at 8:47 AM, Heikki Linnakangas wrote:
> Here's a new version of these patches, rebased over current master. I
> squashed the two patches into one, there's not much point to keep them
> separate.
I think I have my head fully around this now. For some reason,
Here's a new version of these patches, rebased over current master. I
squashed the two patches into one, there's not much point to keep them
separate.
- Heikki
From 6e3813d876cf3efbe5f1b80c45f44ed5494304ab Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas
Date:
On Sat, Sep 10, 2016 at 12:04 AM, Heikki Linnakangas wrote:
>> Did I misunderstand something? I'm applying the first patch of Peter's
>> series (cap number of tapes), then your first one (remove prefetch)
>> and second one (use larger read buffers).
>
>
> Yes. I have been testing
On 09/10/2016 04:21 AM, Claudio Freire wrote:
On Fri, Sep 9, 2016 at 9:51 PM, Claudio Freire wrote:
On Fri, Sep 9, 2016 at 8:13 AM, Heikki Linnakangas wrote:
Claudio, if you could also repeat the tests you ran on Peter's patch set on
the other
On Fri, Sep 9, 2016 at 9:51 PM, Claudio Freire wrote:
> On Fri, Sep 9, 2016 at 8:13 AM, Heikki Linnakangas wrote:
>>
>> Claudio, if you could also repeat the tests you ran on Peter's patch set on
>> the other thread, with these patches, that'd be nice.
On Fri, Sep 9, 2016 at 8:13 AM, Heikki Linnakangas wrote:
>
> Claudio, if you could also repeat the tests you ran on Peter's patch set on
> the other thread, with these patches, that'd be nice. These patches are
> effectively a replacement for
>
On Fri, Sep 9, 2016 at 5:25 AM, Greg Stark wrote:
> Wow, this is really cool. We should do something like this for query
> execution too.
We should certainly do this for tuplestore.c, too. I've been meaning
to adapt it to use batch memory. I did look at it briefly, and recall
that
On Fri, Sep 9, 2016 at 4:55 AM, Heikki Linnakangas wrote:
> I'm happy with the amount of testing I've done now, and the results. Does
> anyone want to throw out any more test cases where there might be a
> regression? If not, let's get these reviewed and committed.
I'll try to
On 09/09/2016 03:25 PM, Greg Stark wrote:
On Fri, Sep 9, 2016 at 1:01 PM, Heikki Linnakangas wrote:
I'm happy with what it looks like. We are in fact getting a more sequential
access pattern with these patches, because we're not expanding the pre-read
tuples into SortTuples.
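For scale: SortTuple in the tuplesort.c of this era looks roughly like the struct below (comments mine), so expanding every pre-read tuple into one of these costs a couple dozen bytes of header plus a separately palloc'd chunk per tuple, whereas keeping tuples in densely-packed blocks preserves both memory and sequential access.

    typedef struct
    {
        void   *tuple;      /* separately palloc'd copy of the tuple */
        Datum   datum1;     /* value of first key column, pre-extracted */
        bool    isnull1;    /* is the first key column NULL? */
        int     tupindex;   /* source tape number while merging */
    } SortTuple;            /* Datum/bool as defined in postgres.h */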
On Fri, Sep 9, 2016 at 1:01 PM, Heikki Linnakangas wrote:
> I'm happy with what it looks like. We are in fact getting a more sequential
> access pattern with these patches, because we're not expanding the pre-read
> tuples into SortTuples. Keeping densely-packed blocks in memory,
On 09/08/2016 09:59 PM, Heikki Linnakangas wrote:
On 09/06/2016 10:26 PM, Peter Geoghegan wrote:
On Tue, Sep 6, 2016 at 12:08 PM, Peter Geoghegan wrote:
Offhand, I would think that taken together this is very important. I'd
certainly want to see cases in the hundreds of megabytes or gigabytes
of work_mem alongside your 4MB case,
On 09/06/2016 10:26 PM, Peter Geoghegan wrote:
On Tue, Sep 6, 2016 at 12:08 PM, Peter Geoghegan wrote:
Offhand, I would think that taken together this is very important. I'd
certainly want to see cases in the hundreds of megabytes or gigabytes
of work_mem alongside your 4MB
On Tue, Sep 6, 2016 at 12:08 PM, Peter Geoghegan wrote:
> Offhand, I would think that taken together this is very important. I'd
> certainly want to see cases in the hundreds of megabytes or gigabytes
> of work_mem alongside your 4MB case, even just to be able to talk
>
On Tue, Sep 6, 2016 at 5:20 AM, Heikki Linnakangas wrote:
> I wrote a quick patch to test that, attached. It seems to improve
> performance, at least in this small test case:
>
> create table lotsofints(i integer);
> insert into lotsofints select random() * 10.0 from
>