Hi Andres and Peter,
Please see below for inline responses to your feedback. New patch attached.
On Wed, Nov 8, 2017 at 10:01 AM, Andres Freund wrote:
> +set min_parallel_table_scan_size = 0;
> +set parallel_setup_cost = 0;
> +-- Make a simple relation with well distributed keys and correctly
> +-- estimated size.
Hi,
@@ -747,7 +747,7 @@ try_hashjoin_path(PlannerInfo *root,
* never have any output pathkeys, per comments in create_hashjoin_path.
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
- outer_path, inner_path,
extra);
Hi,
* avoids wasting memory on duplicated hash tables
* avoids wasting disk space on duplicated batch files
* avoids wasting CPU executing duplicate subplans
What's the last one referring to?
+static void
+MultiExecParallelHash(HashState *node)
+{
+ switch
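For anyone skimming the thread: the pattern being quoted here is that each
participant attaches to a shared barrier and dispatches on the phase it
observes, so a latecomer skips straight past steps its peers have already
completed. A minimal sketch of that shape, assuming the Barrier API from
storage/barrier.h; the PHJ_* phase constants and the function itself are made
up for illustration:

    #include "postgres.h"
    #include "storage/barrier.h"

    /* Hypothetical build phases, numbered in the order they occur. */
    #define PHJ_BUILD_HASHING   0
    #define PHJ_BUILD_DONE      1

    static void
    help_build_hash_table(Barrier *build_barrier)
    {
        /* Jump to whichever step the group has already reached. */
        switch (BarrierAttach(build_barrier))
        {
            case PHJ_BUILD_HASHING:
                /* ... help load tuples into the shared hash table ... */
                BarrierArriveAndWait(build_barrier, 0);
                /* fall through: the table is now complete */
            case PHJ_BUILD_DONE:
                break;      /* our peers already finished the build */
        }
        BarrierDetach(build_barrier);
    }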
Hi Peter,
See responses to a couple of points below. I'll respond to the other
points separately (i.e. with code/comment changes).
On Wed, Nov 8, 2017 at 10:32 AM, Peter Geoghegan wrote:
> On Tue, Nov 7, 2017 at 1:01 PM, Andres Freund wrote:
>> +/*
>> + *
On Wed, Nov 8, 2017 at 10:32 AM, Robert Haas wrote:
> On Tue, Nov 7, 2017 at 4:01 PM, Andres Freund wrote:
>> diff --git a/src/backend/utils/resowner/resowner.c
>> b/src/backend/utils/resowner/resowner.c
>> index 4c35ccf65eb..8b91d5a6ebe 100644
>> ---
On Tue, Nov 7, 2017 at 1:01 PM, Andres Freund wrote:
> +/*
> + * Build the name for a given segment of a given BufFile.
> + */
> +static void
> +MakeSharedSegmentName(char *name, const char *buffile_name, int segment)
> +{
> + snprintf(name, MAXPGPATH, "%s.%d", buffile_name, segment);
> +}
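Concretely, a segment's file name is just the BufFile's name with a numeric
suffix, so (using a made-up set name):

    char        name[MAXPGPATH];

    MakeSharedSegmentName(name, "foo", 3);      /* name is now "foo.3" */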
On Tue, Nov 7, 2017 at 4:01 PM, Andres Freund wrote:
> + ResourceOwnerEnlargeFiles(CurrentResourceOwner);
> + ResourceOwnerRememberFile(CurrentResourceOwner, file);
> + VfdCache[file].resowner = CurrentResourceOwner;
>
> So maybe I'm being pedantic here, but
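For context, these three lines follow the usual reserve-then-remember
convention: grow the resource owner's arrays before the failure-prone step,
so the bookkeeping afterwards cannot throw and the new file can never escape
cleanup. Schematically (the open itself is elided; only the ordering matters):

    /* May ERROR, but nothing has been created yet, so nothing can leak. */
    ResourceOwnerEnlargeFiles(CurrentResourceOwner);

    /* ... open/create the temporary file here (the irreversible step) ... */

    /* Guaranteed not to fail now that space is reserved. */
    ResourceOwnerRememberFile(CurrentResourceOwner, file);
    VfdCache[file].resowner = CurrentResourceOwner;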
Hi,
Here's a review of v24
+set min_parallel_table_scan_size = 0;
+set parallel_setup_cost = 0;
+-- Make a simple relation with well distributed keys and correctly
+-- estimated size.
+create table simple as
+ select generate_series(1, 2) AS id, 'aa';
+alter
On Mon, Oct 30, 2017 at 1:49 PM, Thomas Munro
wrote:
> A couple of stupid things outstanding:
>
> 1. EXPLAIN ANALYZE for Parallel Hash "actual" shows the complete row
> count, which is interesting to know (or not? maybe I should show it
> somewhere else?), but the
On Tue, Aug 1, 2017 at 9:28 AM, Andres Freund wrote:
> On 2017-07-26 20:12:56 +1200, Thomas Munro wrote:
>> I'll report on performance separately.
>
> Looking forward to that ;)
Here are some experimental results from a Xeon E5-2695 v3 with a ton
of RAM and spinning disks
On Fri, Oct 27, 2017 at 12:24 AM, Rushabh Lathia
wrote:
> While rebasing the parallel B-tree index build patch on top of the v22
> patch set, I found some cosmetic issues:
Thanks!
> 1) BufFileSetEstimate was removed, but it's still in buffile.h
>
> +extern size_t
While rebasing the parallel B-tree index build patch on top of the v22
patch set, I found some cosmetic issues:
1) BufFileSetEstimate was removed, but it's still in buffile.h
+extern size_t BufFileSetEstimate(int stripes);
2) BufFileSetCreate was renamed to BufFileSetInit, but it's still used at
the place below in
On Tue, Oct 24, 2017 at 10:10 PM, Thomas Munro
wrote:
> Here is an updated patch set that does that ^.
It's a bit hard to understand what's going on with the v21 patch set I
posted yesterday because EXPLAIN ANALYZE doesn't tell you anything
interesting. Also, if
On Tue, Sep 19, 2017 at 8:06 AM, Robert Haas wrote:
> On Thu, Sep 14, 2017 at 10:01 AM, Thomas Munro
> wrote:
>> 3. Gather Merge and Parallel Hash Join may have a deadlock problem.
>
> [...]
>
> Thomas and I spent about an hour and a half
Hi Thomas,
I was testing this feature with the v20 patch, and I got a crash while
doing large joins with small work_mem and lots of workers, as below.
-- Machine configuration: (d1.xlarge) CPUs: 8, RAM: 16GB, Size: 640GB
-- postgresql.conf settings as below:
work_mem = 64kB
On Thu, Oct 5, 2017 at 7:07 PM, Rushabh Lathia wrote:
> The v20 patch set (I was trying the 0008 and 0009 patches) doesn't apply
> cleanly on the latest commit, and I'm also getting a compilation error
> due to refactoring in the commit below.
>
> commit 0c5803b450e0cc29b3527df3f352e6f18a038cc6
Hi
The v20 patch set (I was trying the 0008 and 0009 patches) doesn't apply
cleanly on the latest commit, and I'm also getting a compilation error
due to refactoring in the commit below.
commit 0c5803b450e0cc29b3527df3f352e6f18a038cc6
Author: Peter Eisentraut
Date: Sat Sep 23 09:49:22 2017 -0400
Refactor
On Thu, Sep 21, 2017 at 5:49 AM, Peter Geoghegan wrote:
> Graefe's "Query Evaluation Techniques for Large Databases" has several
> pages on deadlock avoidance strategies. It was written almost 25 years
> ago, but still has some good insights IMV (you'll recall that Graefe
> is the
On Mon, Sep 18, 2017 at 1:06 PM, Robert Haas wrote:
> If we don't adopt some approach along these lines, then I think we've
> got to articulate some alternative deadlock-avoidance rule and make
> sure every parallel query facility follows it. I welcome ideas on
> that
On Thu, Sep 14, 2017 at 10:01 AM, Thomas Munro
wrote:
> 3. Gather Merge and Parallel Hash Join may have a deadlock problem.
> Gather Merge needs to block waiting for tuples, but workers wait
> for all participants (including the leader) to reach barriers.
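To spell out the circular wait in the simplest case (a hypothetical
interleaving, not code from the patch; the barrier name is assumed):

    /* Worker: emits no more tuples until the whole group advances. */
    BarrierArriveAndWait(&pstate->build_barrier, 0);    /* waits for leader */

    /*
     * Leader: meanwhile blocks in Gather Merge waiting for that worker's
     * next tuple, so it never arrives at the barrier above.  Neither side
     * can proceed.
     */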
On Thu, Sep 14, 2017 at 11:57 AM, Thomas Munro
wrote:
> On Thu, Sep 14, 2017 at 12:51 AM, Prabhat Sahu
> wrote:
>> With lower "shared_buffers" and "work_mem" settings as below, the query
>> crashes, but I'm able to see the explain plan.
>
>
On Thu, Sep 14, 2017 at 12:51 AM, Prabhat Sahu
wrote:
> With lower "shared_buffers" and "work_mem" settings as below, the query
> crashes, but I'm able to see the explain plan.
Thanks, Prabhat. A small thinko in the batch reset code means that it
sometimes thinks the
Hi Thomas,
With lower "shared_buffers" and "work_mem" settings as below, the query
crashes, but I'm able to see the explain plan.
shared_buffers = 1MB
work_mem = 1MB
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
enable_mergejoin = off
enable_nestloop = off
enable_hashjoin = on
On Thu, Aug 31, 2017 at 6:23 PM, Thomas Munro wrote:
> Here's a new rebased and debugged patch set.
Hi Thomas,
I have applied the recent patch (v19) and started testing this feature,
and I got a crash with the testcase below.
With default settings on
On Fri, Sep 1, 2017 at 7:42 PM, Thomas Munro
wrote:
>> I'm thinking about something like this:
>>
>> Gather
>>   -> Nested Loop
>>        -> Parallel Seq Scan
>>        -> Hash Join
>>             -> Seq Scan
>>             -> Parallel Hash
>>                  -> Parallel Seq Scan
>>
>> The hash join has
On Sat, Sep 2, 2017 at 10:45 AM, Robert Haas wrote:
> On Fri, Sep 1, 2017 at 6:32 PM, Thomas Munro
> wrote:
>> On Sat, Sep 2, 2017 at 5:13 AM, Robert Haas wrote:
>>> On Thu, Aug 31, 2017 at 8:53 AM, Thomas Munro
>>>
On Fri, Sep 1, 2017 at 6:32 PM, Thomas Munro
wrote:
> On Sat, Sep 2, 2017 at 5:13 AM, Robert Haas wrote:
>> On Thu, Aug 31, 2017 at 8:53 AM, Thomas Munro
>> wrote:
>>> Check out ExecReScanGather(): it shuts
On Sat, Sep 2, 2017 at 5:13 AM, Robert Haas wrote:
> On Thu, Aug 31, 2017 at 8:53 AM, Thomas Munro
> wrote:
>> Check out ExecReScanGather(): it shuts down and waits for all workers
>> to complete, which makes the assumptions in
On Thu, Aug 31, 2017 at 8:53 AM, Thomas Munro
wrote:
> Solution 2: Teach tuple queues to spill to disk instead of blocking
> when full. I think this behaviour should probably only be activated
> while the leader is running the plan rather than draining tuple
>
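A minimal sketch of that idea, assuming the existing shm_mq and BufFile APIs
(the queue handle, spill file, and tuple buffer are taken as given;
SHM_MQ_WOULD_BLOCK is only returned when nowait is true):

    shm_mq_result res;

    /* Try a non-blocking send; divert to a local spill file instead of
     * blocking the worker when the shared queue is full. */
    res = shm_mq_send(mqh, tuple_len, tuple_data, true);
    if (res == SHM_MQ_WOULD_BLOCK &&
        BufFileWrite(spillfile, tuple_data, tuple_len) != tuple_len)
        elog(ERROR, "could not write tuple to spill file");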
Here's a new rebased and debugged patch set.
On Tue, Aug 1, 2017 at 1:11 PM, Andres Freund wrote:
> - Echoing concerns from other threads (Robert: ping): I'm doubtful that
> it makes sense to size the number of parallel workers solely based on
> the parallel scan node's size.
On Wed, Aug 2, 2017 at 10:06 PM, Thomas Munro
wrote:
> On Tue, Aug 1, 2017 at 1:11 PM, Andres Freund wrote:
>> WRT the main patch:
>
> Thanks for the review. I will respond soon, but for now I just wanted
> to post a rebased version (no changes) because v16 no longer applies.
On Tue, Aug 1, 2017 at 1:11 PM, Andres Freund wrote:
> WRT the main patch:
Thanks for the review. I will respond soon, but for now I just wanted
to post a rebased version (no changes) because v16 no longer applies.
--
Thomas Munro
http://www.enterprisedb.com
On Mon, Jul 31, 2017 at 9:11 PM, Andres Freund wrote:
> - Echoing concerns from other threads (Robert: ping): I'm doubtful that
> it makes sense to size the number of parallel workers solely based on
> the parallel scan node's size. I don't think it's this patch's job to
From: Thomas Munro
Date: Wed 26 Jul 2017 19:58:20 NZST
Subject: [PATCH] Add support for parallel-aware hash joins.
Hi,
WRT the main patch:
- Echoing concerns from other threads (Robert: ping): I'm doubtful that
it makes sense to size the number of parallel workers solely based on
the parallel scan node's size.
On Tue, Aug 1, 2017 at 9:28 AM, Andres Freund wrote:
> On 2017-07-26 20:12:56 +1200, Thomas Munro wrote:
>> 2. Simplified costing. There is now just one control knob
>> "parallel_synchronization_cost", which I charge for each time the
>> participants will wait for each other
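In other words, every synchronization point becomes a simple linear charge in
the cost model; schematically (variable names assumed, not the patch's actual
code):

    /* Charge one unit each time the participants must wait for each other. */
    run_cost += parallel_synchronization_cost * num_synchronization_points;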
Hi,
On 2017-07-26 20:12:56 +1200, Thomas Munro wrote:
> Here is a new version of my parallel-aware hash join patchset.
Yay!
Working on reviewing this. Will send separate emails for individual
patch reviews.
> 2. Simplified costing. There is now just one control knob
>
Hi hackers,
Here is a new version of my parallel-aware hash join patchset. I've
dropped 'shared' from the feature name and EXPLAIN output since that's
now implied by the word "Parallel" (that only made sense in earlier
versions that had Shared Hash and Parallel Shared Hash, but a Shared
Hash