On Mon, Mar 20, 2017 at 8:21 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Mar 17, 2017 at 8:10 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> While I was studying what you did with reparameterize_path_by_child(),
>> I started to wonder whether reparameterize_path() doesn't need to
>> start handling join paths. I think it only handles scan paths right
>> now because that's the only thing that can appear under an appendrel
>> created by inheritance expansion, but you're changing that. Maybe
>> it's not critical -- I think the worst consequences of missing some
>> handling there is that we won't consider a parameterized path in some
>> case where it would be advantageous to do so. Still, you might want
>> to investigate a bit.
>
> I spent a fair amount of time this weekend musing over
> reparameterize_path_by_child(). I think a key question for this patch
> - as you already pointed out - is whether we're happy with that
> approach. When we discover that we want to perform a partitionwise
> parameterized nestloop, and therefore that we need the paths for each
> inner appendrel to get their input values from the corresponding outer
> appendrel members rather than from the outer parent, we've got two
> choices. The first is to do what the patch actually does, which is to
> build a new path tree for the nestloop inner path parameterized by the
> appropriate childrel. The second is to use the existing paths, which
> are parameterized by the parent rel, and then somehow allow make that
> work. For example, you can imagine that create_plan_recurse() could
> pass down a list of parameterized nestloops above the current point in
> the path tree, and a parent-child mapping for each, and then we could
> try to substitute everything while actually generating the plan
> instead of creating paths sooner. Which is better?
>
> It would be nice to hear opinions from anyone else who cares, but
> after some thought I think the approach you've picked is probably
> better, because it's more like what we do already. We have existing
> precedent for reparameterizing a path, but none for allowing a Var for
> one relation (the parent) to in effect refer to another relation (the
> child).
>
> That having been said, having try_nestloop_path() perform the
> reparameterization at the very top of the function seems quite
> undesirable. You're creating a new path there before you know whether
> it's going to be rejected by the invalid-parameterization test and
> also before you know whether initial_cost_nestloop is going to reject
> it. It would be much better if you could find a way to postpone the
> reparameterization until after those steps, and only do it if you're
> going to try add_path().

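To make the ordering suggested in the quoted review concrete, here is a minimal, standalone sketch of "run the cheap rejection tests first, build the reparameterized path only afterwards". It is not PostgreSQL code: SketchPath, sketch_reparameterize(), sketch_try_nestloop() and the reparam_calls counter are all hypothetical stand-ins invented for illustration, and the two checks only loosely mimic the invalid-parameterization test and the preliminary cost test mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a planner path; not a PostgreSQL type. */
typedef struct SketchPath
{
    double cost;      /* estimated cost of the inner path */
    bool   param_ok;  /* would the parameterization test accept it? */
} SketchPath;

/* Counts how often the expensive step actually runs. */
int reparam_calls = 0;

/* Stand-in for the expensive step: building a child-parameterized copy. */
SketchPath
sketch_reparameterize(const SketchPath *parent_path)
{
    reparam_calls++;
    return *parent_path;
}

/*
 * Returns true if the path survives and would be handed to the final
 * "add" step. Both cheap tests run before any copy is built.
 */
bool
sketch_try_nestloop(const SketchPath *inner, double cost_limit)
{
    assert(inner != NULL);

    if (!inner->param_ok)          /* cheap test 1: invalid parameterization */
        return false;
    if (inner->cost > cost_limit)  /* cheap test 2: preliminary cost bound */
        return false;

    /* Only now pay for the reparameterized copy. */
    SketchPath child_path = sketch_reparameterize(inner);

    return child_path.cost <= cost_limit;
}
```

With this ordering, a path rejected by either early test never triggers sketch_reparameterize(), so the counter stays at zero for rejected candidates.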
On further testing of this patch I found another case where it shows a regression: the query takes around 160 secs with the patch and 125 secs without it. Another minor thing to note is that planning time is almost double with this patch. I understand that this is meant for scenarios with really large data, so it may not be a serious issue in such cases, but it would be good to keep an eye on it so that it does not exceed the computational bounds for a really large number of tables.

Please find the attached .out file with the output I observed, and let me know if any more information is required.

The schema and data were similar to the previously shared schema, with more data added for this case. The parameter settings used were:
work_mem = 1GB
random_page_cost = seq_page_cost = 0.1

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/
-- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers