I missed the example plan cache revalidation patch in the previous mail. Sorry. Here it is.
On Wed, Jan 20, 2016 at 7:20 PM, Ashutosh Bapat < ashutosh.ba...@enterprisedb.com> wrote: > > > On Wed, Jan 20, 2016 at 4:58 AM, Robert Haas <robertmh...@gmail.com> > wrote: > >> On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat >> <ashutosh.ba...@enterprisedb.com> wrote: >> > Thanks Thom for bringing it to my notice quickly. Sorry for the same. >> > >> > Here are the patches. >> > >> > 1. pg_fdw_core_v2.patch: changes in core related to user mapping >> handling, >> > GUC >> > enable_foreignjoin >> >> I tried to whittle this patch down to something that I'd be >> comfortable committing and ended up with nothing left. >> >> First, I removed the enable_foreignjoin GUC. I still think an >> FDW-specific GUC is better, and I haven't heard anybody make a strong >> argument the other way. Your argument that this might be inconvenient >> if somebody is using a bunch of join-pushdown-enabled FDWs sounds like >> a strictly theoretical problem, considering how much difficulty we're >> having getting even one FDW to support join pushdown. And if it does >> happen, the user can script it. I'm willing to reconsider this point >> if there is a massive chorus of support for having this be a core >> option rather than an FDW option, but to me it seems that we've gone >> to a lot of trouble to make the system extensible and might as well >> get some benefit from it. >> > > Ok. Removed. > > >> >> Second, I removed the documentation for GetForeignTable(). That >> function is already documented and doesn't need re-documenting. >> > > Removed. > > >> >> Third, I removed GetUserMappingById(). As mentioned in the email to >> which I replied earlier, that doesn't actually produce the same result >> as the way we're doing it now, and might leave the user ID invalid. >> > > The comment you quoted was my comment :). I never got a reply from > Hanada-san on that comment. A bit of investigation revealed this: A pushed > down foreign join which involves N foreign tables, might have different > effective userid for each of them. > userid = OidIsValid(rte->checkAsUser) ? rte->checkAsUser : GetUserId() > In such a case, AFAIU, the join will be pushed down only if none of those > have user mapping and there is public user mapping. Is that right? In such > a case, which userid should be stored in UserMapping structure?It might > look like setting GetUserId() as userid in UserMapping is wise, but not > really. All the foreign tables might have different effective userids, each > different from GetUserId() and what GetUserId() would return might have a > user mapping different from the effective userids. What userid should > UserMapping structure have in that case? I thought, that's why Hanada-san > used invalid userid there, so left as it is. > > But that has an undesirable effect of setting it to invalid, even in case > of base relation or when all the joining relations have same effective > userid. We may cache "any" of the effective userids of joining relations if > the user mapping matches in build_join_rel() (we will need to add another > field userid in RelOptInfo along with umid). While creating the plan use > this userid to get user mapping. Does that sound good? This clubbed with > the plan cache invalidation should be full proof. We need a way to get user > mapping whether from umid (in which case public user mapping will have -1) > or serverid and some userid. > > >> Even if that were no issue, it doesn't seem to add anything. The only >> caller of the new function is postgresBeginForeignScan(), and that >> function already has a way of getting the user mapping. The new way >> doesn't seem to be better or faster, so why bother changing it? >> > >> At this point, I was down to just the changes to store the user >> mapping ID (umid) in the RelOptInfo, and to consider join pushdown >> only if the user mapping IDs match. One observation I made is that if >> the code to initialize the FDW-related fields were lifted from >> get_relation_info() up to build_simple_rel(), we would not need to use >> planner_rt_fetch(), because the caller already has that information. >> That seems like it might be worth considering. But then I realized a >> more fundamental problem: making the plan depend on the user ID is a >> problem, because the user ID can be changed, and the plan might be >> cached. The same issue arises for RLS, but there is provision for >> that in RevalidateCachedQuery. This patch makes no similar provision. >> > > Thanks for the catch. Please see attached patch for a quick fix in > RevalidateCachedQuery(). Are you thinking on similar lines? However, I am > not sure of planUserId - that field actually puzzles me. It's set when the > first time we create a plan and it never changes then. This seems to be a > problem, even for RLS, in following scene > > prepare statement using RLS feature > execute statement -- now planUserId is set to say user1 > set session role user2; > execute statement - RevalidateCachedQuery() detects that the user has > changed and invalidates the plan and recreates it (including possibly the > query analyze and rewrite). Note planUserId is unchanged here; it's still > user1 > reset role; -- now GetUserId() returns user1 > execute statement - RevalidateCachedQuery() detects that the planUserId > and GetUserId is same and doesn't invalidate the plan. Looks like it will > execute wrong plan. I am not very familiar with RLS feature, so this > analysis can be completely wrong. > > If planUserId is not good for our purpose we can always have a different > field there, say foreignJoinUserId. > > There might be another issue here. A user mapping can change and the plan > is cached. Public user mapping was used while planning but before the > (cached) plan was executed again, the user mapping for one of the effective > users involved was added with different credentials. We should expect the > new user mapping to be used for fetching data from foreign tables and thus > join becomes unsafe to be pushed down. If this issue exists we should add > code to invalidate the plan cache, at least the plans which have pushed > down joins. > > >> >> I think there are two ways forward here. One is to figure out a way >> for the plancache to invalidate queries using FDW join pushdown when >> the user ID changes. > > > I think this is better since it provides avenue for join pushdown even in > the changed conditions, thus giving better performance if permitted under > the changed conditions. > > >> The other is to recheck at execution time >> whether the user mapping IDs still match, and if not, fall back to >> using the "backup" plan that we need anyway for EvalPlanQual rechecks. >> This would of course mean that the backup plan would need to be >> something decently efficient, not just whatever we had nearest to >> hand. But that might not be too hard to manage. >> >> > > In this case, we will need to create the backup plan even in those cases > where there is no EvalPlanQual involved. I am hesitant to do it that way, > unless we are left with no other option. > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
-- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers