On Wed, Mar 15, 2017 at 10:52 AM, Dilip Kumar <dilipbal...@gmail.com> wrote: > I have reviewed the patch, I have some questions. > > @@ -3031,7 +3031,7 @@ exec_stmt_return_query(PLpgSQL_execstate *estate, > Assert(stmt->dynquery != NULL); > portal = exec_dynquery_with_params(estate, stmt->dynquery, > stmt->params, NULL, > - CURSOR_OPT_PARALLEL_OK); > + 0); > > After this patch, I have noticed that In exec_stmt_return_query, we > allow parallel query if it's a static query > but not for the dynamic query. Any specific reason for different > handling for these 2 cases? > The reason for such behaviour is given upthread, in exec_stmt_return_query, the query may be executed by either exec_run_select which then uses SPI_execute_plan_with_paramlist(), which calls _SPI_execute_plan(), which calls _SPI_pquery(), hence if parallelOK is set then passing CURSOR_OPT_PARALLEL_OK is safe. Otherwise, exec_stmt_return_query executes with a call to SPI_cursor_fetch() with a fetch count of 50, and that calls _SPI_cursor_operation() which calls PortalRunFetch(), hence, passing CURSOR_OPT_PARALLEL_OK would not be safe in this case hence not passed. > > @ -314,7 +314,7 @@ SPI_execute(const char *src, bool read_only, long tcount) > > memset(&plan, 0, sizeof(_SPI_plan)); > plan.magic = _SPI_PLAN_MAGIC; > - plan.cursor_options = 0; > + plan.cursor_options = CURSOR_OPT_PARALLEL_OK; > > In SPI_Execute, we are setting cursor_options to > CURSOR_OPT_PARALLEL_OK. I have analysed call to this function from PL > and seems fine to me. But, I have a question have you analyzed all the > calls to this functions? > e.g. build_tuplestore_recursively, get_crosstab_tuplestore. > The thing with SPI_execute is that it calls pg_plan_query through _SPI_execute_plan, there if the query is parallel unsafe then parallelism will be restricted by planner itself otherwise it is enabled if CURSOR_OPT_PARALLEL_OK was set. So, yes I evaluated all the calls for this functions and here is the analysis of why it should be safe passing CURSOR_OPT_PARALLEL_OK, in some of the suspicious looking calls(other calls are directly related to spy functions) In crosstab: ret = SPI_execute(sql, true, 0); In load_categories_hash: ret = SPI_execute(cats_sql, true, 0); In get_crosstab_tuplestore: ret = SPI_execute(sql, true, 0); In build_tuplestore_recursively: ret = SPI_execute(sql.data, true, 0); In query_to_oid_list: SPI_execute(query, true, 0);
In all of these calls, read_only flag is passed to be true, hence enabling parallelism will not cause any anomalous behaviour. In refresh_by_match_merge: if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT) In this case, since read_only is set to false, hence, in SPI_execute when planner will recalled for such a case it will disable parallelism, hence, no issues. if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 && IsUnderPostmaster && dynamic_shared_memory_type != DSM_IMPL_NONE && parse->commandType == CMD_SELECT && !parse->hasModifyingCTE && max_parallel_workers_per_gather > 0 && !IsParallelWorker() && !IsolationIsSerializable()) > > On Fri, Mar 10, 2017 at 5:38 PM, Rafia Sabih > <rafia.sa...@enterprisedb.com> wrote: >> I wanted to clarify a few things here, I noticed that call of ExecutorRun in >> postquel_getnext() uses !es->lazyEval as execute_once, this is confusing, as >> it is true even in cases when a simple query like "select count(*) from t" >> is used in a sql function. Hence, restricting parallelism for cases when it >> shouldn't. It seems to me that es->lazyEval is not set properly or it should >> not be true for simple select statements. I found that in the definition of >> execution_state >> bool lazyEval; /* true if should fetch one row at a time > > Hmm, It seems that es->lazyEval is not set properly, Ideally, it > should be set true only if it's lazy evaluation but in this case, it's > used for identifying the tupcount as well. IMHO, it should be fixed. > > Any other opinion on this? > Hmmm... I tried investigating into this but it looks like there isn't much scope for this. LazyEvalOk is set for SELECT commands in init_execution_state as per, /* * Mark the last canSetTag query as delivering the function result; then, * if it is a plain SELECT, mark it for lazy evaluation. If it's not a * SELECT we must always run it to completion. ... if (lasttages && fcache->junkFilter) { lasttages->setsResult = true; if (lazyEvalOK && lasttages->stmt->commandType == CMD_SELECT && !lasttages->stmt->hasModifyingCTE) fcache->lazyEval = lasttages->lazyEval = true; } and then in postquel_getnext we set execute_once = !es->lazyEval [1], which also makes sense since, /* Run regular commands to completion unless lazyEval */ Hence, this situation looks like it is restricting parallelism for some cases where it might not cause any issues, but clearly modifying lazyEval is not a safe option. Additionally, I think we do not have enough information to ensure that a select query will not cause multiple ExecutorRun calls for these user-defined queries inside SQL functions. Moreover, lazyEval is a kind of may be thing, meaning that it might call ExecutorRun multiple times when set but not ensured that it will as in the case of select count(*) from t queries. Please let me know your views on this. [1] https://www.postgresql.org/message-id/ca+tgmobxehvhbjtwdupzm9bvslitj-kshxqj2um5gpdze9f...@mail.gmail.com -- Regards, Rafia Sabih EnterpriseDB: http://www.enterprisedb.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers