Re: Logging parallel worker draught

Imseih (AWS), Sami Wed, 11 Oct 2023 08:27:21 -0700

>> Currently explain ( analyze ) will give you the "Workers Planned"
>> and "Workers launched". Logging this via auto_explain is possible, so I am
>> not sure we need additional GUCs or debug levels for this info.
>>
>> -> Gather (cost=10430.00..10430.01 rows=2 width=8) (actual time=131.826..134.325 rows=3 loops=1)
>> Workers Planned: 2
>> Workers Launched: 2


> I don't think autoexplain is a good substitute for the originally
> proposed log line. The possibility for log bloat is enormous. Some
> explain plans are gigantic, and I doubt people can afford that kind of
> log traffic just in case these numbers don't match.

Correct, that is a downside of auto_explain in general. 

Of course, the logging traffic can be controlled with
auto_explain.log_min_duration, auto_explain.sample_rate, etc.

> Well, if you read Benoit's earlier proposal at [1] you'll see that he
> does propose to have some cumulative stats; this LOG line he proposes
> here is not a substitute for stats, but rather a complement.  I don't
> see any reason to reject this patch even if we do get stats.

> Also, we do have a patch on stats, by Sotolongo and Bonne here [2].  I

Thanks. I will review the threads in depth and see if the ideas can be combined
in a comprehensive proposal.

Regarding the current patch, the latest version removes the separate GUC,
but the user should be able to control this behavior. 

The query text is logged along with the message because LOG ranks above the
default log_min_error_statement level of "error".

This could be especially problematic when a query runs more than one Parallel
Gather node that is in draught. In those cases each node will end up
generating a log line with the statement text, so a single query execution
could end up with multiple log lines carrying the statement text.

For example:
LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c ;
LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c ;
LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c ;

I wonder if it would be better to accumulate the total number of workers
planned and the total number of workers launched, and log this information
once at the end of execution?
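
To make the idea concrete, here is a rough standalone sketch in Python. It is
not PostgreSQL code and not part of the patch: the WorkerTally class and its
method names are made up for illustration, and the message wording is simply
reused from the log lines above. It only models the bookkeeping: each Gather
node adds to a per-statement total, and a single line is produced at the end.

```python
from dataclasses import dataclass


@dataclass
class WorkerTally:
    """Per-statement totals across all Gather nodes (hypothetical structure)."""
    planned: int = 0
    launched: int = 0

    def record_gather(self, planned: int, launched: int) -> None:
        # Called once per Gather node instead of logging right away.
        self.planned += planned
        self.launched += launched

    def end_of_execution_message(self):
        # Produce at most one line per statement, and only on a shortfall.
        if self.launched >= self.planned:
            return None
        return (
            "Parallel Worker draught during statement execution: "
            f"workers spawned {self.launched}, requested {self.planned}"
        )


# The three-subquery example: each Gather node asked for 2 workers and got 0.
tally = WorkerTally()
for planned, launched in [(2, 0), (2, 0), (2, 0)]:
    tally.record_gather(planned, launched)

message = tally.end_of_execution_message()
if message is not None:
    print("LOG:  " + message)
```

With this approach the example query would produce one LOG line (workers
spawned 0, requested 6) and the statement text would appear only once.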

Regards,

Sami
