Dear Horiguchi-san,

Thank you for your suggestions. I would like to confirm my understanding of your points.

> FWIW, I'm not sure this feature necessarily requires core support
> dedicated to FDWs. The core have USER_TIMEOUT feature already and
> FDWs are not necessarily connection based. It seems better if FDWs
> can implement health check feature without core support and it seems
> possible. Or at least the core feature should be more generic and
> simpler. Why don't we just expose InTransactionHealthCheckCallbacks or
> something and operating functions on it?

Do I understand correctly that the core part is too complicated and the FDW side is too stupid?

> Mmm. AFAICS the running command will stop with "canceling statement
> due to user request", which is a hoax. We need a more decent message
> there.

+1 for a better message.

> I understand that the motive of this patch is "to avoid wasted long
> local work when fdw-connection dies".

Yes, your understanding is right.

> In regard to the workload in
> your first mail, it is easily avoided by ending the transaction as soon
> as remote access ends. This feature doesn't work for the case "begin;
> <long local query>; <fdw access>". But the same measure also works in
> that case. So the only case where this feature is useful is "begin;
> <fdw-access>; <some long work>; <fdw-access>; end;". But in the first
> place how frequently do you expecting remote-connection close happens?
> If that happens so frequently, you might need to recheck the system
> health before implementing this feature. Since it is correctly
> detected when something really went wrong, I feel that it is a bit too
> complex for the usefulness especially for the core part.

Thank you for analyzing the motivation. Indeed, some cases can be resolved by splitting the transaction, and this event rarely happens.

> In conclusion, as my humble opinion I would like to propose to reduce
> this feature to:
>
> - Just periodically check health (in any aspect) of all live
>   connections regardless of the session state.

I understand this as removing the following mechanisms from core:

* disabling the timeout at the end of the transaction
* skipping the check when held off or reading commands

> - If an existing connection is found to be dead, just try canceling
>   the query (or sending query cancel).
> One issue with it is how to show the decent message for the query
> cancel, but maybe we can have a global variable that suggests the
> reason for the cancel.

I do not have a good idea for that yet, but I will try.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
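
For illustration only, the reduced design discussed above (a generic list of health check callbacks, a periodic pass over all live connections, and a global variable that records why a cancel was requested) could be sketched roughly as follows. This is not code from the patch or from PostgreSQL. Every name here (register_health_check, run_health_checks, CancelReason, cancel_reason, query_cancel_pending, flag_is_alive) and the second message text are hypothetical; only the "canceling statement due to user request" wording comes from the discussion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true while the connection
 * described by 'arg' still looks alive. */
typedef bool (*HealthCheckCallback)(void *arg);

typedef struct HealthCheckEntry
{
    HealthCheckCallback callback;
    void               *arg;
} HealthCheckEntry;

#define MAX_HEALTH_CHECKS 8

/* Hypothetical generic registry; it knows nothing about FDWs. */
HealthCheckEntry health_checks[MAX_HEALTH_CHECKS];
size_t           n_health_checks = 0;

/* Hypothetical global that records why a cancel was requested. */
typedef enum CancelReason
{
    CANCEL_NONE,
    CANCEL_USER_REQUEST,
    CANCEL_REMOTE_CONNECTION_LOST
} CancelReason;

CancelReason cancel_reason = CANCEL_NONE;
bool         query_cancel_pending = false;

/* Register a callback; returns false if the registry is full. */
bool
register_health_check(HealthCheckCallback callback, void *arg)
{
    if (n_health_checks >= MAX_HEALTH_CHECKS)
        return false;
    health_checks[n_health_checks].callback = callback;
    health_checks[n_health_checks].arg = arg;
    n_health_checks++;
    return true;
}

/* Intended to be called periodically, regardless of session state.
 * On the first dead connection, request a cancel and record the reason. */
void
run_health_checks(void)
{
    for (size_t i = 0; i < n_health_checks; i++)
    {
        if (!health_checks[i].callback(health_checks[i].arg))
        {
            cancel_reason = CANCEL_REMOTE_CONNECTION_LOST;
            query_cancel_pending = true;
            return;
        }
    }
}

/* Choose the message from the recorded reason. */
const char *
cancel_message(void)
{
    switch (cancel_reason)
    {
        case CANCEL_REMOTE_CONNECTION_LOST:
            /* hypothetical wording */
            return "canceling statement due to lost remote connection";
        case CANCEL_USER_REQUEST:
            return "canceling statement due to user request";
        default:
            return "";
    }
}

/* Toy callback for illustration: 'arg' points to a bool "alive" flag. */
bool
flag_is_alive(void *arg)
{
    return *(bool *) arg;
}
```

In this sketch, an FDW would call register_health_check() once per connection and the periodic pass would call run_health_checks(). The error report path could then consult cancel_message() instead of always reporting a user request.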