> The change is a simplification that came out of Jian's v48 review, and it
> does not change the cost the model produces.
> 
> It follows Jian's suggestion from 2026-06-15 [1]:
> 
>> collectPatternVariables is not needed.
>> The parser already ensures every DEFINE variable appears in PATTERN,
>> so there is nothing to filter.
>> Also, we don't really do anything special (like make a dummy Const)
>> regarding PATTERN variables that do not appear in the DEFINE clause.
> 
> The two loops charge exactly the same set, for the following reason:
> 
> - The old loop walked the unique PATTERN variables (collectPatternVariables
>   deduplicates, returning each name once) and, for each, looked up the
>   matching DEFINE entry by resname and charged that DEFINE's cost.
> - Every DEFINE variable is guaranteed to appear in PATTERN -- the parser
>   rejects a DEFINE variable that is not used in PATTERN (errmsg "DEFINE
>   variable \"%s\" is not used in PATTERN" in parse_rpr.c).  So the DEFINE
>   clause is always a subset of the unique PATTERN variables, and each
>   DEFINE resname is unique.
> - A PATTERN variable that has no DEFINE contributes nothing to the old
>   loop, because the inner resname lookup finds no match.
> 
> So the old loop already charged each DEFINE expression exactly once, and
> nothing else.  Iterating defineClause directly, as v50-0006 does, visits
> precisely that same set once each.  The estimate is unchanged; only the
> redundant outer walk over PATTERN and the per-variable resname lookup into
> the DEFINE clause are removed.

Ok, thanks for the explanation.
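
For the archives, the equivalence argument quoted above can be sketched as
a toy model. This is Python, not the planner's C code: the (resname, cost)
pairs, the cost values and the function names old_cost, new_cost and
check_define_subset are hypothetical, and collect_pattern_variables only
mimics the deduplication described for collectPatternVariables.

```python
def collect_pattern_variables(pattern):
    """Return each PATTERN variable name once, in first-seen order
    (a stand-in for the deduplication described in the thread)."""
    seen = []
    for name in pattern:
        if name not in seen:
            seen.append(name)
    return seen


def check_define_subset(pattern, define):
    """Mimic the parser guarantee: DEFINE resnames are unique and
    every DEFINE variable is used in PATTERN."""
    names = [resname for resname, _cost in define]
    assert len(set(names)) == len(names), "duplicate DEFINE variable"
    assert set(names) <= set(pattern), "DEFINE variable not used in PATTERN"


def old_cost(pattern, define):
    """Old shape: walk the unique PATTERN variables and look up the
    matching DEFINE entry by resname; no match contributes nothing."""
    total = 0.0
    for name in collect_pattern_variables(pattern):
        for resname, cost in define:
            if resname == name:
                total += cost
                break
    return total


def new_cost(define):
    """New shape: iterate the DEFINE clause directly."""
    return sum(cost for _resname, cost in define)


# Hypothetical input: C has no DEFINE entry, A appears twice in PATTERN.
pattern = ["A", "B", "A", "C"]
define = [("B", 3.0), ("A", 2.0)]  # made-up per-expression costs

check_define_subset(pattern, define)
print(old_cost(pattern, define), new_cost(define))
```

Under the subset and uniqueness assumptions both functions charge each
DEFINE entry exactly once, so the two totals agree. The new shape does
not look at PATTERN at all, so repeating a variable there cannot change
the result.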

> This also matches the premise of the cost model we settled on back in
> February: the NFA executor evaluates every DEFINE expression once per row,
> so the natural unit for the per-tuple charge is the DEFINE variable.

BTW, I was thinking about cases where the same DEFINE variable appears
twice or more in PATTERN for the same row, for example PATTERN (A|A).
But in this case the pattern would be optimized to (A), so we don't
need to worry about A appearing twice. Our cost model is correct in
this case.

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

