On 2/2/2024 11:06, Richard Guo wrote:

On Fri, Feb 2, 2024 at 11:32 AM Richard Guo <guofengli...@gmail.com <mailto:guofengli...@gmail.com>> wrote:

    On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <t...@sss.pgh.pa.us
    <mailto:t...@sss.pgh.pa.us>> wrote:

        One of the test cases added by this commit has not been very
        stable in the buildfarm.  Latest example is here:

        
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04
 
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04>

        and I've seen similar failures intermittently on other machines.

        I'd suggest building this test atop a table that is more stable
        than pg_class.  You're just waving a red flag in front of a bull
        if you expect stable statistics from that during a regression run.
        Nor do I see any particular reason for pg_class to be especially
        suited to the test.


    Yeah, it's not a good practice to use pg_class in this place.  While
    looking through the test cases added by this commit, I noticed some
    other minor issues that are not great.  Such as

    * The table 'btg' is inserted with 10000 tuples, which seems a bit
    expensive for a test.  I don't think we need such a big table to test
    what we want.

    * I don't see why we need to manipulate GUC max_parallel_workers and
    max_parallel_workers_per_gather.

    * I think we'd better write the tests with the keywords being all upper
    or all lower.  A mixed use of upper and lower is not great. Such as in

         explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;

    * Some comments for the test queries are not easy to read.

    * For this statement

         CREATE INDEX idx_y_x_z ON btg(y,x,w);

    I think the index name would cause confusion.  It creates an index on
    columns y, x and w, but the name indicates an index on y, x and z.

    I'd like to write a draft patch for the fixes.


Here is the draft patch that fixes the issues I complained about in
upthread.
I passed through the patch. Looks like it doesn't break anything. Why do you prefer to use count(*) in EXPLAIN instead of plain targetlist, like "SELECT x,y,..."?
Also, according to the test mentioned by Tom:
1. I see, PG uses IndexScan on (x,y). So, column x will be already sorted before the MergeJoin. Why not use Incremental Sort on (x,z,w) instead of full sort? 2. For memo, IMO, this test shows us the future near perspective of this feature: It is cheaper to use grouping order (w,z) instead of (z,w).

--
regards,
Andrei Lepikhov
Postgres Professional



Reply via email to