Re: [I] Eliminate limit when doing scalar aggregation query [datafusion]

via GitHub Mon, 03 Mar 2025 05:30:20 -0800


alamb commented on issue #14974:
URL: https://github.com/apache/datafusion/issues/14974#issuecomment-2694390544


   I agree that in this case the limit is not necessary. I tried it in Postgres 
to double-check:
   
   
   
   ```sql
   postgres=# create table foo(x int);
   CREATE TABLE
   postgres=# insert into foo values(1);
   INSERT 0 1
   postgres=# insert into foo values(2);
   INSERT 0 1
   postgres=# insert into foo values(3);
   INSERT 0 1
   postgres=# insert into foo values(4);
   INSERT 0 1
   postgres=# insert into foo values(5);
   INSERT 0 1
   postgres=# select count(*) from foo;
    count
   -------
        5
   (1 row)
   
   postgres=# select count(*) from foo limit 2;
    count
   -------
        5
   (1 row)
   ```
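
The same check can be reproduced without a Postgres server. This is only a minimal sketch, assuming Python's built-in `sqlite3` module as a stand-in engine; the helper name `scalar_count` is made up for illustration and is not part of DataFusion or Postgres. It also shows the edge case where the limit does matter: `LIMIT 0` drops the single aggregate row.

```python
import sqlite3


def scalar_count(limit=None):
    """Run count(*) over a 5-row table, optionally with a LIMIT, and return all rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table foo(x int)")
    conn.executemany("insert into foo values (?)", [(i,) for i in range(1, 6)])
    sql = "select count(*) from foo"
    if limit is not None:
        # int() keeps the illustrative string formatting safe
        sql += f" limit {int(limit)}"
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(scalar_count())          # no limit
    print(scalar_count(limit=2))   # same single row as above
    print(scalar_count(limit=0))   # the one case where the limit changes the result
```

A scalar aggregate (no `GROUP BY`) produces exactly one row, so any limit of at least 1 with no offset leaves the result unchanged.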
   
   
   However, in this case the limit node only ever sees a single input row, so 
it is likely not doing much work and removing it may not save much time.
   
   Note that Postgres doesn't remove it either (the `Limit` is still there):
   
   ```sql
   postgres=# explain select count(*) from foo limit 2;
                               QUERY PLAN
   -------------------------------------------------------------------
    Limit  (cost=41.88..41.88 rows=1 width=8)
      ->  Aggregate  (cost=41.88..41.88 rows=1 width=8)
            ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=0)
   (3 rows)
   
   postgres=#
   ```
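
For illustration, the rewrite under discussion could look roughly like the following. This is a toy sketch over a made-up plan representation, not DataFusion's actual `LogicalPlan` or optimizer rule API: the `Scan`, `Aggregate`, and `Limit` classes and `eliminate_redundant_limit` are all hypothetical names. It removes a limit only when it sits directly on a scalar aggregate, has no offset, and fetches at least one row.

```python
from dataclasses import dataclass, field
from typing import Optional


# Hypothetical plan nodes, for illustration only.
@dataclass
class Scan:
    table: str


@dataclass
class Aggregate:
    input: object
    group_by: list = field(default_factory=list)  # empty list = scalar aggregate


@dataclass
class Limit:
    input: object
    fetch: Optional[int] = None  # None means "no row limit"
    skip: int = 0                # OFFSET


def eliminate_redundant_limit(plan):
    """Drop a Limit placed directly on a scalar Aggregate when it cannot change the result."""
    if isinstance(plan, Limit):
        child = eliminate_redundant_limit(plan.input)
        is_scalar_agg = isinstance(child, Aggregate) and not child.group_by
        # The single output row survives only with no offset and fetch >= 1.
        keeps_the_row = plan.skip == 0 and (plan.fetch is None or plan.fetch >= 1)
        if is_scalar_agg and keeps_the_row:
            return child
        return Limit(child, plan.fetch, plan.skip)
    if isinstance(plan, Aggregate):
        return Aggregate(eliminate_redundant_limit(plan.input), plan.group_by)
    return plan


if __name__ == "__main__":
    plan = Limit(Aggregate(Scan("foo")), fetch=2)
    print(eliminate_redundant_limit(plan))
```

The guard on `skip` and `fetch` matters: `LIMIT 0` or any `OFFSET` of 1 or more would return no rows, so those limits have to stay in the plan.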


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


