Hi, a client can already use the CURSOR feature to process data in batches. In what case does the proposed feature solve a problem that CURSOR does not? See the cursor sketch below for what I mean by batching.
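A minimal sketch of that batching pattern using JDBC (the table big_table, the cursor name big_cur, and the connection details are placeholders): the client declares a server-side cursor and FETCHes a fixed number of rows per round trip, so only one batch is buffered in client memory at a time.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CursorBatchFetch {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "user", "secret")) {
                conn.setAutoCommit(false);  // explicit cursors live inside a transaction
                try (Statement st = conn.createStatement()) {
                    // Declare a server-side cursor over the (hypothetical) big_table.
                    st.execute("DECLARE big_cur CURSOR FOR SELECT id, payload FROM big_table");
                    while (true) {
                        // Pull the next batch of 100 rows; only this batch sits in client memory.
                        try (ResultSet rs = st.executeQuery("FETCH 100 FROM big_cur")) {
                            int n = 0;
                            while (rs.next()) {
                                n++;
                                // process rs.getLong("id"), rs.getString("payload") ...
                            }
                            if (n == 0) {
                                break;  // cursor exhausted
                            }
                        }
                    }
                    st.execute("CLOSE big_cur");
                }
                conn.commit();
            }
        }
    }

Note this still limits rows, not bytes, per FETCH, which I take to be the crux of the question.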
https://www.postgresql.org/docs/current/plpgsql-cursors.html

On Fri, 17 Jan 2025, 16:08 Vladimir Sitnikov, <sitnikov.vladi...@gmail.com> wrote:
> Hi,
>
> Applications often face an "out of memory" condition as they try to fetch
> "N rows" from the database.
> If N is small, then the execution becomes inefficient due to many roundtrips.
> If N is high, there's a risk that many rows would overflow the client's memory.
>
> Note: the client can't stop reading the data at arbitrary times as they
> might want to execute another query while processing the resultset,
> so "fetch all rows at once and read them from the socket as needed" is
> not really an option.
>
> The problem is the clients can't tell how much memory it would take to
> buffer the results of "fetch 100 rows".
> The row size might vary.
>
> Currently, the only case the database returns PortalSuspended seems to be
> when max_rows is reached when executing a select.
> I wonder if the database can suspend portals earlier in case it knows it
> produced a lot of data.
>
> In other words:
> 1) If the user calls execute(fetch all rows), then follow the current behavior.
> 2) If the user provides non-zero max_rows, then optionally suspend the
> portal as the result exceeds a pre-configured amount (GUC?).
>
> Then clients could configure the connection like "ok, please suspend
> portals if a single fetch exceeds 50MiB".
> To my understanding,
> * it should be a backward-compatible change
> * it would require no protocol changes,
> * it would make applications more robust when it comes to "out of memory
> while fetching many rows"
> * it might improve the performance overall as the applications could
> increase their pessimistic "max_rows" fetch size
>
> Is it ok to contribute a patch like that?
>
> Vladimir
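For comparison, here is how a JDBC client expresses the row-count-based suspension the quoted proposal builds on (big_table and the connection details are again placeholders; the proposed byte-based limit, e.g. a 50MiB GUC, does not exist and has no client API, so it appears only as a comment):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class FetchSizeExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "user", "secret")) {
                conn.setAutoCommit(false);  // required for the driver to keep the portal open
                try (PreparedStatement ps =
                         conn.prepareStatement("SELECT id, payload FROM big_table")) {
                    // The driver issues Execute with max_rows = 100; the server replies with
                    // PortalSuspended after 100 rows and the driver re-executes the portal
                    // as the application iterates.
                    ps.setFetchSize(100);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            // 100 rows can still be arbitrarily many bytes; that is the gap
                            // the proposed byte-based portal suspension would close.
                        }
                    }
                }
                conn.commit();
            }
        }
    }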