Re: [PATCHES] Space management for PGresult
On Thu, 2005-11-24 at 12:32 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > We really ought to be streaming the result back to the user, not > > downloading it all into a massive client side chunk of memory. > > Have you been paying any attention to the multiple previous discussions > of that point? (Latest was on pgsql-interfaces within the past week.) Clearly not. Thanks for the heads up. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Space management for PGresult
Simon Riggs <[EMAIL PROTECTED]> writes: > We really ought to be streaming the result back to the user, not > downloading it all into a massive client side chunk of memory. Have you been paying any attention to the multiple previous discussions of that point? (Latest was on pgsql-interfaces within the past week.) regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Space management for PGresult
On Wed, 2005-11-23 at 16:21 +0900, Atsushi Ogawa wrote: > In space management for PGresult of libpq, the block size of PGresult > is always PGRESULT_DATA_BLOCKSIZE(2048bytes). Therefore, when a large > result of query is received, malloc is executed many times. > > My proposal is to enlarge the size of the block whenever the block is > allocated. The size of first block is PGRESULT_DATA_BLOCKSIZE. And the > size of the following blocks will be doubled until it reaches > PGRESULT_MAX_DATA_BLOCKSIZE. > > PGRESULT_MAX_DATA_BLOCKSIZE is new constants. I think that 2Mbytes is > good enough for this constants. > > The test result is the following: > > Test SQL: > select * from accounts; (It is pgbench's table. scale factor is 10.) > > The number of malloc calls at pqResultAlloc: > 8.1.0 : 80542 > patched:86 > > Execution time: > 8.1.0 : 6.80 sec > patched: 6.73 sec > What this highlights for me is that we have (IMHO) a strange viewpoint on allocating result memory, not an optimization issue. We really ought to be streaming the result back to the user, not downloading it all into a massive client side chunk of memory. It ought to be possible to do this with very low memory, and would probably have the side-effect of reducing time-to-first-row. Then we wouldn't have a memory allocation issue at all. Consider what will happen if you do "select * from too_big_table". We'll just run for ages, then blow memory and fail. (That's what it used to do, does it still? he asks lazily). Best Regards, Simon Riggs ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Space management for PGresult
Atsushi Ogawa <[EMAIL PROTECTED]> writes: > (1)accounts table (4 columns, 1,000,000 tuples) > malloc calls allocated(bytes) unused(bytes) execution time > -- > 8.1.0 80,542164,950,016 2,946,402 6.80 sec. > patched 86161,478,656 177,650 6.73 sec. This hardly seems credible --- your patch would result in more wasted memory, not less. It looks to me like the instrumentation you added assumes that extra space in a malloc block will never be used later, which of course is not true ... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Space management for PGresult
Alvaro Herrera wrote: > Tom Lane wrote: > > Atsushi Ogawa <[EMAIL PROTECTED]> writes: > > > The number of malloc calls at pqResultAlloc: > > > 8.1.0 : 80542 > > > patched:86 > > > > > Execution time: > > > 8.1.0 : 6.80 sec > > > patched: 6.73 sec > > > > This hardly seems worth adding any complexity for ... > > What about memory usage? Is there a notorious difference? Well, I measured memory usage by attached patch. An allocated(bytes) is total amounts of allocated memory by pqResultAlloc. An unused(bytes) is total amounts of PGresult->spaceLeft. (1)accounts table (4 columns, 1,000,000 tuples) malloc calls allocated(bytes) unused(bytes) execution time -- 8.1.0 80,542164,950,016 2,946,402 6.80 sec. patched 86161,478,656 177,650 6.73 sec. (2)another teble (50 columns, 100,000 tuples) malloc calls allocated(bytes) unused(bytes) execution time -- 8.1.0 55,557113,780,736 8,561,518 6.26 sec. patched 86104,855,552 83,307 6.21 sec. The unused memory increases when the number of columns increases. The tuple size of PGresult is proportional to the number of columns. getAnotherTuple() at fe-protocol3.c: -- conn->curTuple = (PGresAttValue *) pqResultAlloc(result, nfields * sizeof(PGresAttValue), TRUE); -- regards, --- Atsushi Ogawa libpq_memusage.patch Description: Binary data ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] Space management for PGresult
Tom Lane wrote: > Atsushi Ogawa <[EMAIL PROTECTED]> writes: > > The number of malloc calls at pqResultAlloc: > > 8.1.0 : 80542 > > patched:86 > > > Execution time: > > 8.1.0 : 6.80 sec > > patched: 6.73 sec > > This hardly seems worth adding any complexity for ... What about memory usage? Is there a notorious difference? -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Space management for PGresult
Atsushi Ogawa <[EMAIL PROTECTED]> writes: > The number of malloc calls at pqResultAlloc: > 8.1.0 : 80542 > patched:86 > Execution time: > 8.1.0 : 6.80 sec > patched: 6.73 sec This hardly seems worth adding any complexity for ... regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org