Re: [PATCHES] Space management for PGresult

2005-11-24 Thread Simon Riggs
On Thu, 2005-11-24 at 12:32 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > We really ought to be streaming the result back to the user, not
> > downloading it all into a massive client side chunk of memory.
> 
> Have you been paying any attention to the multiple previous discussions
> of that point?  (Latest was on pgsql-interfaces within the past week.)

Clearly not. Thanks for the heads up.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] Space management for PGresult

2005-11-24 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> We really ought to be streaming the result back to the user, not
> downloading it all into a massive client side chunk of memory.

Have you been paying any attention to the multiple previous discussions
of that point?  (Latest was on pgsql-interfaces within the past week.)

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Space management for PGresult

2005-11-24 Thread Simon Riggs
On Wed, 2005-11-23 at 16:21 +0900, Atsushi Ogawa wrote:
> In space management for PGresult of libpq, the block size of PGresult
> is always PGRESULT_DATA_BLOCKSIZE(2048bytes). Therefore, when a large
> result of query is received, malloc is executed many times.
> 
> My proposal is to enlarge the size of the block whenever the block is
> allocated. The size of first block is PGRESULT_DATA_BLOCKSIZE. And the
> size of the following blocks will be doubled until it reaches
> PGRESULT_MAX_DATA_BLOCKSIZE.
> 
> PGRESULT_MAX_DATA_BLOCKSIZE is new constants. I think that 2Mbytes is
> good enough for this constants.
> 
> The test result is the following:
> 
> Test SQL:
> select * from accounts; (It is pgbench's table. scale factor is 10.)
> 
> The number of malloc calls at pqResultAlloc:
>  8.1.0  : 80542
>  patched:86
> 
> Execution time:
>  8.1.0  : 6.80 sec
>  patched: 6.73 sec
> 

What this highlights for me is that we have (IMHO) a strange viewpoint
on allocating result memory, not an optimization issue. 

We really ought to be streaming the result back to the user, not
downloading it all into a massive client side chunk of memory. It ought
to be possible to do this with very low memory, and would probably have
the side-effect of reducing time-to-first-row. Then we wouldn't have a
memory allocation issue at all.

Consider what will happen if you do "select * from too_big_table". We'll
just run for ages, then blow memory and fail. (That's what it used to
do, does it still? he asks lazily).

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Space management for PGresult

2005-11-24 Thread Tom Lane
Atsushi Ogawa <[EMAIL PROTECTED]> writes:
> (1)accounts table (4 columns, 1,000,000 tuples)
>   malloc calls   allocated(bytes)   unused(bytes)  execution time
> --
> 8.1.0   80,542164,950,016   2,946,402   6.80 sec.
> patched 86161,478,656 177,650   6.73 sec.

This hardly seems credible --- your patch would result in more wasted
memory, not less.  It looks to me like the instrumentation you added
assumes that extra space in a malloc block will never be used later,
which of course is not true ...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Space management for PGresult

2005-11-24 Thread Atsushi Ogawa
Alvaro Herrera wrote:
> Tom Lane wrote:
> > Atsushi Ogawa <[EMAIL PROTECTED]> writes:
> > > The number of malloc calls at pqResultAlloc:
> > >  8.1.0  : 80542
> > >  patched:86
> >
> > > Execution time:
> > >  8.1.0  : 6.80 sec
> > >  patched: 6.73 sec
> >
> > This hardly seems worth adding any complexity for ...
>
> What about memory usage?  Is there a notorious difference?

Well, I measured memory usage by attached patch. An allocated(bytes)
is total amounts of allocated memory by pqResultAlloc. An unused(bytes)
is total amounts of PGresult->spaceLeft.

(1)accounts table (4 columns, 1,000,000 tuples)
  malloc calls   allocated(bytes)   unused(bytes)  execution time
--
8.1.0   80,542164,950,016   2,946,402   6.80 sec.
patched 86161,478,656 177,650   6.73 sec.

(2)another teble (50 columns, 100,000 tuples)
  malloc calls   allocated(bytes)   unused(bytes)  execution time
--
8.1.0   55,557113,780,736   8,561,518   6.26 sec.
patched 86104,855,552  83,307   6.21 sec.

The unused memory increases when the number of columns increases. The
tuple size of PGresult is proportional to the number of columns.

getAnotherTuple() at fe-protocol3.c:
--
conn->curTuple = (PGresAttValue *)
pqResultAlloc(result, nfields * sizeof(PGresAttValue), TRUE);
--

regards,

--- Atsushi Ogawa


libpq_memusage.patch
Description: Binary data

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Space management for PGresult

2005-11-23 Thread Alvaro Herrera
Tom Lane wrote:
> Atsushi Ogawa <[EMAIL PROTECTED]> writes:
> > The number of malloc calls at pqResultAlloc:
> >  8.1.0  : 80542
> >  patched:86
> 
> > Execution time:
> >  8.1.0  : 6.80 sec
> >  patched: 6.73 sec
> 
> This hardly seems worth adding any complexity for ...

What about memory usage?  Is there a notorious difference?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Space management for PGresult

2005-11-23 Thread Tom Lane
Atsushi Ogawa <[EMAIL PROTECTED]> writes:
> The number of malloc calls at pqResultAlloc:
>  8.1.0  : 80542
>  patched:86

> Execution time:
>  8.1.0  : 6.80 sec
>  patched: 6.73 sec

This hardly seems worth adding any complexity for ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org