Re: [HACKERS] using a lot of maintenance_work_mem

2011-04-14 Thread Jim Nasby
On Apr 9, 2011, at 9:23 PM, Stephen Frost wrote:
 Actually, Tom has a point in that work_mem can be set above 1GB (which
 is where I had it set previously..).  I didn't think it'd actually do
 anything given the MaxAlloc limit, but surprisingly, it does (at least,
 under 8.4).  I'm currently trying to see if we've got anything that's
 going to *break* with work_mem set up that high; right now I have a
 hashagg plan running across this data set which has 2.4G allocated to
 it so far.
 
 I'll update this thread with whatever I find out.  I'm trying to
 remember the other issues that I ran in to with this limit (beyond the
 whole sort limit, which I do think would be helped by allowing a larger
 value, but it's not as big a deal).

FWIW, I regularly set maintenance_work_mem to 8G for index builds. Presumably 
that's equivalent to running a sort in a regular query with work_mem set that 
high...
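
As a rough illustration of that kind of session (a sketch only; the table,
column, and index names are placeholders, and the 8GB figure is simply the
value mentioned above):

    -- illustrative only: per-session bump for a big index build
    SET maintenance_work_mem = '8GB';
    CREATE INDEX really_big_table_customer_code_idx
        ON really_big_table (customer_code);
    -- The setting is accepted above 1GB, but as discussed later in this
    -- thread, tuplesort's SortTuple pointer array is still capped at
    -- MaxAllocSize (1GB), so the sort cannot use all 8GB for that array.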
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] using a lot of maintenance_work_mem

2011-04-09 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 If we were actually trying to support such large allocations,
 what I'd be inclined to do is introduce a separate call along the lines
 of MemoryContextAllocLarge() that lacks the safety check.  

 This sounds like the right approach to me.  Basically, I'd like to have
 MemoryContextAllocLarge(), on 64bit platforms, and have it be used for
 things like sorts and hash tables.  We'd need to distinguish that usage
 from things which allocate varlena's and the like.

Yes, but ...

 But before
 expending time on that, I'd want to see some evidence that it's actually
 helpful for production situations.  I'm a bit dubious that you're going
 to gain much here.

 I waited ~26hrs for a rather simple query:

The fact that X is slow does not prove anything about whether Y will
make it faster.  In particular I see nothing here showing that this
query is bumping up against the 1GB-for-sort-pointers limit, or that
if it is, any significant gain would result from increasing that.
I think the only real way to prove that is to hack the source code
to remove the limit and see what happens.  (You could try using malloc
directly, not palloc at all, to have a non-production-quality but
very localized patch to test.)

BTW, it sounded like your argument had to do with whether it would use
HashAgg or not -- that is *not* dependent on the per-palloc limit, and
never has been.

regards, tom lane



Re: [HACKERS] using a lot of maintenance_work_mem

2011-04-09 Thread Greg Stark
On Sat, Apr 9, 2011 at 6:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, it sounded like your argument had to do with whether it would use
 HashAgg or not -- that is *not* dependent on the per-palloc limit, and
 never has been.


His point was he wanted to be allowed to set work_mem > 1GB. This is
going to become a bigger and bigger problem with 72-128GB and larger
machines already becoming quite standard.

-- 
greg



Re: [HACKERS] using a lot of maintenance_work_mem

2011-04-09 Thread Joshua D. Drake
On Sun, 2011-04-10 at 03:05 +0100, Greg Stark wrote:
 On Sat, Apr 9, 2011 at 6:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  BTW, it sounded like your argument had to do with whether it would use
  HashAgg or not -- that is *not* dependent on the per-palloc limit, and
  never has been.
 
 
 His point was he wanted to be allowed to set work_mem > 1GB. This is
 going to become a bigger and bigger problem with 72-128GB and larger
 machines already becoming quite standard.
 

Yes it is, it even came up at East. 1GB just doesn't cut it anymore... 

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt




Re: [HACKERS] using a lot of maintenance_work_mem

2011-04-09 Thread Stephen Frost
* Greg Stark (gsst...@mit.edu) wrote:
 On Sat, Apr 9, 2011 at 6:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  BTW, it sounded like your argument had to do with whether it would use
  HashAgg or not -- that is *not* dependent on the per-palloc limit, and
  never has been.
 
 His point was he wanted to be allowed to set work_mem > 1GB. This is
 going to become a bigger and bigger problem with 72-128GB and larger
 machines already becoming quite standard.

Actually, Tom has a point in that work_mem can be set above 1GB (which
is where I had it set previously..).  I didn't think it'd actually do
anything given the MaxAlloc limit, but surprisingly, it does (at least,
under 8.4).  I'm currently trying to see if we've got anything that's
going to *break* with work_mem set up that high; right now I have a
hashagg plan running across this data set which has 2.4G allocated to
it so far.

I'll update this thread with whatever I find out.  I'm trying to
remember the other issues that I ran in to with this limit (beyond the
whole sort limit, which I do think would be helped by allowing a larger
value, but it's not as big a deal).

Thanks,

Stephen




Re: [HACKERS] using a lot of maintenance_work_mem

2011-04-08 Thread Stephen Frost
Tom, all,

Having run into issues caused by small work_mem, again, I felt the need
to respond to this.

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 You would break countless things.  It might be okay anyway in a trusted
 environment, ie, one without users trying to crash the system, but there
 are a lot of security-critical implications of that test.

I really don't see work_mem or maintenance_work_mem as security-related
parameters.  Amusingly, the Postgres95 1.01 release apparently attempted
to make the cap 16GB (but failed and made it 256M instead).  After a bit
of poking around, I found this commit:

commit 85c17dbff8ade0c5237e3ac1ece7cacacfdde399
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Tue Feb 6 01:53:53 2001 +

Out-of-bounds memory allocation request sizes should be treated as just
elog(ERROR) not an Assert trap, since we've downgraded out-of-memory to
elog(ERROR) not a fatal error.  Also, change the hard boundary from 256Mb
to 1Gb, just so that anyone who's actually got that much memory to spare
can play with TOAST objects approaching a gigabyte.

If we want to implement a system to limit what users can request with
regard to work_mem then we can do that, but a smart user could probably
circumvent such a system by building huge queries..  A system which
monitored actual usage and ERROR'd out would probably be better to
address that concern.

 If we were actually trying to support such large allocations,
 what I'd be inclined to do is introduce a separate call along the lines
 of MemoryContextAllocLarge() that lacks the safety check.  

This sounds like the right approach to me.  Basically, I'd like to have
MemoryContextAllocLarge(), on 64bit platforms, and have it be used for
things like sorts and hash tables.  We'd need to distinguish that usage
from things which allocate varlena's and the like.
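
A minimal sketch of what such a call might look like (hypothetical: no
MemoryContextAllocLarge() exists at this point, and the name of the ceiling
and its value below are illustrative, not actual PostgreSQL API):

    /*
     * Hypothetical sketch only.  Same contract as MemoryContextAlloc(), but
     * checked against a much larger ceiling than MaxAllocSize, and intended
     * solely for internal consumers such as tuplesort and hash tables --
     * never for building varlena values.
     */
    #define MaxAllocLargeSize   ((Size) -1 >> 1)    /* illustrative ceiling */

    void *
    MemoryContextAllocLarge(MemoryContext context, Size size)
    {
        if (size > MaxAllocLargeSize)
            elog(ERROR, "invalid memory alloc request size %lu",
                 (unsigned long) size);

        /* skip AllocSizeIsValid()'s 1GB cap, otherwise behave like palloc */
        return (*context->methods->alloc) (context, size);
    }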

 But before
 expending time on that, I'd want to see some evidence that it's actually
 helpful for production situations.  I'm a bit dubious that you're going
 to gain much here.

I waited ~26hrs for a rather simple query:

explain select
  bunch-of-columns,
  bunch-of-aggregates
from really_big_table
where customer_code ~ '^CUST123'
group by
  bunch-of-columns
;

                                     QUERY PLAN
-------------------------------------------------------------------------------------------
 GroupAggregate  (cost=37658456.68..42800117.89 rows=10546998 width=146)
   ->  Sort  (cost=37658456.68..37922131.61 rows=105469973 width=146)
         Sort Key: bunch-of-columns
         ->  Seq Scan on really_big_table  (cost=0.00..15672543.00 rows=105469973 width=146)
               Filter: ((customer_code)::text ~ '^CUST123'::text)
(5 rows)

This query ran for ~26 hours, where ~20 hours was spent sorting the ~30G
which resulted from the Seq-Scan+filter (the raw table is ~101G).  The
resulting table (after the GroupAgg) was only 30MB in size (~80k rows
instead of the estimated 10M above).  Another query against the same
101G table, which used a HashAgg, completed just a bit faster than the
26 hours:

                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=19627666.99..19631059.80 rows=90475 width=116) (actual time=1435604.737..1435618.293 rows=4869 loops=1)
   ->  Seq Scan on really_big_table  (cost=0.00..15672543.00 rows=105469973 width=116) (actual time=221029.805..804802.329 rows=104616597 loops=1)
         Filter: ((agency_hierarchy_code)::text ~ '^CUST123'::text)
 Total runtime: 1435625.388 ms
(4 rows)

Now, this query had fewer columns in the group by (required to convince
PG to use a HashAgg), but, seriously, it only took 23 minutes to scan
through the entire table.  It could have taken 3 hours and I would have
been happy.

Admittedly, part of the problem here is the whole cross-column
correlation stats problem, but I wouldn't care if the stats were right
and I ended up with a 1.5G hash table and a 10M-record result; I'm pretty
sure generating that would be a lot faster using a HashAgg than a
sort+GroupAgg.  Also, I feel like we're pretty far from having the
cross-column statistics fixed, and I'm not 100% convinced that it'd
actually come up with a decent result for this query anyway (there are 18
columns in the group by clause for the first query...).

Anyhow, I just wanted to show that there are definitely cases where the
current limit is making things difficult for real-world PG users on
production systems.

Thanks,

Stephen




Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-20 Thread Devrim GÜNDÜZ
On Wed, 2011-02-16 at 23:24 +0200, Peter Eisentraut wrote:
 
  But before expending time on that, I'd want to see some evidence
 that
  it's actually helpful for production situations.  I'm a bit dubious
  that you're going to gain much here.
 
 If you want to build an index on a 500GB table and you have 1TB RAM,
 then being able to use > 1GB maintenance_work_mem can only be good,
 no? 

That would also probably speed up Slony (or similar) replication engines
in the initial replication phase. I know that I had to wait a lot while
creating big indexes on a machine which had enough RAM.
-- 
Devrim GÜNDÜZ
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz




Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-20 Thread Bruce Momjian
Devrim GÜNDÜZ wrote:
 On Wed, 2011-02-16 at 23:24 +0200, Peter Eisentraut wrote:
  
   But before expending time on that, I'd want to see some evidence
  that
   it's actually helpful for production situations.  I'm a bit dubious
   that you're going to gain much here.
  
  If you want to build an index on a 500GB table and you have 1TB RAM,
  then being able to use > 1GB maintenance_work_mem can only be good,
  no? 
 
 That would also probably speed up Slony (or similar) replication engines
 in the initial replication phase. I know that I had to wait a lot while
 creating big indexes on a machine which had enough RAM.

Well, I figure it will be hard to allow larger maximums, but can we make
the GUC variable maximums be more realistic?  Right now it is
MAX_KILOBYTES (INT_MAX).
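
For reference, the limit being referred to looks roughly like this in guc.h
(paraphrased from memory of that era's source, so treat the exact form as
approximate):

    /*
     * Maximum for integer GUCs measured in kilobytes, such as work_mem and
     * maintenance_work_mem.  On platforms with 64-bit size arithmetic the
     * cap is INT_MAX kB (about 2TB); otherwise it is INT_MAX/1024 kB
     * (about 2GB) so that byte counts still fit in the narrower types.
     */
    #if SIZEOF_SIZE_T > 4 && SIZEOF_LONG > 4
    #define MAX_KILOBYTES   INT_MAX
    #else
    #define MAX_KILOBYTES   (INT_MAX / 1024)
    #endif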

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-20 Thread Bernd Helmle



--On 20. Februar 2011 09:32:02 -0500 Bruce Momjian br...@momjian.us wrote:


Well, I figure it will be hard to allow larger maximums, but can we make
the GUC variable maximums be more realistic?  Right now it is
MAX_KILOBYTES (INT_MAX).


This is something I proposed some time ago, too. At least, it will stop us
from promising something which maintenance_work_mem is not able to deliver.


--
Thanks

Bernd



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-20 Thread Bernd Helmle



--On 20. Februar 2011 15:48:06 +0100 Bernd Helmle maili...@oopsware.de 
wrote:



Well, I figure it will be hard to allow larger maximums, but can we make
the GUC variable maximums be more realistic?  Right now it is
MAX_KILOBYTES (INT_MAX).


This is something I proposed some time ago, too. At least, it will stop
us from promising something which maintenance_work_mem is not able to
deliver.


Hmm, on further reflection a better option might be to just document this
behavior in more detail. I could imagine that giving maintenance_work_mem
a hard upper limit would break countless SQL scripts where it was set just
high enough in the hope of a speed increase...


--
Thanks

Bernd



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-20 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Well, I figure it will be hard to allow larger maximums, but can we make
 the GUC variable maximums be more realistic?  Right now it is
 MAX_KILOBYTES (INT_MAX).

You seem to be confusing one limitation in one code path with the
overall meaning of maintenance_work_mem.

regards, tom lane



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-20 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 You seem to be confusing one limitation in one code path with the
 overall meaning of maintenance_work_mem.

 Oh, OK, so sorts are limited, but not hash sizes?  Are there any other
 uses?  Should this be documented somehow?  What is the actual sort
 limit?

The particular complaint that's being made here is about tuplesort.c's
array of SortTuples, which isn't all (or even the largest part) of its
memory consumption.  The tuples themselves eat significantly more in
nearly all cases.  I don't think there's any very easy way to document
what the largest useful maintenance_work_mem for sorting is based on
that --- you'd have to pull a number for tuple size out of the air.
But it's certainly possible to use up lots of gigabytes when sorting
wide tuples.  I think the original complaint in this thread was about
building an index, which probably had relatively small tuples so the
SortTuple constraint was more pressing.
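
A back-of-the-envelope reading of that constraint (the 24-byte figure is an
assumed approximation of sizeof(SortTuple) on a 64-bit build, not a number
taken from this thread):

    /*
     * MaxAllocSize / sizeof(SortTuple)  ~=  1GB / ~24 bytes  ~=  ~45 million
     * array slots, and since the array grows by doubling, it actually stops
     * at the last doubling that stays under that figure.  Each slot only
     * points at a separately palloc'd tuple, so for wide tuples the tuples
     * themselves exhaust (maintenance_)work_mem long before the slot limit
     * matters; for the narrow tuples of an index build, the slot limit is
     * hit much sooner.
     */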

In any case, this is the sort of thing that'd be far better to fix than
document.

regards, tom lane



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-20 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Well, I figure it will be hard to allow larger maximums, but can we make
  the GUC variable maximums be more realistic?  Right now it is
  MAX_KILOBYTES (INT_MAX).
 
 You seem to be confusing one limitation in one code path with the
 overall meaning of maintenance_work_mem.

Oh, OK, so sorts are limited, but not hash sizes?  Are there any other
uses?  Should this be documented somehow?  What is the actual sort
limit?

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-20 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  You seem to be confusing one limitation in one code path with the
  overall meaning of maintenance_work_mem.
 
  Oh, OK, so sorts are limited, but not hash sizes?  Are there any other
  uses?  Should this be documented somehow?  What is the actual sort
  limit?
 
 The particular complaint that's being made here is about tuplesort.c's
 array of SortTuples, which isn't all (or even the largest part) of its
 memory consumption.  The tuples themselves eat significantly more in
 nearly all cases.  I don't think there's any very easy way to document
 what the largest useful maintenance_work_mem for sorting is based on
 that --- you'd have to pull a number for tuple size out of the air.
 But it's certainly possible to use up lots of gigabytes when sorting
 wide tuples.  I think the original complaint in this thread was about
 building an index, which probably had relatively small tuples so the
 SortTuple constraint was more pressing.
 
 In any case, this is the sort of thing that'd be far better to fix than
 document.

Added to TODO:

Allow sorts to use more available memory

* http://archives.postgresql.org/pgsql-hackers/2007-11/msg01026.php
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01123.php
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg01957.php 

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-18 Thread Frederik Ramm

Tom  Kevin,

   thank you for your replies. Kevin, I had already employed all the
tricks you mention, except using temporary tables, which would be hard
for me due to the structure of my application (though I could try
something like pgbouncer); thanks a lot for sharing the ideas.


Tom Lane wrote:
  If I were to either (a) increase MaxAllocSize to, say, 48 GB instead of
  1 GB, or (b) hack tuplesort.c to ignore MaxAllocSize, just for my local
  setup - would that likely be viable in my situation, or would I break
  countless things?

 You would break countless things.


Indeed I did. I tried to raise the MaxAllocSize from 1 GB to a large 
number, but immediately got strange memory allocation errors during the 
regression test (something that looked like a wrapped integer in a 
memory allocation request).


I reduced the number in steps, and found I could compile and run 
PostgreSQL 8.3 with a MaxAllocSize of 4 GB, and PostgreSQL 9.0 with 2 GB 
without breakage.


In a completely un-scientific test run, comprising 42 individual SQL 
statements aimed at importing and indexing a large volume of data, I got 
the following results:


pg8.3 with normal MaxAllocSize .. 15284s
pg8.3 with MaxAllocSize increased to 4 GB ... 14609s (-4.5%)
pg9.0 with normal MaxAllocSize .. 12969s (-15.2%)
pg9.0 with MaxAllocSize increased to 2 GB ... 13211s (-13.5%)


 I'd want to see some evidence that it's actually
 helpful for production situations.  I'm a bit dubious that you're going
 to gain much here.


So, on the whole it seems you were right; the performance, at least with 
that small memory increase I managed to build in without breaking 
things, doesn't increase a lot, or not at all for PostgreSQL 9.0.


The single query that gained most from the increase in memory was an 
ALTER TABLE statement to add a BIGINT primary key to a table with about 
50 million records - this was 75% faster on both 8.3 and 9.0, but
since it took only 120 seconds to begin with, it didn't change the
overall result much.


The single query where pg9.0 beat pg8.3 by a country mile was a CREATE 
INDEX statement on a BIGINT column of a table with about 500 million
records - this cost 2679 seconds on normal 8.3, 2443 seconds on
large-memory 8.3, and around 1650 seconds on 9.0, large memory or not.


The query that, on both 8.3 and 9.0, took about 10% longer with more 
memory was a CREATE INDEX statement on a TEXT column.


All this, as I said, completely un-scientific - I did take care to flush 
caches and not run anything in parallel, but that was about all I did so 
it might come out differently when run often.


My result of all of this? Switch to 9.0 of course ;)

Bye
Frederik

--
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-18 Thread Tom Lane
Frederik Ramm frede...@remote.org writes:
 The single query where pg9.0 beat pg8.3 by a country mile was a CREATE 
 INDEX statement on a BIGINT column of a table with about 500 million
 records - this cost 2679 seconds on normal 8.3, 2443 seconds on
 large-memory 8.3, and around 1650 seconds on 9.0, large memory or not.

FWIW, that's probably due to bigint having become pass-by-value on
64-bit platforms.

regards, tom lane



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-16 Thread Peter Eisentraut
On Mon, 2011-02-14 at 10:11 -0500, Tom Lane wrote:
 But before expending time on that, I'd want to see some evidence that
 it's actually helpful for production situations.  I'm a bit dubious
 that you're going to gain much here.

If you want to build an index on a 500GB table and you have 1TB RAM,
then being able to use > 1GB maintenance_work_mem can only be good, no?





[HACKERS] using a lot of maintenance_work_mem

2011-02-14 Thread Frederik Ramm

Hi,

I am (ab)using a PostgreSQL database (with PostGIS extension) in a 
large data processing job - each day, I load several GB of data, run a 
lot of analyses on it, and then throw everything away again. Loading, 
running, and dumping the results takes about 18 hours every day.


The job involves a lot of index building and sorting, and is run on a 
64-bit machine with 96 GB of RAM.


Naturally I would like the system to use as much RAM as possible before 
resorting to disk-based operations, but no amount of 
maintenance_work_mem setting seems to make it do my bidding.


I'm using PostgreSQL 8.3 but would be willing and able to upgrade to any 
later version.


Some googling has unearthed the issue - which is likely known to all of 
you, just repeating it to prove I've done my homework - that tuplesort.c 
always tries to double its memory allocation, and will refuse to do so 
if that results in an allocation greater than MaxAllocSize:


 if ((Size) (state->memtupsize * 2) >= MaxAllocSize / sizeof(SortTuple))
     return false;

And MaxAllocSize is hardcoded to 1 GB in memutils.h.

(All this based on Postgres 9.1alpha source - I didn't want to bring 
something up that has been fixed already.)
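
For reference, the relevant definitions in memutils.h look roughly like this
(paraphrased; the exact comments differ between versions):

    #define MaxAllocSize    ((Size) 0x3fffffff)     /* 1 gigabyte - 1 */

    #define AllocSizeIsValid(size)  ((Size) (size) <= MaxAllocSize)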


Now I assume that there are reasons that you're doing this. memutils.h 
has the (for me) cryptic comment about MaxAllocSize: "XXX This is
deliberately chosen to correspond to the limiting size of varlena
objects under TOAST. See VARATT_MASK_SIZE in postgres.h.", but
VARATT_MASK_SIZE has zero other occurrences in the source code.


If I were to either (a) increase MaxAllocSize to, say, 48 GB instead of 
1 GB, or (b) hack tuplesort.c to ignore MaxAllocSize, just for my local 
setup - would that likely be viable in my situation, or would I break 
countless things?


I can afford some experimentation; as I said, I'm throwing away the 
database every day anyway. I just thought I'd solicit your advice before 
I do anything super stupid. - If I can use my setup to somehow 
contribute to further PostgreSQL development by trying out some things, 
I'll be more than happy to do so. I do C/C++ but apart from building 
packages for several platforms, I haven't worked with the PostgreSQL 
source code.


Of course the cop-out solution would be to just create a huge RAM disk 
and instruct PostgreSQL to use that for disk-based sorting. I'll do that 
if all of you say "OMG don't touch MaxAllocSize" ;)


Bye
Frederik

--
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-14 Thread Tom Lane
Frederik Ramm frede...@remote.org writes:
 Now I assume that there are reasons that you're doing this. memutils.h 
 has the (for me) cryptic comment about MaxAllocSize: "XXX This is
 deliberately chosen to correspond to the limiting size of varlena
 objects under TOAST. See VARATT_MASK_SIZE in postgres.h.", but
 VARATT_MASK_SIZE has zero other occurrences in the source code.

Hm, I guess that comment needs updated then.

 If I were to either (a) increase MaxAllocSize to, say, 48 GB instead of 
 1 GB, or (b) hack tuplesort.c to ignore MaxAllocSize, just for my local 
 setup - would that likely be viable in my situation, or would I break 
 countless things?

You would break countless things.  It might be okay anyway in a trusted
environment, ie, one without users trying to crash the system, but there
are a lot of security-critical implications of that test.

If we were actually trying to support such large allocations,
what I'd be inclined to do is introduce a separate call along the lines
of MemoryContextAllocLarge() that lacks the safety check.  But before
expending time on that, I'd want to see some evidence that it's actually
helpful for production situations.  I'm a bit dubious that you're going
to gain much here.

regards, tom lane



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-14 Thread Kevin Grittner
Frederik Ramm frede...@remote.org wrote:
 
 I am (ab)using a PostgreSQL database (with PostGIS extension) in
 a large data processing job - each day, I load several GB of data,
 run a lot of analyses on it, and then throw everything away again.
 Loading, running, and dumping the results takes about 18 hours
 every day.
 
 The job involves a lot of index building and sorting, and is run
 on a 64-bit machine with 96 GB of RAM.
 
 Naturally I would like the system to use as much RAM as possible
 before resorting to disk-based operations, but no amount of 
 maintenance_work_mem setting seems to make it do my bidding.
 
If you can tolerate some risk that for a given day you might fail to
generate the analysis, or you might need to push the schedule back
to get it, you could increase performance by compromising
recoverability.  You seem to be willing to consider such risk based
on your mention of a RAM disk.
 
 - If a single session can be maintained for loading and using the
data, you might be able to use temporary tables and a large
temp_buffers size.  Of course, when the connection closes, the
tables are gone.
 
 - You could turn off fsync and full_page_writes, but on a crash
your database might be corrupted beyond usability.
 
 - You could turn off synchronous_commit.
 
 - Make sure you have archiving turned off.
 
 - If you are not already doing so, load the data into each table
within the same database transaction which does CREATE TABLE or
TRUNCATE TABLE.
 
Other than the possibility that the temp table might keep things in
RAM, these suggestions don't directly address your question, but I
thought they might be helpful.
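 
A minimal sketch of a bulk-load session along those lines (illustrative only:
table, column, and file names are placeholders, and fsync, full_page_writes,
and archive_mode are server-wide settings that have to be changed in
postgresql.conf rather than per session):
 
    -- per-session pieces of the suggestions above
    SET synchronous_commit = off;   -- at worst, lose the last few commits on a crash
    SET temp_buffers = '4GB';       -- must be set before the session touches temp tables
 
    BEGIN;
    CREATE TEMPORARY TABLE staging (LIKE really_big_table);
    COPY staging FROM '/path/to/daily_load.csv' WITH CSV;
    CREATE INDEX staging_customer_code_idx ON staging (customer_code);
    COMMIT;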
 
-Kevin
