subject:"\[HACKERS\] PostgreSQL Process memory architecture"

Re: [HACKERS] PostgreSQL Process memory architecture

2013-06-04 Thread Merlin Moncure

On Tue, Jun 4, 2013 at 12:57 AM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
 No matter how I try to redesign the schema the indexes consume large amount 
 of memory,
 About 8KB per index.

8KB per index -- is that a typo?  that doesn't seem like a lot to me.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-06-04 Thread Ben Zeev, Lior

No it isn't a typo,
All the tables are empty and all the indexes are empty

-Original Message-
From: Merlin Moncure [mailto:mmonc...@gmail.com] 
Sent: Tuesday, June 04, 2013 16:10
To: Ben Zeev, Lior
Cc: Atri Sharma; Stephen Frost; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

On Tue, Jun 4, 2013 at 12:57 AM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
 No matter how I try to redesign the schema the indexes consume large 
 amount of memory, About 8KB per index.

8KB per index -- is that a typo?  that doesn't seem like a lot to me.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-06-03 Thread Ben Zeev, Lior

Hi Stephen,

I have some basic question -  How do I add this flags CATCACHE_STATS and 
CATCACHE_FORCE_RELEASE when building postgresql?

I added it to src/Makefile.global in this line:
CPPFLAGS =  -D_GNU_SOURCE -DCATCACHE_STATS -DCATCACHE_FORCE_RELEASE

And changed log level to debug2, but it doesn't log the catcache statistcs

Lior


-Original Message-
From: Stephen Frost [mailto:sfr...@snowman.net] 
Sent: Monday, May 27, 2013 16:44
To: Ben Zeev, Lior
Cc: Atri Sharma; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 Each query is running in a separate transaction.

Interesting.  You might also compile with CATCACHE_STATS (and not 
CATCACHE_FORCE_RELEASE, or perhaps with and without) and then check out your 
logs after the process ends (you might need to increase the logging level to 
DEBUG2 if you don't see anything initially).

 Why does portioning is done better rather than using partial index?

There's a couple of reasons, but for one thing, you can do parallel loading of 
data into partitioned tables (particularly if you refer to the individual 
partitions directly rather than going through the top-level table with a 
trigger or similar).  Trying to parallel load into one table with 500 indexes 
would be pretty painful, I expect.

Thanks,

Stephen


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-06-03 Thread Ben Zeev, Lior

No matter how I try to redesign the schema the indexes consume large amount of 
memory,
About 8KB per index.

Is there a way to invalidated this cache?
Is there a way to limit the amount of memory and use some kind of LRU/LFU 
algorithm to clean old cache?



-Original Message-
From: Atri Sharma [mailto:atri.j...@gmail.com] 
Sent: Monday, May 27, 2013 17:24
To: Stephen Frost
Cc: Ben Zeev, Lior; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

  We may still be able to do better than what we're doing
 today, but I'm still suspicious that you're going to run into other 
 issues with having 500 indexes on a table anyway.

+1. I am suspicious that the large number of indexes is the problem
here,even if the problem is not with book keeping associated with those indexes.

Regards,

Atri


--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-28 Thread Robert Haas

On Mon, May 27, 2013 at 10:23 AM, Atri Sharma atri.j...@gmail.com wrote:
   We may still be able to do better than what we're doing
 today, but I'm still suspicious that you're going to run into other
 issues with having 500 indexes on a table anyway.

 +1. I am suspicious that the large number of indexes is the problem
 here,even if the problem is not with book keeping associated with
 those indexes.

Right.  The problem seems likely to be that each additional index
requires a relcache entry, which uses some backend-local memory.  But
NOT having those backend-local relcache entries would likely be
devastating for performance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-28 Thread Merlin Moncure

On Mon, May 27, 2013 at 7:29 AM, Stephen Frost sfr...@snowman.net wrote:
 * Atri Sharma (atri.j...@gmail.com) wrote:
 Yes, too many indexes wont hurt much.BTW,wont making too many indexes
 on columns that probably dont have as many values as to deserve
 them(so,essentially,indiscriminately making indexes) hurt the
 performance/memory usage?

 I'd expect the performance issue would be from planner time more than
 memory usage- but if there is a serious memory usage issue here, then
 it'd be valuable to have a test case showing what's happening.  We may
 not be releasing the sys cache in some cases or otherwise have a bug in
 this area.

Note, backends do use private memory to cache various things
(relcache, etc).   Absolutely pathological workloads (tons of tables,
tons of (especially) views, etc can exercise this into the gigabytes
and there is no effective way to monitor and control it.  Normally,
it's not a very big deal though.

So, to be a bit more specific, the index *data* (like all on disk
structures) is buffered in shared memory.  But certain plans/metadata
stuff is in private memory.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Ben Zeev, Lior

Hi,

I have a question regarding the memory consumption per process in PostgreSQL 9.2

Does each PostgreSQL process allocating in its own memory (Not shared memory) a 
cache of all the database catalog which it access during the SQL execution?
I mean does each process holds all the catalog indexes data which it accessed, 
all the catalog index statistics etc' accessed

If yes is there a way to avoid this behavior?

(I asked Josh Berkus from PGExperts and he said that each process holds memory 
for sorts, hashes, temp tables, vaccum, etc')

Thanks,
Lior

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

 Does each PostgreSQL process allocating in its own memory (Not shared
 memory) a cache of all the database catalog which it access during the SQL
 execution?

 I mean does each process holds all the catalog indexes data which it
 accessed, all the catalog index statistics etc’ accessed

AFAIK, the shared disk buffers are the only part shared between the processes.

Regards,

Atri




--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

On Mon, May 27, 2013 at 3:41 PM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
 Hi Atri,

 Thanks for your answer!
 Do you have idea what may be the reason that PostgreSQL process consume more 
 memory when there are more partial indexes on the DB table?


Well, I am not too sure, but indexes always take up more space, so if
your backend has a lot of indexes, it will cause the process to
consume more memory.

Indexes should be used with care, as too many indexes can cause a
memory overhead,which can cause performance degradations.

Regards,

Atri

--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Ben Zeev, Lior

Thanks Atri!

Do you know why PostgreSQL store the indexes in memory per process and not in 
the shared memory?
Is there a way to prevent it store the indexes data per process, and force it 
storing it in the shared memory?

Lior



-Original Message-
From: Atri Sharma [mailto:atri.j...@gmail.com] 
Sent: Monday, May 27, 2013 13:19
To: Ben Zeev, Lior; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

On Mon, May 27, 2013 at 3:41 PM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
 Hi Atri,

 Thanks for your answer!
 Do you have idea what may be the reason that PostgreSQL process consume more 
 memory when there are more partial indexes on the DB table?


Well, I am not too sure, but indexes always take up more space, so if your 
backend has a lot of indexes, it will cause the process to consume more memory.

Indexes should be used with care, as too many indexes can cause a memory 
overhead,which can cause performance degradations.

Regards,

Atri

--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

On Mon, May 27, 2013 at 3:55 PM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
 Thanks Atri!

 Do you know why PostgreSQL store the indexes in memory per process and not in 
 the shared memory?
 Is there a way to prevent it store the indexes data per process, and force it 
 storing it in the shared memory?


Ok, sorry for a bit of a confusion here.

I am assuming that the multiple processes are accessing the same database.

What happens essentially is that each index is stored as a separate
file in the data directory of the database in the directory of the
cluster in which your database belongs.

So,indexes are essentially stored the same way as tables, in form of
files which are accessed in 8K blocks.

If your index is big/you have too many indexes in your database, it
should affect *all* backends accessing that specific database.

So,my point is that,there is no question of indexes being stored in
shared memory or individually. You can treat indexes the same as your
tables,from the point of view of physical storage.

For more details,you can see

http://www.postgresql.org/docs/8.1/static/storage.html

Regards,

Atri
--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Amit Langote

On Mon, May 27, 2013 at 7:25 PM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
 Thanks Atri!

 Do you know why PostgreSQL store the indexes in memory per process and not in 
 the shared memory?
 Is there a way to prevent it store the indexes data per process, and force it 
 storing it in the shared memory?


An index is built in backend process's local memory, but, when
accessing, index pages are stored in shared memory. That is, for
example, when an index scan is performed, index pages are brought into
shared memory and accessed from there.


--
Amit Langote


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

 An index is built in backend process's local memory, but, when
 accessing, index pages are stored in shared memory. That is, for
 example, when an index scan is performed, index pages are brought into
 shared memory and accessed from there.



Yes, brought into the shared disk buffers and read,just like tables are read.

Regards,

Atri


--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

* Atri Sharma (atri.j...@gmail.com) wrote:
  Does each PostgreSQL process allocating in its own memory (Not shared
  memory) a cache of all the database catalog which it access during the SQL
  execution?

This information is pulled into a backend-local cache, but it should
only be cached while it's needed and then purged out to allow for new
data coming in.  It would be great if we could understand what the issue
is that you're seeing.

  I mean does each process holds all the catalog indexes data which it
  accessed, all the catalog index statistics etc’ accessed

Each backend shouldn't try to hold all the data, if there is pressure
for that memory.

 AFAIK, the shared disk buffers are the only part shared between the processes.

There's a bit of other information shared, but disk buffers are
certainly the bulk of it.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

* Atri Sharma (atri.j...@gmail.com) wrote:
 On Mon, May 27, 2013 at 3:41 PM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
  Do you have idea what may be the reason that PostgreSQL process consume 
  more memory when there are more partial indexes on the DB table?

It might use a bit more, but it shouldn't be excessive..  What, exactly,
are you seeing and would it be possible for you to provide a repeatable
test case with a small-ish set of data?

 Well, I am not too sure, but indexes always take up more space, so if
 your backend has a lot of indexes, it will cause the process to
 consume more memory.

Indexes require additional disk space, certainly.  Having a lot of
indexes, by itself, shouldn't seriously increase memory usage.

 Indexes should be used with care, as too many indexes can cause a
 memory overhead,which can cause performance degradations.

This is not generally a reason to avoid indexes.  Indexes require more
disk space and must be kept up to date, making them expensive to
maintain due to increased disk i/o.  Building an index uses as much
memory as it's allowed to- it uses maintenance_work_mem to limit itself.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

 AFAIK, the shared disk buffers are the only part shared between the 
 processes.

 There's a bit of other information shared, but disk buffers are
 certainly the bulk of it.

The other information being locks?

Regards,

Atri

--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

Lior,

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 Do you know why PostgreSQL store the indexes in memory per process and not in 
 the shared memory?

The disk blocks from an index are not stored per-process, they are kept
in shared memory.  When building an index, PG can only use one process
and so there isn't any point having that be in shared memory.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

Lior,

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 Does each PostgreSQL process allocating in its own memory (Not shared memory) 
 a cache of all the database catalog which it access during the SQL execution?

PG will look up and cache the catalog information regarding all of the
relations involved, yes.  In *accessing* those relations, PG will pull
needed blocks into shared buffers.  PG will use backend-local memory to
process through the data (generally on a per-tuple basis).

 I mean does each process holds all the catalog indexes data which it 
 accessed, all the catalog index statistics etc' accessed

Catalog information (eg: information in pg_class) is kept, but the
*data* will only be pulled through shared buffers and then processed.
Anything in shared buffers (eg: the data in the tables or indexes) will
be cleaned up as new blocks are needed which push out old ones.

 If yes is there a way to avoid this behavior?

Catalog information is only cached- if the information isn't being used
then it should get purged out in favor of new data which is needed.  Can
you explain a bit more exactly what the issue is..?

 (I asked Josh Berkus from PGExperts and he said that each process holds 
 memory for sorts, hashes, temp tables, vaccum, etc')

Correct, most backend local usage of memory is for running queries and
doing what is required in those queries.  Regarding temp tables, you can
control how much memory is used for those with the temp_buffers
parameter.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

* Atri Sharma (atri.j...@gmail.com) wrote:
 If your index is big/you have too many indexes in your database, it
 should affect *all* backends accessing that specific database.

More indexes will require more disk space, certainly, but tablespaces
can be used to seperate databases, or tables, or indexes on to different
partitions on the host server.

 So,my point is that,there is no question of indexes being stored in
 shared memory or individually. You can treat indexes the same as your
 tables,from the point of view of physical storage.

Correct.

 For more details,you can see
 
 http://www.postgresql.org/docs/8.1/static/storage.html

A better place to look would be the documentation for the release of PG
which you are on, or the latest release otherwise, which is:

http://www.postgresql.org/docs/9.2/static/storage.html

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

 A better place to look would be the documentation for the release of PG
 which you are on, or the latest release otherwise, which is:

 http://www.postgresql.org/docs/9.2/static/storage.html

Oops,yes,sorry about that.

Thanks a ton for pointing that out.

Regards,

Atri


--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

 This is not generally a reason to avoid indexes.  Indexes require more
 disk space and must be kept up to date, making them expensive to
 maintain due to increased disk i/o.  Building an index uses as much
 memory as it's allowed to- it uses maintenance_work_mem to limit itself.

Yes, too many indexes wont hurt much.BTW,wont making too many indexes
on columns that probably dont have as many values as to deserve
them(so,essentially,indiscriminately making indexes) hurt the
performance/memory usage?

Regards,

Atri

--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Amit Langote

On Mon, May 27, 2013 at 9:16 PM, Atri Sharma atri.j...@gmail.com wrote:
 AFAIK, the shared disk buffers are the only part shared between the 
 processes.

 There's a bit of other information shared, but disk buffers are
 certainly the bulk of it.

 The other information being locks?

CreateSharedMemoryAndSemaphores() (src/backend/storage/ipc/ipci.c)
seems to be the place where we can see what all things reside in
shared memory, since at the beginning of the function, you can see
size being computed for shared memory to hold all the things that need
to be in shared memory.


--
Amit Langote


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

* Atri Sharma (atri.j...@gmail.com) wrote:
  There's a bit of other information shared, but disk buffers are
  certainly the bulk of it.
 
 The other information being locks?

Depends, but yes.  Per-row locks are actually in the disk cache portion
of shared buffers, but heavyweight locks have their own area.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

* Atri Sharma (atri.j...@gmail.com) wrote:
 Yes, too many indexes wont hurt much.BTW,wont making too many indexes
 on columns that probably dont have as many values as to deserve
 them(so,essentially,indiscriminately making indexes) hurt the
 performance/memory usage?

I'd expect the performance issue would be from planner time more than
memory usage- but if there is a serious memory usage issue here, then
it'd be valuable to have a test case showing what's happening.  We may
not be releasing the sys cache in some cases or otherwise have a bug in
this area.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

 I'd expect the performance issue would be from planner time more than
 memory usage- but if there is a serious memory usage issue here, then
 it'd be valuable to have a test case showing what's happening.  We may
 not be releasing the sys cache in some cases or otherwise have a bug in
 this area.

Right, this does sound interesting. Thanks a ton!

Regards,

Atri


--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Ben Zeev, Lior

Hi Stephen,

The case which I'm seeing is that I have an empty table without any rows,
Create table test (
  Num Integer,
  C1 character varying(512),
  C2 character varying(512),
  C3 character varying(512));

I create several partial indexes on this table:

Create index(index_1_c1) on test(c1) where Num=1;
Create index(index_2_c1) on test(c1) where Num=2;
Create index(index_1_c2) on test(c1) where Num=1;
Create index(index_2_c2) on test(c1) where Num=2;
...

This doesn't consume much memory on the PostgreSQL backend process,
But if I create 500 indexes It consume several MB of memory.

If I have 10 tables with 500 indexes each PostgreSql backend process consume 
20MB,
If I have 100 tables with 500 indexes each PostgreSQL backend process consume 
200MB

All tables are empty without data.

If have Connection pool of 100 connections then All this processes consume 
100*200MB = 20GB of memory

What is the reason to consume so much memory for empty indexes?

Thanks,
Lior


-Original Message-
From: Stephen Frost [mailto:sfr...@snowman.net] 
Sent: Monday, May 27, 2013 15:16
To: Atri Sharma
Cc: Ben Zeev, Lior; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

* Atri Sharma (atri.j...@gmail.com) wrote:
 On Mon, May 27, 2013 at 3:41 PM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
  Do you have idea what may be the reason that PostgreSQL process consume 
  more memory when there are more partial indexes on the DB table?

It might use a bit more, but it shouldn't be excessive..  What, exactly, are 
you seeing and would it be possible for you to provide a repeatable test case 
with a small-ish set of data?

 Well, I am not too sure, but indexes always take up more space, so if 
 your backend has a lot of indexes, it will cause the process to 
 consume more memory.

Indexes require additional disk space, certainly.  Having a lot of indexes, by 
itself, shouldn't seriously increase memory usage.

 Indexes should be used with care, as too many indexes can cause a 
 memory overhead,which can cause performance degradations.

This is not generally a reason to avoid indexes.  Indexes require more disk 
space and must be kept up to date, making them expensive to maintain due to 
increased disk i/o.  Building an index uses as much memory as it's allowed to- 
it uses maintenance_work_mem to limit itself.

Thanks,

Stephen


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

On Mon, May 27, 2013 at 6:02 PM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
 Hi Stephen,

 The case which I'm seeing is that I have an empty table without any rows,
 Create table test (
   Num Integer,
   C1 character varying(512),
   C2 character varying(512),
   C3 character varying(512));

 I create several partial indexes on this table:

 Create index(index_1_c1) on test(c1) where Num=1;
 Create index(index_2_c1) on test(c1) where Num=2;
 Create index(index_1_c2) on test(c1) where Num=1;
 Create index(index_2_c2) on test(c1) where Num=2;


It is just a hunch, but all of your attributes are character varying.
Could TOAST be an issue here?

Regards,

Atri


--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Ben Zeev, Lior

Hi Atri,

But TOAST only occur if the tuple size exceed 2KB, doesn't it?

Lior


-Original Message-
From: Atri Sharma [mailto:atri.j...@gmail.com] 
Sent: Monday, May 27, 2013 15:39
To: Ben Zeev, Lior
Cc: Stephen Frost; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

On Mon, May 27, 2013 at 6:02 PM, Ben Zeev, Lior lior.ben-z...@hp.com wrote:
 Hi Stephen,

 The case which I'm seeing is that I have an empty table without any 
 rows, Create table test (
   Num Integer,
   C1 character varying(512),
   C2 character varying(512),
   C3 character varying(512));

 I create several partial indexes on this table:

 Create index(index_1_c1) on test(c1) where Num=1; Create 
 index(index_2_c1) on test(c1) where Num=2; Create index(index_1_c2) on 
 test(c1) where Num=1; Create index(index_2_c2) on test(c1) where 
 Num=2;


It is just a hunch, but all of your attributes are character varying.
Could TOAST be an issue here?

Regards,

Atri


--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

Lior,

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 The case which I'm seeing is that I have an empty table without any rows,
 Create table test (
   Num Integer,
   C1 character varying(512),
   C2 character varying(512),
   C3 character varying(512));
 
 I create several partial indexes on this table:
 
 Create index(index_1_c1) on test(c1) where Num=1;
 Create index(index_2_c1) on test(c1) where Num=2;
 Create index(index_1_c2) on test(c1) where Num=1;
 Create index(index_2_c2) on test(c1) where Num=2;
 ...
 
 This doesn't consume much memory on the PostgreSQL backend process,
 But if I create 500 indexes It consume several MB of memory.

When are you seeing this memory utilization..?  When running a query
against that table?  At backend start?

 If I have 10 tables with 500 indexes each PostgreSql backend process consume 
 20MB,
 If I have 100 tables with 500 indexes each PostgreSQL backend process consume 
 200MB
 
 All tables are empty without data.

Are you accessing all of those tables inside of one query?  Or one
transaction, or..?

 What is the reason to consume so much memory for empty indexes?

I'm curious what you would expect to be happening here.  We need to pull
in information about the index in order to consider it during planning.
Special-caseing empty indexes might be possible, but what's the point of
having hundreds of empty indexes against a table in the first place?

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

* Atri Sharma (atri.j...@gmail.com) wrote:
 It is just a hunch, but all of your attributes are character varying.
 Could TOAST be an issue here?

TOAST tables are only created when needed.  In addition, I believe
Lior's concerned about memory utilization and not disk usage; memory
utilization should not be impacted by TOAST at all unless large values
in the tables (which had to be moved to a TOAST table due to size) are
actually being queried against.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Hannu Krosing

On 05/27/2013 01:25 PM, Ben Zeev, Lior wrote:
 Thanks Atri!

 Do you know why PostgreSQL store the indexes in memory per process and not in 
 the shared memory?
From shared_buffers point of view tables and indexes are identical, both
use the
same shared memory in (usually) 8KB pages
 Is there a way to prevent it store the indexes data per process, and force it 
 storing it in the shared memory?
It already does.

Per-query sorts and hashtables are stored in local memory, ordinary
tables and indexes are in shared.


-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Ben Zeev, Lior

Hi Stephen,

Yes, The memory utilization per PostgreSQL backend process is when running 
queries against this tables,
For example: select * from test where num=2 and c2='abc'
When It start it doesn't consume to much memory,
But as it execute against more and more indexes the memory consumption grows

This tables should contain data, But I truncate the data of the tables because 
I wanted to make sure that the memory consumption is not relate to the data 
inside the table, but rather to the structure of the tables 

Thanks,
Lior


-Original Message-
From: Stephen Frost [mailto:sfr...@snowman.net] 
Sent: Monday, May 27, 2013 15:43
To: Ben Zeev, Lior
Cc: Atri Sharma; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

Lior,

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 The case which I'm seeing is that I have an empty table without any 
 rows, Create table test (
   Num Integer,
   C1 character varying(512),
   C2 character varying(512),
   C3 character varying(512));
 
 I create several partial indexes on this table:
 
 Create index(index_1_c1) on test(c1) where Num=1; Create 
 index(index_2_c1) on test(c1) where Num=2; Create index(index_1_c2) on 
 test(c1) where Num=1; Create index(index_2_c2) on test(c1) where 
 Num=2; ...
 
 This doesn't consume much memory on the PostgreSQL backend process, 
 But if I create 500 indexes It consume several MB of memory.

When are you seeing this memory utilization..?  When running a query against 
that table?  At backend start?

 If I have 10 tables with 500 indexes each PostgreSql backend process 
 consume 20MB, If I have 100 tables with 500 indexes each PostgreSQL 
 backend process consume 200MB
 
 All tables are empty without data.

Are you accessing all of those tables inside of one query?  Or one transaction, 
or..?

 What is the reason to consume so much memory for empty indexes?

I'm curious what you would expect to be happening here.  We need to pull in 
information about the index in order to consider it during planning.
Special-caseing empty indexes might be possible, but what's the point of having 
hundreds of empty indexes against a table in the first place?

Thanks,

Stephen


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

Lior,

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 Yes, The memory utilization per PostgreSQL backend process is when running 
 queries against this tables,
 For example: select * from test where num=2 and c2='abc'
 When It start it doesn't consume to much memory,
 But as it execute against more and more indexes the memory consumption grows

Are these all running in one transaction, or is this usage growth across
multiple transactions?  If this is all in the same transaction, what
happens when you do these queries in independent transactions?

 This tables should contain data, But I truncate the data of the tables 
 because I wanted to make sure that the memory consumption is not relate to 
 the data inside the table, but rather to the structure of the tables 

If you actually have sufficient data to make having 500 indexes on a
table sensible, it strikes me that this memory utilization may not be
the biggest issue you run into.  If you're looking for partitioning,
that's much better done, in PG at least, by using inheiritance and
constraint exclusion.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Ben Zeev, Lior

Hi Stephen,

Each query is running in a separate transaction.

Why does portioning is done better rather than using partial index?

Thanks,
Lior


-Original Message-
From: Stephen Frost [mailto:sfr...@snowman.net] 
Sent: Monday, May 27, 2013 16:15
To: Ben Zeev, Lior
Cc: Atri Sharma; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

Lior,

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 Yes, The memory utilization per PostgreSQL backend process is when 
 running queries against this tables, For example: select * from test where 
 num=2 and c2='abc'
 When It start it doesn't consume to much memory, But as it execute 
 against more and more indexes the memory consumption grows

Are these all running in one transaction, or is this usage growth across 
multiple transactions?  If this is all in the same transaction, what happens 
when you do these queries in independent transactions?

 This tables should contain data, But I truncate the data of the tables 
 because I wanted to make sure that the memory consumption is not 
 relate to the data inside the table, but rather to the structure of 
 the tables

If you actually have sufficient data to make having 500 indexes on a table 
sensible, it strikes me that this memory utilization may not be the biggest 
issue you run into.  If you're looking for partitioning, that's much better 
done, in PG at least, by using inheiritance and constraint exclusion.

Thanks,

Stephen


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

Lior,

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 Yes, The memory utilization per PostgreSQL backend process is when running 
 queries against this tables,
 For example: select * from test where num=2 and c2='abc'
 When It start it doesn't consume to much memory,
 But as it execute against more and more indexes the memory consumption grows

It might be interesting, if possible for you, to recompile PG with
-DCATCACHE_FORCE_RELEASE, which should cause PG to immediately release
cached information when it's no longer being used.  You'll be trading
memory usage for CPU cycles, of course, but it might be better for your
situation.  We may still be able to do better than what we're doing
today, but I'm still suspicious that you're going to run into other
issues with having 500 indexes on a table anyway.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Ben Zeev, Lior

Great, Thanks !!!
I will try and let you update 

-Original Message-
From: Stephen Frost [mailto:sfr...@snowman.net] 
Sent: Monday, May 27, 2013 16:29
To: Ben Zeev, Lior
Cc: Atri Sharma; Pg Hackers
Subject: Re: [HACKERS] PostgreSQL Process memory architecture

Lior,

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 Yes, The memory utilization per PostgreSQL backend process is when 
 running queries against this tables, For example: select * from test where 
 num=2 and c2='abc'
 When It start it doesn't consume to much memory, But as it execute 
 against more and more indexes the memory consumption grows

It might be interesting, if possible for you, to recompile PG with 
-DCATCACHE_FORCE_RELEASE, which should cause PG to immediately release cached 
information when it's no longer being used.  You'll be trading memory usage for 
CPU cycles, of course, but it might be better for your situation.  We may still 
be able to do better than what we're doing today, but I'm still suspicious that 
you're going to run into other issues with having 500 indexes on a table anyway.

Thanks,

Stephen


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Stephen Frost

* Ben Zeev, Lior (lior.ben-z...@hp.com) wrote:
 Each query is running in a separate transaction.

Interesting.  You might also compile with CATCACHE_STATS (and not
CATCACHE_FORCE_RELEASE, or perhaps with and without) and then check out
your logs after the process ends (you might need to increase the logging
level to DEBUG2 if you don't see anything initially).

 Why does portioning is done better rather than using partial index?

There's a couple of reasons, but for one thing, you can do parallel
loading of data into partitioned tables (particularly if you refer to
the individual partitions directly rather than going through the
top-level table with a trigger or similar).  Trying to parallel load
into one table with 500 indexes would be pretty painful, I expect.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL Process memory architecture

2013-05-27 Thread Atri Sharma

  We may still be able to do better than what we're doing
 today, but I'm still suspicious that you're going to run into other
 issues with having 500 indexes on a table anyway.

+1. I am suspicious that the large number of indexes is the problem
here,even if the problem is not with book keeping associated with
those indexes.

Regards,

Atri


--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

38 matches

Mail list logo