Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Craig Ringer
On 09/20/2011 09:21 AM, Jon Nelson wrote: Isn't the WAL basically COW? Nope, it's a lot more like a filesystem journal - though it includes all data, not just metadata like filesystem journals usually do. Now, you could argue that PostgreSQL uses a copy-on-write like system to maintain row

Re: [PERFORM] where is max_fsm_pages in PG9.0?

2011-09-19 Thread Scott Marlowe
On Mon, Sep 19, 2011 at 10:28 PM, Anibal David Acosta wrote: > I have a lot of wasted bytes in some tables. > > Somewhere I read that maybe auto-vacuum can’t release space due to a low > max_fsm_pages setting. It's no longer there, as fsm was moved from memory (which was limited by max fsm pages)

[PERFORM] where is max_fsm_pages in PG9.0?

2011-09-19 Thread Anibal David Acosta
I have a lot of wasted bytes in some tables. Somewhere I read that maybe auto-vacuum can't release space due to a low max_fsm_pages setting. I want to increase it, but I don't found the param in the postgres.conf. This param exists? If not? How can I deal with bloated tables? I have ma

Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Stephen Frost
* Jon Nelson (jnelson+pg...@jamponi.net) wrote: > Isn't the WAL basically COW? eh..? No.. The WAL is used to record what changes are made to the various files in the database, it certainly isn't an kind of "copy-on-write" system, where we wait until a change is made to data before copying it..

Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Jon Nelson
On Mon, Sep 19, 2011 at 7:53 PM, Stephen Frost wrote: > Igor, > > * Igor Chudov (ichu...@gmail.com) wrote: >> Would the time that it takes, differ a great deal, depending on whether the >> table has only 100,000 or 5,000,000 records? > > Yes, because PostgreSQL is going to copy the data.  If you d

Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Stephen Frost
* Igor Chudov (ichu...@gmail.com) wrote: > Well, my question is, rather, whether the time to do a bulk INSERT of N > records into a large table, would take substantially longer than a bulk > insert of N records into a small table. In other words, does the populating > time grow as the table gets mo

Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Igor Chudov
On Mon, Sep 19, 2011 at 6:32 PM, Scott Marlowe wrote: > On Mon, Sep 19, 2011 at 4:11 PM, Igor Chudov wrote: > > Let's say that I want to do INSERT SELECT of 1,000 items into a table. > The > > table has some ints, varchars, TEXT and BLOB fields. > > Would the time that it takes, differ a great de

Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Stephen Frost
Igor, * Igor Chudov (ichu...@gmail.com) wrote: > Would the time that it takes, differ a great deal, depending on whether the > table has only 100,000 or 5,000,000 records? Yes, because PostgreSQL is going to copy the data. If you don't need or want it to be copied, just use a view. I've never h

Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Scott Marlowe
On Mon, Sep 19, 2011 at 4:11 PM, Igor Chudov wrote: > Let's say that I want to do INSERT SELECT of 1,000 items into a table. The > table has some ints, varchars, TEXT and BLOB fields. > Would the time that it takes, differ a great deal, depending on whether the > table has only 100,000 or 5,000,00

[PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Igor Chudov
Let's say that I want to do INSERT SELECT of 1,000 items into a table. The table has some ints, varchars, TEXT and BLOB fields. Would the time that it takes, differ a great deal, depending on whether the table has only 100,000 or 5,000,000 records? Thanks i

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Claudio Freire
On Mon, Sep 19, 2011 at 3:43 PM, Merlin Moncure wrote: > To make the test into i/o bound, I change the setrandom from 10 to > 1000; this produced some unexpected results. The hash index is > pulling about double the tps (~80 vs ~ 40) over the hybrid version. > Well, unless my methodology i

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Merlin Moncure
On Mon, Sep 19, 2011 at 10:19 AM, Robert Klemme wrote: > On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure wrote: >> On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller wrote: >>> Merlin and Jeff, >>> >>> General remark again:It's hard for me to imagine that btree is >>> superior for all the issues men

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Jeff Janes
On Mon, Sep 19, 2011 at 8:19 AM, Robert Klemme wrote: > On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure wrote: > >> The other way to go of course is to try and fix up the existing hash >> index code -- add wal logging, etc. In theory, a customized hash >> structure should be able to beat btree al

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Claudio Freire
On Mon, Sep 19, 2011 at 12:54 PM, Vitalii Tymchyshyn wrote: > 19.09.11 18:19, Robert Klemme написав(ла): >> >> I still haven't seen a solution to locking when a hash table needs >> resizing.  All hashing algorithms I can think of at the moment would >> require a lock on the whole beast during the

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Vitalii Tymchyshyn
19.09.11 18:19, Robert Klemme написав(ла): I still haven't seen a solution to locking when a hash table needs resizing. All hashing algorithms I can think of at the moment would require a lock on the whole beast during the resize which makes this type of index impractical for certain loads (hea

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Tom Lane
Robert Klemme writes: > I still haven't seen a solution to locking when a hash table needs > resizing. All hashing algorithms I can think of at the moment would > require a lock on the whole beast during the resize which makes this > type of index impractical for certain loads (heavy updating).

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Vitalii Tymchyshyn
19.09.11 18:19, Robert Klemme написав(ла): On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure wrote: Postgres's hash index implementation used to be pretty horrible -- it stored the pre-hashed datum in the index which, while making it easier to do certain things, made it horribly slow, and, for

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Robert Klemme
On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure wrote: > On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller wrote: >> Merlin and Jeff, >> >> General remark again:It's hard for me to imagine that btree is >> superior for all the issues mentioned before. I still believe in hash >> index for primary key

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Merlin Moncure
On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller wrote: > Merlin and Jeff, > > General remark again:It's hard for me to imagine that btree is > superior for all the issues mentioned before. I still believe in hash > index for primary keys and certain unique constraints where you need > equality sear

[PERFORM] Re: PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Ivan Voras
On 17/09/2011 22:01, Stefan Keller wrote: > 2011/9/17 Tomas Vondra wrote: > (...) >> We've been asked by a local university for PostgreSQL-related topics of >> theses and seminary works > > I'm also interested in such proposals or ideas! > > Here's some list of topics: > * Adding WAL-support to

Re: [PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Vitalii Tymchyshyn
Hello. I did read and AFAIR sometimes responded on this long discussions. The main point for me is that many DBAs dont want to have even more random plans with postgresql knowing what's in memory now and using this information directly in runtime. I also think this point is valid. What I would

[PERFORM] Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Thomas Kellerer
Robert Klemme, 19.09.2011 13:13: On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller wrote: I'm simply referring to literature (like the intro Ramakrishnan& Gehrke). I just know that Oracle an Mysql actually do have them too and use it without those current implementation specific restrictions in P

Re: [PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Cédric Villemain
2011/9/19 Vitalii Tymchyshyn : > 17.09.11 23:01, Stefan Keller написав(ла): >> >> * more... ? > > What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms this > would be part of shared buffers dedicated to a relation or a set of > relations. When you have a big DB (not fitting in mem

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Robert Klemme
On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller wrote: > I'm simply referring to literature (like the intro Ramakrishnan & Gehrke). > I just know that Oracle an Mysql actually do have them too and use it > without those current implementation specific restrictions in > Postgres. Where exactly do y

[PERFORM] Re: PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Thomas Kellerer
Stefan Keller, 17.09.2011 22:01: I'm also interested in such proposals or ideas! Here's some list of topics: * Time in PostgreSQL * Fast Bulk Data Inserting in PostgreSQL with Unlogged tables I don't understand these two items. Postgres does have a time data type and it has unlogged tables si

Re: [PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Vitalii Tymchyshyn
17.09.11 23:01, Stefan Keller написав(ла): * more... ? What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms this would be part of shared buffers dedicated to a relation or a set of relations. When you have a big DB (not fitting in memory) you also usually want some small tabl