Re: [PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Vitalii Tymchyshyn

17.09.11 23:01, Stefan Keller wrote:

* more... ?
What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms 
this would be part of shared buffers dedicated to a relation or a set of 
relations. When you have a big DB (not fitting in memory) you also 
usually want some small tables/indexes to be in memory, no matter what 
other load the DB has.

Complementary features are:
1) Relation preloading at startup - ensure these relations are in memory.
2) Per buffer pool (or relation) page costs - tell it that these 
indexes/tables ARE in memory
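
A side note on feature 1: later PostgreSQL releases (9.4 and up) ship the 
pg_prewarm extension, which covers the preloading half of this wish, though 
not dedicated, pinned buffer pools. A minimal sketch, assuming a small hot 
lookup table named dict:

create extension if not exists pg_prewarm;
-- the default 'buffer' mode loads the relation into PostgreSQL's shared
-- buffers; 'prefetch' or 'read' would instead populate only the OS cache
select pg_prewarm('dict');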


Best regards, Vitalii Tymchyshyn.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Re: PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Thomas Kellerer

Stefan Keller, 17.09.2011 22:01:

I'm also interested in such proposals or ideas!

Here's a list of topics:
* Time in PostgreSQL
* Fast Bulk Data Inserting in PostgreSQL with Unlogged tables


I don't understand these two items. Postgres does have a time data type, and it 
has had unlogged tables since 9.1.

Regards
Thomas


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Robert Klemme
On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller sfkel...@gmail.com wrote:
 I'm simply referring to literature (like the intro by Ramakrishnan & Gehrke).
 I just know that Oracle and MySQL actually do have them too and use them
 without the current implementation-specific restrictions found in
 Postgres.

Where exactly did you get the idea that Oracle has hash indexes?  I
can't seem to find them:
http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/indexiot.htm#sthref293

Are you mixing this up with hash partitioning?
http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/schemaob.htm#sthref443

Or am I missing something?

 IMHO, by design hash indexes (e.g. linear hashing) work best when:
 1. only equality (=) tests are used (on whole values)
 2. columns (key values) have very high cardinality

 And ideally, but not necessarily, when index values do not change and
 the number of rows is known ahead of time (avoiding the O(N) worst case - but
 there are approaches to chaining with dynamic resizing).

 I just collected this to encourage ourselves that enhancing hash
 indexes could be worthwhile.

There's still the locking issue Jeff mentioned.  At least every time a
table resize occurs the whole index must be locked.  Or is there a
more fine-grained locking strategy that I am overlooking?

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Cédric Villemain
2011/9/19 Vitalii Tymchyshyn tiv...@gmail.com:
 17.09.11 23:01, Stefan Keller wrote:

 * more... ?

 What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms this
 would be part of shared buffers dedicated to a relation or a set of
 relations. When you have a big DB (not fitting in memory) you also usually
 want some small tables/indexes to be in memory, no matter what other load the
 DB has.
 Complementary features are:
 1) Relation preloading at startup - ensure these relations are in memory.

you can use the pgfincore extension to achieve that, for the OS cache. It
does not look interesting to do that for PostgreSQL's shared_buffers
(the subject has been discussed and can be discussed again; please
check the mailing list archive first).
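
For illustration, a sketch of the pgfincore route (the relation name is made
up; pgfincore reports how much of a relation sits in the OS page cache, and
pgfadvise_willneed asks the OS to pull it in):

-- how much of the relation is currently in the OS page cache?
select * from pgfincore('my_small_table');
-- ask the OS to load it
select * from pgfadvise_willneed('my_small_table');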

 2) Per buffer pool (or relation) page costs - tell it that these
 indexes/tables ARE in memory

you can use tablespace parameters (*_cost) for that; doing it per table has
been rejected in the past.
I did propose something to start work in this direction.
See [WIP] cache estimates, cache access cost on the pgsql-hackers
mailing list.

This proposal lets the planner be informed of table memory usage and
take that into account.
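
A sketch of the tablespace-parameter route mentioned above (the tablespace
name is made up, the values illustrative; supported since 9.0):

-- tell the planner that pages in this tablespace are cheap to fetch
alter tablespace fast_space set (seq_page_cost = 0.1, random_page_cost = 0.1);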



 Best regards, Vitalii Tymchyshyn.

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance




-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Development, Expertise and Training

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Thomas Kellerer

Robert Klemme, 19.09.2011 13:13:

On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller sfkel...@gmail.com wrote:

I'm simply referring to literature (like the intro by Ramakrishnan & Gehrke).
I just know that Oracle and MySQL actually do have them too and use them
without the current implementation-specific restrictions found in
Postgres.


Where exactly did you get the idea that Oracle has hash indexes?  I
can't seem to find them:
http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/indexiot.htm#sthref293

Are you mixing this up with hash partitioning?
http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/schemaob.htm#sthref443

Or am I missing something?


Maybe he was referring to a hash cluster:
http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/statements_5001.htm

This is a storage option where you can store related rows (e.g. in a 
parent/child relationship) in the same physical database block based on a hash 
value. That enables the database to read parent and child rows with just a 
single I/O.

In the background Oracle probably has something like a hash index to support 
that.

Thomas


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Vitalii Tymchyshyn

Hello.

I did read and, AFAIR, sometimes responded in these long discussions. The 
main point for me is that many DBAs don't want even more unstable 
plans, with PostgreSQL knowing what's in memory right now and using this 
information directly at runtime. I also think this point is valid.
What I would like is to force some relations to stay in memory by 
giving them a fixed part of shared buffers, and to tell PostgreSQL they are 
in memory (lowering their page costs) so as to get fixed, optimal plans.


Best regards, Vitalii Tymchyshyn.

19.09.11 14:57, Cédric Villemain wrote:

2011/9/19 Vitalii Tymchyshyn tiv...@gmail.com:

17.09.11 23:01, Stefan Keller wrote:

* more... ?

What I miss from my DB2 UDB days are buffer pools. In PostgreSQL terms this
would be part of shared buffers dedicated to a relation or a set of
relations. When you have a big DB (not fitting in memory) you also usually
want some small tables/indexes to be in memory, no matter what other load the
DB has.
Complementary features are:
1) Relation preloading at startup - ensure these relations are in memory.

you can use the pgfincore extension to achieve that, for the OS cache. It
does not look interesting to do that for PostgreSQL's shared_buffers
(the subject has been discussed and can be discussed again; please
check the mailing list archive first).


2) Per buffer pool (or relation) page costs - tell it that these
indexes/tables ARE in memory

you can use tablespace parameters (*_cost) for that; doing it per table has
been rejected in the past.
I did propose something to start work in this direction.
See [WIP] cache estimates, cache access cost on the pgsql-hackers
mailing list.

This proposal lets the planner be informed of table memory usage and
take that into account.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Re: PostgreSQL-related topics of theses and seminary works sought (Was: Hash index use presently(?) discouraged...)

2011-09-19 Thread Ivan Voras
On 17/09/2011 22:01, Stefan Keller wrote:
 2011/9/17 Tomas Vondra t...@fuzzy.cz wrote:
 (...)
 We've been asked by a local university for PostgreSQL-related topics of
 theses and seminary works
 
 I'm also interested in such proposals or ideas!
 
 Here's a list of topics:
 * Adding WAL-support to hash indexes in PostgreSQL (see ex-topic)
 * Time in PostgreSQL
 * Storing (Weather) Sensor Data in PostgreSQL
 * Fast Bulk Data Inserting in PostgreSQL with Unlogged tables (incl.
 adding GiST support)
 * Performance Tuning of a Read-Only PostgreSQL Database
 * Materialized Views in PostgreSQL: Experiments around Jonathan
 Gardner's Proposal
 * more... ?

 * Covering indexes
 * Controllable record compression
 * Memory tables



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Merlin Moncure
On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller sfkel...@gmail.com wrote:
 Merlin and Jeff,

 General remark again: It's hard for me to imagine that btree is
 superior for all the issues mentioned before. I still believe in hash
 index for primary keys and certain unique constraints where you need
 equality search and don't need ordering or range search.

It is -- but please understand I'm talking about int32 tree vs hash.
Hashing as a technique of course is absolutely going to cream btree
for all kinds of data because of the advantages of working with
decomposed data -- we are all taught that in comp-sci 101 :-).  The
debate here is not about the advantages of hashing, but the specific
implementation of the hash index used.

Postgres's hash index implementation used to be pretty horrible -- it
stored the pre-hashed datum in the index which, while making it easier
to do certain things, made it horribly slow and, for all intents and
purposes, useless.  Somewhat recently, a lot of work was put in to fix
that -- the index now packs only the hash code, which made it
competitive with btree and superior for larger keys.  However, certain
technical limitations like lack of WAL logging and uniqueness hold
hash indexing back from being used like it really should be.  In cases
where I really *do* need hash indexing, I do it in userland.

create table foo
(
  a_long_field text
);
create index on foo(hash(a_long_field));

select * from foo where hash(a_long_field) = hash(some_value) and
a_long_field = some_value;

This technique works fine -- the main disadvantage is that enforcing
uniqueness is a PITA, but since the standard hash index doesn't support it
either it's no great loss.  I also have the option of getting
'uniqueness' and being able to skip the equality operation if I
sacrifice some performance and choose a strong digest.  Until the hash
index issues are worked out, I submit that this remains the go-to
method to do this.
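
For illustration, a sketch of that strong-digest variant (hash() above is a
placeholder -- the built-in hashtext or md5 functions can play its role, and
treating md5 as collision-free is the assumption being made here):

create unique index foo_md5_idx on foo (md5(a_long_field));

-- uniqueness is enforced by the index; if md5 collisions are ignored,
-- the recheck against a_long_field can also be skipped:
select * from foo where md5(a_long_field) = md5('some value');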

Now, my point here is that I've noticed that even with the latest
optimizations btree seems to still be superior to the hash indexing by
most metrics, so that:
create table foo
(
  an_int_field int,
  a_long_field text
);

create index on foo(an_int_field);
create index on foo using hash(a_long_field);

On performance grounds alone, the btree index seems to be (from my
very limited testing) a better bet.  So I'm conjecturing that the
current hash implementation should be replaced with a formalization of
the userland technique shown above -- when you request a hash index,
the database will silently hash your field and weave it into a btree.
It's a hybrid: a hashed btree.  To truly demonstrate whether the technique
is effective though, it would have to be coded up -- it's only fair
to compare if the btree-based hash is also double-checking the value
in the heap, which the standard hash index must do.

The other way to go of course is to try and fix up the existing hash
index code -- add WAL logging, etc. In theory, a customized hash
structure should be able to beat btree all day long, which argues for
continuing in this direction.

@ jeff:
The way you created the table, I think the rows are basically going to be
in order in the table, which means the btree index accesses are going to
visit the same block over and over again before going to the next block.

This does not explain the behavior.  Yeah -- it may take longer, but
your computer should not be sitting idle during create index
operations :-).  Unfortunately, I was not able to reproduce it on
Linux.  I have to bite the bullet and get mingw up if I want to
try and diagnose -- perhaps it is stalling in the semop calls.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Robert Klemme
On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller sfkel...@gmail.com wrote:
 Merlin and Jeff,

 General remark again: It's hard for me to imagine that btree is
 superior for all the issues mentioned before. I still believe in hash
 index for primary keys and certain unique constraints where you need
 equality search and don't need ordering or range search.

 It is -- but please understand I'm talking about int32 tree vs hash.
 Hashing as a technique of course is absolutely going to cream btree
 for all kinds of data because of the advantages of working with
 decomposed data -- we are all taught that in comp-sci 101 :-).  The
 debate here is not about the advantages of hashing, but the specific
 implementation of the hash index used.

 Postgres's hash index implementation used to be pretty horrible -- it
 stored the pre-hashed datum in the index which, while making it easier
 to do certain things, made it horribly slow and, for all intents and
 purposes, useless.  Somewhat recently, a lot of work was put in to fix
 that -- the index now packs only the hash code, which made it
 competitive with btree and superior for larger keys.  However, certain
 technical limitations like lack of WAL logging and uniqueness hold
 hash indexing back from being used like it really should be.  In cases
 where I really *do* need hash indexing, I do it in userland.

 create table foo
 (
  a_long_field text
 );
 create index on foo(hash(a_long_field));

 select * from foo where hash(a_long_field) = hash(some_value) and
 a_long_field = some_value;

 This technique works fine -- the main disadvantage is that enforcing
 uniqueness is a PITA, but since the standard hash index doesn't support it
 either it's no great loss.  I also have the option of getting
 'uniqueness' and being able to skip the equality operation if I
 sacrifice some performance and choose a strong digest.  Until the hash
 index issues are worked out, I submit that this remains the go-to
 method to do this.

Is this approach (storing the hash code in a btree) really faster than
a regular btree index on a_long_field?  And if so, for which kind of
data and load?

 Now, my point here is that I've noticed that even with the latest
 optimizations btree seems to still be superior to the hash indexing by
 most metrics, so that:
 create table foo
 (
  an_int_field int,
  a_long_field text
 );

 create index on foo(an_int_field);
 create index on foo using hash(a_long_field);

 On performance grounds alone, the btree index seems to be (from my
 very limited testing) a better bet.  So I'm conjecturing that the
 current hash implementation should be replaced with a formalization of
 the userland technique shown above -- when you request a hash index,
 the database will silently hash your field and weave it into a btree.
 It's a hybrid: a hashed btree.

I'd rather call it a btreefied hash because you are storing a hash,
but in a btree structure. :-) But that's a detail.  What I find
worrying is that there is then a certain level of obscuring of the real
nature, since create index ... using hash is not exactly creating a
hash table.

 To truly demonstrate whether the technique
 is effective though, it would have to be coded up -- it's only fair
 to compare if the btree-based hash is also double-checking the value
 in the heap, which the standard hash index must do.

Right.

 The other way to go of course is to try and fix up the existing hash
 index code -- add WAL logging, etc. In theory, a customized hash
 structure should be able to beat btree all day long, which argues for
 continuing in this direction.

I still haven't seen a solution to locking when a hash table needs
resizing.  All hashing algorithms I can think of at the moment would
require a lock on the whole beast during the resize which makes this
type of index impractical for certain loads (heavy updating).

One solution would be to apply partitioning to the hash table itself
(e.g. have four partitions for the two least significant bits or 16
for the four least significant bits) and treat them independently.  How
that would interact with WAL I have no idea though.
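
For illustration, this partitioning idea can be loosely mimicked in userland
with partial indexes - a hypothetical sketch using the built-in (undocumented)
hashtext function and the two least significant bits, so that each piece can
be rebuilt independently:

create index foo_h0 on foo (hashtext(b)) where hashtext(b) & 3 = 0;
create index foo_h1 on foo (hashtext(b)) where hashtext(b) & 3 = 1;
create index foo_h2 on foo (hashtext(b)) where hashtext(b) & 3 = 2;
create index foo_h3 on foo (hashtext(b)) where hashtext(b) & 3 = 3;

-- a query must repeat the partition predicate so the planner can
-- prove which partial index applies:
select * from foo
where hashtext(b) = hashtext('some value')
  and hashtext(b) & 3 = hashtext('some value') & 3
  and b = 'some value';

How a native implementation would coordinate this with WAL is, as noted,
the open question.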

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Vitalii Tymchyshyn

19.09.11 18:19, Robert Klemme wrote:

On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure mmonc...@gmail.com wrote:


Postgres's hash index implementation used to be pretty horrible -- it
stored the pre-hashed datum in the index which, while making it easier
to do certain things, made it horribly slow and, for all intents and
purposes, useless.  Somewhat recently, a lot of work was put in to fix
that -- the index now packs only the hash code, which made it
competitive with btree and superior for larger keys.  However, certain
technical limitations like lack of WAL logging and uniqueness hold
hash indexing back from being used like it really should be.  In cases
where I really *do* need hash indexing, I do it in userland.

create table foo
(
  a_long_field text
);
create index on foo(hash(a_long_field));

select * from foo where hash(a_long_field) = hash(some_value) and
a_long_field = some_value;

This technique works fine -- the main disadvantage is that enforcing
uniqueness is a PITA, but since the standard hash index doesn't support it
either it's no great loss.  I also have the option of getting
'uniqueness' and being able to skip the equality operation if I
sacrifice some performance and choose a strong digest.  Until the hash
index issues are worked out, I submit that this remains the go-to
method to do this.

Is this approach (storing the hash code in a btree) really faster than
a regular btree index on a_long_field?  And if so, for which kind of
data and load?


Actually, sometimes the field is [potentially] so long that you can't use a 
regular b-tree because it won't fit in the page. Say it is of type text: 
if you create a regular index, you actually limit the column value 
size to a few KB. I am using md5(text) indexes in this case, coupled with 
rather ugly queries (see above). Native support would be nice.
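
A sketch of that md5 workaround (table and column names are made up; the
final equality check guards against the rare md5 collision):

create index docs_body_md5_idx on docs (md5(body));

-- $1 is the long text value being looked up
select * from docs
where md5(body) = md5($1)
  and body = $1;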


Best regards, Vitalii Tymchyshyn.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Tom Lane
Robert Klemme shortcut...@googlemail.com writes:
 I still haven't seen a solution to locking when a hash table needs
 resizing.  All hashing algorithms I can think of at the moment would
 require a lock on the whole beast during the resize which makes this
 type of index impractical for certain loads (heavy updating).

That seems rather drastically overstated.  The existing hash index code
only needs to hold an index-scope lock for a short interval while it
updates the bucket mapping information after a bucket split.  All other
locks are per-bucket or per-page.  The conflicting share-lockers of the
index-wide lock also only need to hold it for a short time, not for
their whole indexscans.  So that doesn't seem to me to be materially
worse than the locking situation for a btree, where we also sometimes
need exclusive lock on the btree root page, thus blocking incoming
indexscans for a short time.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Vitalii Tymchyshyn

19.09.11 18:19, Robert Klemme wrote:


I still haven't seen a solution to locking when a hash table needs
resizing.  All hashing algorithms I can think of at the moment would
require a lock on the whole beast during the resize which makes this
type of index impractical for certain loads (heavy updating).
Sorry for the second reply, I should not have started writing until I'd 
read all of your post. Anyway:
Do you need a read lock? I'd say readers could use the old copy of the hash 
table up until the moment the new, bigger copy is ready. This will simply 
look like the update has not started yet, which AFAIK is OK for MVCC.

Yep, all the writers will wait.

Another option could be to start a background build of the larger hash - for 
some time your performance will be degraded, since you are writing to two 
indexes instead of one while the second one is rebuilding, but I'd say a low 
latency solution is possible here.


One more: I don't actually see why you can't have a rolling expansion of 
the hash table. I will try to describe it; correct me if I am wrong:
1) The algorithm I am talking about takes n bits from the hash code to 
form the hash table. So, during expansion it will double the number of buckets.
2) Say we are going from 2^n = n1 to 2^(n+1) = n2 = n1 * 2 buckets. 
Each new pair of buckets will take data from a single source bucket, 
depending on the value of the new hash bit used. E.g. if n were 2, we had 
4 buckets and the new table will have 8 buckets. Everything from old bucket 
#1 (binary 01) will go into new buckets #2 and #3 (binary 010 and 011), 
depending on the hash value.
3) So, we can have a counter of the number of buckets processed. Any 
operation on a lower-numbered bucket will go to the new set. Any 
operation on a higher-numbered bucket will go to the old set. Any 
operation on the bucket currently being converted will block until the 
conversion is done.


P.S. Sorry for a lot of possibly dumb thoughts; I don't know why I've 
got such a thought stream on this topic :)


Best regards, Vitalii Tymchyshyn.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Claudio Freire
On Mon, Sep 19, 2011 at 12:54 PM, Vitalii Tymchyshyn tiv...@gmail.com wrote:
 19.09.11 18:19, Robert Klemme wrote:

 I still haven't seen a solution to locking when a hash table needs
 resizing.  All hashing algorithms I can think of at the moment would
 require a lock on the whole beast during the resize which makes this
 type of index impractical for certain loads (heavy updating).

 Sorry for the second reply, I should not have started writing until I'd read
 all of your post. Anyway:
 Do you need a read lock? I'd say readers could use the old copy of the hash
 table up until the moment the new, bigger copy is ready. This will simply look
 like the update has not started yet, which AFAIK is OK for MVCC.
 Yep, all the writers will wait.

All this would get solved if there were no automatic hash index resizing.

DBAs would have to recreate (possibly concurrently) the hash to make it bigger.

Still, hash has lots of issues. I'm not sure how the hash is
implemented in PG, but usually, for low collision rates, pseudorandom
walks are used to traverse collision chains. But pseudorandom
collision chains mean random I/O, which is awful for a DB. Those
techniques were not designed to work with secondary memory.

So, they would have to be adapted to work with secondary memory,
and that means a lot of R&D. It's not impossible, it's just a lot of
work.

I subscribe to the idea that, *in the meanwhile*, without scrapping
the hash index and in parallel to improving it, an option for
transparently-hashed btrees would be valuable.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Jeff Janes
On Mon, Sep 19, 2011 at 8:19 AM, Robert Klemme
shortcut...@googlemail.com wrote:
 On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure mmonc...@gmail.com wrote:

 The other way to go of course is to try and fix up the existing hash
 index code -- add WAL logging, etc. In theory, a customized hash
 structure should be able to beat btree all day long, which argues for
 continuing in this direction.

 I still haven't seen a solution to locking when a hash table needs
 resizing.  All hashing algorithms I can think of at the moment would
 require a lock on the whole beast during the resize which makes this
 type of index impractical for certain loads (heavy updating).

The current implementation doesn't EX lock the whole beast during
resizing, except a brief one at the beginning (and maybe at the end?)
of the split.  It does EX lock the entire bucket being split for the
duration of the split, though.  The main problem that I see is that
due to the potential for deadlocks, the locks involved have to be
heavy-weight.  Which means the shared locks which non-splitting
processes have to use to block against those EX locks have to be
heavy-weight too, and getting even shared heavy-weight locks means
exclusive light-weight locks, which means contention.

One way out would be to have a special process (probably vacuum) do
all the resizing/splitting, rather than having regular backends doing
it.  It should be possible to make this dead-lock free.

Another would be to make hash indexes only support bit-map scans.
Then the scanning hash-code would never have a need to pass control
back to the executor while still holding a bucket lock.

Another would be to make the current position of the hash index scan
be refindable if a split should occur during the scan, so that you
don't need to inhibit splits during a scan of the same bucket.  This
would probably be easy if there were no overflow pages.   But the
overflow pages get shuffled in with each other and with the main
bucket page during a split.  It would take quite some gymnastics to
get around that.



Cheers,

Jeff

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Merlin Moncure
On Mon, Sep 19, 2011 at 10:19 AM, Robert Klemme
shortcut...@googlemail.com wrote:
 On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller sfkel...@gmail.com wrote:
 Merlin and Jeff,

  General remark again: It's hard for me to imagine that btree is
 superior for all the issues mentioned before. I still believe in hash
 index for primary keys and certain unique constraints where you need
 equality search and don't need ordering or range search.

 It is -- but please understand I'm talking about int32 tree vs hash.
 Hashing as a technique of course is absolutely going to cream btree
 for all kinds of data because of the advantages of working with
 decomposed data -- we are all taught that in comp-sci 101 :-).  The
 debate here is not about the advantages of hashing, but the specific
 implementation of the hash index used.

 Postgres's hash index implementation used to be pretty horrible -- it
 stored the pre-hashed datum in the index which, while making it easier
 to do certain things, made it horribly slow and, for all intents and
 purposes, useless.  Somewhat recently, a lot of work was put in to fix
 that -- the index now packs only the hash code, which made it
 competitive with btree and superior for larger keys.  However, certain
 technical limitations like lack of WAL logging and uniqueness hold
 hash indexing back from being used like it really should be.  In cases
 where I really *do* need hash indexing, I do it in userland.

 create table foo
 (
  a_long_field text
 );
 create index on foo(hash(a_long_field));

 select * from foo where hash(a_long_field) = hash(some_value) and
 a_long_field = some_value;

 This technique works fine -- the main disadvantage is that enforcing
 uniqueness is a PITA, but since the standard hash index doesn't support it
 either it's no great loss.  I also have the option of getting
 'uniqueness' and being able to skip the equality operation if I
 sacrifice some performance and choose a strong digest.  Until the hash
 index issues are worked out, I submit that this remains the go-to
 method to do this.

 Is this approach (storing the hash code in a btree) really faster than
 a regular btree index on a_long_field?  And if so, for which kind of
 data and load?

 Now, my point here is that I've noticed that even with the latest
 optimizations btree seems to still be superior to the hash indexing by
 most metrics, so that:
 create table foo
 (
  an_int_field int,
  a_long_field text
 );

 create index on foo(an_int_field);
 create index on foo using hash(a_long_field);

 On performance grounds alone, the btree index seems to be (from my
 very limited testing) a better bet.  So I'm conjecturing that the
 current hash implementation should be replaced with a formalization of
 the userland technique shown above -- when you request a hash index,
 the database will silently hash your field and weave it into a btree.
 It's a hybrid: a hashed btree.

 I'd rather call it a btreefied hash because you are storing a hash,
 but in a btree structure. :-) But that's a detail.  What I find
 worrying is that there is then a certain level of obscuring of the real
 nature, since create index ... using hash is not exactly creating a
 hash table.

 To truly demonstrate whether the technique
 is effective though, it would have to be coded up -- it's only fair
 to compare if the btree-based hash is also double-checking the value
 in the heap, which the standard hash index must do.

 Right.

So, I was curious, and decided to do some more performance testing. I
created a table like this:

create table foo as select r, r::text || 'acbdefghijklmnop' as b from
generate_series(1,1000) r;
create index on foo(r);
create index on foo using hash(b);

To simulate the btree/hash hybrid, I cut a pgbench file like so (btree.sql):
\setrandom x 1 10
select * from foo where r = :x and b = :x::text || 'acbdefghijklmnop'

and this for the standard hash (hash.sql):
\setrandom x 1 10
select * from foo where b = :x::text || 'acbdefghijklmnop'

pgbench -n -c2 -T 10 -f hash.sql etc

On my test machine, the hybrid hash ekes out a slight win on in-cache tests:
merlin@mmoncure-ubuntu:~$ pgbench -n -c2 -T 100 -f btree.sql -p 5490
tps = 3250.793656 (excluding connections establishing)

vs
merlin@mmoncure-ubuntu:~$ pgbench -n -c2 -T 100 -f hash.sql -p 5490
tps = 3081.730400 (excluding connections establishing)


To make the test I/O bound, I changed the setrandom from 10 to
1000; this produced some unexpected results. The hash index is
pulling about double the tps (~80 vs. ~40) over the hybrid version.
Well, unless my methodology is wrong, it's unfair to claim btree is
beating hash in 'all cases'. Hm.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Claudio Freire
On Mon, Sep 19, 2011 at 3:43 PM, Merlin Moncure mmonc...@gmail.com wrote:
 To make the test I/O bound, I changed the setrandom from 10 to
 1000; this produced some unexpected results. The hash index is
 pulling about double the tps (~80 vs. ~40) over the hybrid version.
 Well, unless my methodology is wrong, it's unfair to claim btree is
 beating hash in 'all cases'. Hm.

Is this only selects?
Hash performs badly with updates, IIRC.
I haven't tried in a long while, though.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Igor Chudov
Let's say that I want to do INSERT SELECT of 1,000 items into a table. The
table has some ints, varchars, TEXT and BLOB fields.

Would the time that it takes differ a great deal, depending on whether the
table has only 100,000 or 5,000,000 records?

Thanks

i


Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Scott Marlowe
On Mon, Sep 19, 2011 at 4:11 PM, Igor Chudov ichu...@gmail.com wrote:
 Let's say that I want to do INSERT SELECT of 1,000 items into a table. The
 table has some ints, varchars, TEXT and BLOB fields.
 Would the time that it takes differ a great deal, depending on whether the
 table has only 100,000 or 5,000,000 records?

Depends.  Got any indexes?  The more indexes you have to update the
slower it's gonna be.  You can test this, it's easy to create test
rows like so:

insert into test select generate_series(1,1);

etc.  There's lots of examples floating around on how to do that.  So,
make yourself a couple of tables and test it.
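
A self-contained version of that experiment might look like this (table
shapes and row counts are made up for illustration; \timing is psql's timer):

create table small_t (id int, payload text);
create table large_t (like small_t);

-- pre-populate only the large table
insert into large_t select g, 'x' from generate_series(1, 5000000) g;

\timing on
-- compare the same batch insert against each table
insert into small_t select g, 'x' from generate_series(1, 100000) g;
insert into large_t select g, 'x' from generate_series(1, 100000) g;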

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Stephen Frost
Igor,

* Igor Chudov (ichu...@gmail.com) wrote:
 Would the time that it takes differ a great deal, depending on whether the
 table has only 100,000 or 5,000,000 records?

Yes, because PostgreSQL is going to copy the data.  If you don't need or
want it to be copied, just use a view.  I've never heard of any
relational database implementing 'copy on write' type semantics, if
that's what you're asking about.  Databases, unlike applications with
code in memory that's constantly copied, are typically focused around
minimizing duplication of data (since it all has to end up on disk at
some point).  Not much point in having the overhead of COW for that kind
of environment, I wouldn't think.

Thanks,

Stephen




Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Igor Chudov
On Mon, Sep 19, 2011 at 6:32 PM, Scott Marlowe scott.marl...@gmail.comwrote:

 On Mon, Sep 19, 2011 at 4:11 PM, Igor Chudov ichu...@gmail.com wrote:
  Let's say that I want to do INSERT SELECT of 1,000 items into a table.
 The
  table has some ints, varchars, TEXT and BLOB fields.
  Would the time that it takes differ a great deal, depending on whether
 the
  table has only 100,000 or 5,000,000 records?

 Depends.  Got any indexes?  The more indexes you have to update the
 slower it's gonna be.  You can test this, it's easy to create test
 rows like so:

 insert into test select generate_series(1,1);

 etc.  There's lots of examples floating around on how to do that.  So,
 make yourself a couple of tables and test it.


Well, my question is, rather, whether a bulk INSERT of N records into a
large table would take substantially longer than a bulk insert of N records
into a small table. In other words, does the populating time grow as the
table gets more and more rows?

i


Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Stephen Frost
* Igor Chudov (ichu...@gmail.com) wrote:
 Well, my question is, rather, whether a bulk INSERT of N records into a
 large table would take substantially longer than a bulk insert of N records
 into a small table. In other words, does the populating time grow as the
 table gets more and more rows?

Oh, in that regard, the answer would generally be 'no'.  PostgreSQL
maintains a structure known as the 'free space map', where it keeps track of
where there is 'free space' to insert data into a table.  As someone
else mentioned, if there are a lot of indexes then it's possible that the
increased depth in the index due to the larger number of tuples might
mean the larger table is slower, but I don't think it'd make a huge
difference, to be honest...

Are you seeing that behavior?  There's nothing like testing it to see
exactly what happens, of course..

Thanks,

Stephen




Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Jon Nelson
On Mon, Sep 19, 2011 at 7:53 PM, Stephen Frost sfr...@snowman.net wrote:
 Igor,

 * Igor Chudov (ichu...@gmail.com) wrote:
 Would the time that it takes differ a great deal, depending on whether the
 table has only 100,000 or 5,000,000 records?

 Yes, because PostgreSQL is going to copy the data.  If you don't need or
 want it to be copied, just use a view.  I've never heard of any
 relational database implementing 'copy on write' type semantics, if
 that's what you're asking about.  Databases, unlike applications with
 code in memory that's constantly copied, are typically focused around
 minimizing duplication of data (since it all has to end up on disk at
 some point).  Not much point in having the overhead of COW for that kind
 of environment, I wouldn't think.

Isn't the WAL basically COW?

-- 
Jon

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Stephen Frost
* Jon Nelson (jnelson+pg...@jamponi.net) wrote:
 Isn't the WAL basically COW?

eh..?  No..  The WAL is used to record what changes are made to the
various files in the database; it certainly isn't any kind of
copy-on-write system, where we wait until a change is made to data
before copying it..

If you INSERT .. SELECT, you're going to get the real data in the WAL,
and also in the heap of the new table..

Thanks,

Stephen




[PERFORM] where is max_fsm_pages in PG9.0?

2011-09-19 Thread Anibal David Acosta
I have a lot of wasted bytes in some tables.

Somewhere I read that maybe auto-vacuum can't release space due to a low
max_fsm_pages setting.

I want to increase it, but I can't find the param in postgresql.conf.

Does this param exist? If not, how can I deal with bloated tables?

I have many tables with lots of inserts/updates/deletes (approx. 5,000,000
per day).

Thanks!



Re: [PERFORM] where is max_fsm_pages in PG9.0?

2011-09-19 Thread Scott Marlowe
On Mon, Sep 19, 2011 at 10:28 PM, Anibal David Acosta a...@devshock.com wrote:
 I have a lot of wasted bytes in some tables.

 Somewhere I read that maybe auto-vacuum can’t release space due to a low
 max_fsm_pages setting.

It's no longer there, as the FSM was moved from memory (which was limited
by max_fsm_pages) to disk, which is limited by the size of your disks.

Most likely you aren't vacuuming aggressively enough.  Lower
autovacuum_vacuum_cost_delay and raise autovacuum_vacuum_cost_limit,
and possibly raise the max number of autovacuum worker processes (I forget
the setting's name).
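
A sketch of that direction using per-table storage parameters (values are
illustrative, not recommendations; the global equivalents live in
postgresql.conf, and the worker count is governed by autovacuum_max_workers):

-- make autovacuum more aggressive on the busiest table
alter table busy_table set (
  autovacuum_vacuum_cost_delay = 5,    -- milliseconds; default is 20
  autovacuum_vacuum_cost_limit = 1000  -- default derives from vacuum_cost_limit
);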

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres INSERT performance and scalability

2011-09-19 Thread Craig Ringer

On 09/20/2011 09:21 AM, Jon Nelson wrote:

Isn't the WAL basically COW?


Nope, it's a lot more like a filesystem journal - though it includes all 
data, not just metadata like filesystem journals usually do.


Now, you could argue that PostgreSQL uses a copy-on-write-like system to 
maintain row versions so that already-running statements (or 
SERIALIZABLE transactions) don't see data from the future and to 
maintain rollback data for uncommitted transactions. It's the *new* data 
that gets written to the WAL, though, not the old data.


(OK, if you have full_page_writes enabled you might get a mix of old and 
new data written to WAL, but that's an implementation detail).


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance