Re: [PERFORM] Using index for bitwise operations?

2009-06-02 Thread Matthew Wakeling

On Mon, 1 Jun 2009, Shaul Dar wrote:

Our typical query selects 300 random rows (which could be located in
different blocks) from the table based on another column+index, and then
filters them down to ~50 based on this bit field.


So it seems that you're already using an index to fetch 300 rows from a 
big table, and then filtering that down to ~50 based on the über-complex 
stuff.


That's the right way to do it. There isn't really an appropriate place to 
add another index into this query plan. Filtering 300 rows is peanuts for 
Postgres.


You quite probably won't get any benefit from having a bitwise index, 
unless you can make a multi-column index with the existing indexed column 
first and the bitwise expression as a second column. However, that sounds 
like more effort than benefit.
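
For illustration only (the table and column names below are hypothetical),
such a composite index might look like this sketch, with the caveat that a
plain btree can only use the second column for exact matches, not for
arbitrary bitwise tests:

   -- hypothetical schema: items(category_id int, props bit(16))
   CREATE INDEX items_cat_props_idx ON items (category_id, props);

   -- usable only when the query pins both columns exactly:
   SELECT * FROM items
   WHERE category_id = 42
     AND props = B'0000100000000000';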


If I have my analysis wrong, perhaps you could post your EXPLAIN ANALYSE 
results so we can see what you mean.
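
Something along these lines (made-up names again) would show where the
time actually goes:

   EXPLAIN ANALYSE
   SELECT * FROM items
   WHERE category_id = 42
     AND (props & B'0000100000000000') = B'0000100000000000';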


Matthew

--
What goes up must come down. Ask any system administrator.


Re: [PERFORM] Using index for bitwise operations?

2009-06-01 Thread Richard Huxton

Shaul Dar wrote:

Hi,

I have a column that is a bit array of 16, each bit specifying if a certain
property, out of 16, is present or not. Our typical query selects 300
random rows (which could be located in different blocks) from the table
based on another column+index, and then filters them down to ~50 based on
this bit field.

[snip]
 Without an index this might be overly expensive,
 even as a filter (on the selected 300 rows).

Have you _tried_ just not having an index at all? Since you are only 
accessing a relatively small number of rows to start with, even an 
infinitely efficient index isn't going to make that much difference. 
Combine that with the fact that the indexes will be competing with the 
table for cache space, and I'd see how much difference it makes just not 
having them.
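
One low-risk way to test that (the names below are made up) is to drop the
index inside a transaction and roll it back afterwards, since DDL is
transactional in Postgres. Note that DROP INDEX takes an exclusive lock on
the table until the rollback, so don't try this on a busy production box:

   BEGIN;
   DROP INDEX items_props_idx;   -- hypothetical index on the bit column
   EXPLAIN ANALYSE
   SELECT * FROM items
   WHERE category_id = 42
     AND (props & B'0001000000000000') = B'0001000000000000';
   ROLLBACK;                     -- the index comes back untouched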


Failing that, perhaps have an index on a single bit if there is one you 
always/mostly check against.
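
A partial index is one way to get that; a sketch, assuming a hypothetical
items(category_id int, props bit(16)) table where bit 0 is the one usually
tested:

   -- only indexes the rows where bit 0 is set
   CREATE INDEX items_prop0_idx ON items (category_id)
   WHERE (props & B'1000000000000000') = B'1000000000000000';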


The relational way to do this would be one or more property tables 
joined to your main table. If the majority of your properties are not 
set then this could be faster too.
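
A sketch of that layout (all names illustrative, and assuming the main
table has an integer primary key):

   CREATE TABLE item_properties (
       item_id     int NOT NULL REFERENCES items (id),
       property_id int NOT NULL,  -- 0..15
       PRIMARY KEY (item_id, property_id)
   );

   -- rows in category 42 that have property 3 set:
   SELECT i.*
   FROM items i
   WHERE i.category_id = 42
     AND EXISTS (SELECT 1 FROM item_properties p
                 WHERE p.item_id = i.id AND p.property_id = 3);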


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Using index for bitwise operations?

2009-06-01 Thread Tom Lane
Shaul Dar <shaul...@gmail.com> writes:
 I have a column that is a bit array of 16, each bit specifying if a certain
 property, out of 16, is present or not. Our typical query selects 300
 random rows (which could be located in different blocks) from the table
 based on another column+index, and then filters them down to ~50 based on
 this bit field. Currently we have 16 separate indexes built on each bit,
 and on our 25M-row table each index takes about 880MB, for a total of 14GB!

Ouch.  One possibility is to replace the bitarray with an integer array
(k is in the int[] array iff bit k was set in the bitarray) and then use
the GIST or GIN indexing capabilities of contrib/intarray.  I also seem
to recall having seen a module that provides GIST indexing for bitops
on plain integers --- have you looked on pgfoundry?
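
A sketch of the intarray route, assuming the set bit positions are
migrated into a hypothetical int[] column called props_arr:

   -- requires contrib/intarray
   CREATE INDEX items_props_gin ON items USING gin (props_arr gin__int_ops);

   -- rows with properties 3 and 7 both set:
   SELECT * FROM items WHERE props_arr @> ARRAY[3, 7];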

This isn't necessarily better than what you're doing, as btree indexes
are a lot better optimized than GIST/GIN.  But it would be worth looking
into.

regards, tom lane
