[SQL] Question on partitioning
Hello , All. I am not sure if this is the right mailing list to place this question. If it doesn't, please kindly redirect me to the right list. I have a giant table with about 10,6 million records. Queries on it are usually slow, and if I try to do something more elaborate like an INNER JOIN with itself it becomes unnacceptably slow. I am looking for a way to improve performance. One of the columns is of type date. Each "day" includes about a few tens of thousands records And the older a date is the less likely I am to do queries on it. The objective of the "self join" is to compare data from two different days, looking for diferences. Ive read that one of the benefits of partitioning is to speed up queries by separating less used records. My question is if partitioning can be a good way to make the queries faster (specially the self joins) or if it isn't worth trying because it doesn't help on my particular situation. Please kindly advice me on this Many thanks in advance for your kind help Best, Oliveiros
Re: [SQL] Question on partitioning
On Thu, Aug 21, 2008 at 8:25 AM, Oliveiros Cristina <[EMAIL PROTECTED]> wrote: > Hello , All. > > I am not sure if this is the right mailing list to place this question. > If it doesn't, please kindly redirect me to the right list. > > I have a giant table with about 10,6 million records. 10.6 million rows isn't really gigantic. It's a good size, but unless you're hitting a lot of the records at once, you should be able to get pretty good performance if it's indexed properly. > Queries on it are usually slow, and if I try to do something more elaborate > like an INNER JOIN with itself it becomes unnacceptably slow. > > I am looking for a way to improve performance. > One of the columns is of type date. Each "day" includes about a few tens of > thousands records > And the older a date is the less likely I am to do queries on it. Well, first let's look at your indexes and if they're being used. Got an explain analyze output of your slow queries for us to look at? -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Question on partitioning
On Thu, 2008-08-21 at 15:25 +0100, Oliveiros Cristina wrote: > Hello , All. > > I am not sure if this is the right mailing list to place this > question. > If it doesn't, please kindly redirect me to the right list. > > I have a giant table with about 10,6 million records. > > Queries on it are usually slow, and if I try to do something more > elaborate like an INNER JOIN with itself it becomes unnacceptably > slow. > > I am looking for a way to improve performance. > One of the columns is of type date. Each "day" includes about a few > tens of thousands records > And the older a date is the less likely I am to do queries on it. > > The objective of the "self join" is to compare data from two different > days, looking for diferences. > > Ive read that one of the benefits of partitioning is to speed up > queries by separating less used records. > > My question is if partitioning can be a good way to make the queries > faster (specially the self joins) or if it isn't worth trying because > it doesn't help on my particular situation. > > Please kindly advice me on this > > Many thanks in advance for your kind help > > Best, > Oliveiros I would expect partitioning to work. I've heard tell that fine grained partitioning coupled with check constraints can even eliminate the need for certain indexes. I do know that check constraints on the date will help you tremendously if you decide to partition. -Mark -- Sent via pgsql-sql mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Re: Efficiently determining the number of bits set in the contents of, a VARBIT field
Thank you TJ and everyone else for the advise and the c code. Today I
did finally return to the 'number of bits set challenge' and managed to
compile and link the nbits c function which went smoothly. However the
function does crash my postgres server installation (8.3.3) with a
segmentation fault each time I call it for example SELECT
nbits_set(B'1101');
My C skills are very sparse and am unable to debug the function, I have
included the C code of this function. Is there something I may have left
out?
#include "postgres.h"
#include "utils/varbit.h"
#include "fmgr.h"
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
PG_FUNCTION_INFO_V1(nbits_set);
Datum
nbits_set(PG_FUNCTION_ARGS)
{
/* how many bits are set in a bitstring? */
VarBit *a = PG_GETARG_VARBIT_P(0);
int n=0;
int i;
unsigned char *ap = VARBITS(a);
unsigned char aval;
for (i=0; i < VARBITBYTES(a); ++i) {
aval = *ap; ++ap;
if (aval == 0) continue;
if (aval & 1) ++n;
if (aval & 2) ++n;
if (aval & 4) ++n;
if (aval & 8) ++n;
if (aval & 16) ++n;
if (aval & 32) ++n;
if (aval & 64) ++n;
if (aval & 128) ++n;
}
PG_RETURN_INT32(n);
}
Allan
Bruce Momjian wrote:
Jean-David Beyer wrote:
TJ O'Donnell wrote:
I use a c function, nbits_set that will do what you need.
I've posted the code in this email.
TJ O'Donnell
http://www.gnova.com
#include "postgres.h"
#include "utils/varbit.h"
Datum nbits_set(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(nbits_set);
Datum
nbits_set(PG_FUNCTION_ARGS)
{
/* how many bits are set in a bitstring? */
VarBit *a = PG_GETARG_VARBIT_P(0);
int n=0;
int i;
unsigned char *ap = VARBITS(a);
unsigned char aval;
for (i=0; i < VARBITBYTES(a); ++i) {
aval = *ap; ++ap;
if (aval == 0) continue;
if (aval & 1) ++n;
if (aval & 2) ++n;
if (aval & 4) ++n;
if (aval & 8) ++n;
if (aval & 16) ++n;
if (aval & 32) ++n;
if (aval & 64) ++n;
if (aval & 128) ++n;
}
PG_RETURN_INT32(n);
}
Hi all,
Am looking for a fast and efficient way to count the number of bits set
(to 1) in a VARBIT field. I am currently using
"LENGTH(REGEXP_REPLACE(CAST(a.somefield_bit_code AS TEXT),'0','','g'))".
Allan.
When I had to do that, in days with smaller amounts of RAM, but very long
bit-vectors, I used a faster function sort-of like this:
static char table[256] = {
0,1,1,2,1,2,2,3,1,.
};
Then like above, but instead of the loop,
n+= table[aval];
You get the idea.
Uh, I was kind of confused by this, even when I saw a full
implementation:
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable
Actually, this looks even better:
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
