[PERFORM] How to create an index for type timestamp column using rtree?

2004-07-13 Thread Chris Cheston
Hi all,

I'm storing some timestamps as integers (UTC) in a table and I want to
query by >= and <= for times within a certain period.  The table has
gotten quite large and is now very slow to query.

I think it's time to create an index for the timestamp column.

I tried using an rtree (for >= and <= optimization):

create INDEX logs_timestamp ON logs using rtree (timestamp);

but I get:

ERROR:  data type integer has no default operator class for access
method "rtree"
HINT:  You must specify an operator class for the index or define a
default operator class for the data type

Do I have to create an rtree type for my timestamp integer column?  

The existing operator classes are listed below.

Pls help.

Thanks,
Chris

server=> select am.amname as acc_method, opc.opcname as ops_name from
pg_am am, pg_opclass opc where opc.opcamid = am.oid order by
acc_method, ops_name;
 acc_method |     ops_name
------------+------------------
 btree  | abstime_ops
 btree  | bit_ops
 btree  | bool_ops
 btree  | bpchar_ops
 btree  | bytea_ops
 btree  | char_ops
 btree  | cidr_ops
 btree  | date_ops
 btree  | float4_ops
 btree  | float8_ops
 btree  | inet_ops
 btree  | int2_ops
 btree  | int4_ops
 btree  | int8_ops
 btree  | interval_ops
 btree  | macaddr_ops
 btree  | name_ops
 btree  | numeric_ops
 btree  | oid_ops
 btree  | oidvector_ops
 btree  | text_ops
 btree  | time_ops
 btree  | timestamp_ops
 btree  | timestamptz_ops
 btree  | timetz_ops
 btree  | varbit_ops
 btree  | varchar_ops
 hash   | bpchar_ops
 hash   | char_ops
 hash   | cidr_ops
 hash   | date_ops
 hash   | float4_ops
 hash   | float8_ops
 hash   | inet_ops
 hash   | int2_ops
 hash   | int4_ops
 hash   | int8_ops
 hash   | interval_ops
 hash   | macaddr_ops
 hash   | name_ops
 hash   | oid_ops
 hash   | oidvector_ops
 hash   | text_ops
 hash   | time_ops
 hash   | timestamp_ops
 hash   | timestamptz_ops
 hash   | timetz_ops
 hash   | varchar_ops
 rtree  | bigbox_ops
 rtree  | box_ops
 rtree  | poly_ops
(51 rows)

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] How to create an index for type timestamp column using rtree?

2004-07-13 Thread Tom Lane
Chris Cheston [EMAIL PROTECTED] writes:
> I'm storing some timestamps as integers (UTC) in a table and I want to
> query by >= and <= for times within a certain period.

btree can handle range queries nicely; why do you think you need an
rtree?  rtree is for 2-dimensional datums which a timestamp is not ...
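
In concrete terms, a plain btree index on the logs.timestamp column from earlier
in the thread supports such range scans directly; a minimal sketch (the numeric
bounds are placeholder epoch values, not from the original post):

    create INDEX logs_timestamp ON logs using btree (timestamp);

    -- a range query this index can serve (bounds are illustrative only)
    select * from logs where timestamp >= 1089676800 and timestamp <= 1089763200;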

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] How to create an index for type timestamp column using rtree?

2004-07-13 Thread Christopher Kings-Lynne
> I'm storing some timestamps as integers (UTC) in a table and I want to
> query by >= and <= for times within a certain period.  The table has
> gotten quite large and is now very slow to query.
>
> I think it's time to create an index for the timestamp column.

Uh, yeah.

> I tried using an rtree (for >= and <= optimization):

Bad idea.

> Do I have to create an rtree type for my timestamp integer column?

Why do you want an rtree index?  They're for multidimensional polygonal
data and stuff.  Just create a normal index...
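
For instance, a normal index (btree is the default access method) plus an EXPLAIN
to confirm the planner picks it up; a sketch only, with placeholder bounds:

    create INDEX logs_timestamp ON logs (timestamp);

    -- verify that the range predicate now uses an index scan rather than a seq scan
    explain select * from logs where timestamp >= 1089676800 and timestamp <= 1089763200;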

Chris
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] How to create an index for type timestamp column using rtree?

2004-07-13 Thread Chris Cheston
Thanks, Chris and Tom.
I had read *incorrectly* that rtrees are better for >= and <= comparisons.

Chris

On Tue, 13 Jul 2004 14:33:48 +0800, Christopher Kings-Lynne
[EMAIL PROTECTED] wrote:
> > I'm storing some timestamps as integers (UTC) in a table and I want to
> > query by >= and <= for times within a certain period.  The table has
> > gotten quite large and is now very slow to query.
> >
> > I think it's time to create an index for the timestamp column.
>
> Uh, yeah.
>
> > I tried using an rtree (for >= and <= optimization):
>
> Bad idea.
>
> > Do I have to create an rtree type for my timestamp integer column?
>
> Why do you want an rtree index?  They're for multidimensional polygonal
> data and stuff.  Just create a normal index...
>
> Chris


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Insert are going slower ...

2004-07-13 Thread Hervé Piedvache
Hi,

I have a database with 10 tables holding about 50,000,000 records ...
Every day I have to delete about 20,000 records and insert about the same
number into one of these tables.

Then I run some aggregations over the other tables to get weekly results and
global results per user. That means about 180,000 to 300,000 inserts per
table each day.

The calculation itself takes about 2 to 4 minutes ... I run the query into a
temporary table ... then I do an
insert into my_table select * from temp_table.
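
A rough sketch of that pattern (raw_events, user_id and event_time are made-up
names used only to illustrate the flow):

    -- run the expensive aggregation into a temporary table first
    create temp table temp_table as
        select user_id, event_time::date as event_day, count(*) as nb_events
        from raw_events
        group by user_id, event_time::date;

    -- then load the result into the permanent table in a single statement
    insert into my_table select * from temp_table;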

And that's the problem: the INSERT takes (depending on the table) from 41
minutes up to 2 hours ... just for 180,000 to 300,000 INSERTs ...

The tables have indexes that are really useful ... so please do not tell me to
delete some of them ... and I can't drop them before inserting the data ... it
takes far too long to regenerate them ...

I'm configured with no flush (fsync disabled), I have 8 GB of RAM, and RAID 5
with 7200 RPM SCSI hard drives ... I'm running Debian Linux, with PostgreSQL
7.4.3 compiled by myself.

What can I do to get better results? (configuration options, and/or a hardware
upgrade?)
What more information can I give you to help me?

Regards,
-- 
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Swapping in 7.4.3

2004-07-13 Thread Jim Ewert

When I went to 7.4.3 (Slackware 9.1) w/ JDBC, the improvements are that it doesn't
initially take much memory (I have 512M) and didn't swap. I ran a full vacuum and a
cluster before installation; however, speed degraded to 1 *second* per update of one
row in 150 rows of data, within a day! pg_autovacuum now gives excellent performance,
however it is taking 66M of swap; only 270k cached.





---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Working on huge RAM based datasets

2004-07-13 Thread Andy Ballingall
Oops - sorry - I confused my numbers. The Opteron machine I had in mind *only*
takes up to 64GB of RAM (e.g. HP DL585) - here's the data page:

http://h18004.www1.hp.com/products/servers/proliantdl585/index.html

Still - with *just* 64GB of RAM, that would comfortably provide for the type
of scenario I envisage. Is that still enough for your app?

The 256GB number came from something I read saying that the current crop of
64 bit chips will allow up to 256GB of RAM in principle, so it is just a
matter of time before the memory limit shoots up on these simple products.

If you are prepared to pay a bit more, already there are some big memory
options on linux:

E.g. you can have up to 192GB in an SGI Altix 350:

http://www.sgi.com/servers/altix/downloads/altix350_at_a_glance.pdf

Or up to 4 terabytes in its bigger brother, the Altix 3000 - but that's
getting a bit esoteric.

http://www.sgi.com/servers/altix/

(This won lots of awards recently)

The nice thing about the two things above is that they run linux in a single
address space NUMA setup, and in theory you can just bolt on more CPUs and
more RAM as your needs grow.

Thanks,
Andy




- Original Message - 
From: J. Andrew Rogers [EMAIL PROTECTED]
To: Andy Ballingall [EMAIL PROTECTED]
Sent: Friday, July 09, 2004 10:40 PM
Subject: Re: [PERFORM] Working on huge RAM based datasets


> On Fri, 2004-07-09 at 02:28, Andy Ballingall wrote:
> > After all, we're now seeing the first wave of 'reasonably priced' 64 bit
> > servers supported by a proper 64 bit OS (e.g. linux). HP are selling a 4
> > Opteron server which can take 256GB of RAM, and that starts at $1
> > (ok - they don't give you that much RAM for that price - not yet, anyway!)
>
> Which server is this?!  They are selling an Opteron system that can hold
> 256 GB of RAM?
>
> I looked on their site, and couldn't find anything like that.  I run
> some MASSIVE memory codes that don't need a lot of CPU, and if such a
> box existed, I'd be very interested.
>
> cheers,
>
> j. andrew rogers




---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Inverted-list databases (was: Working on huge RAM based datasets)

2004-07-13 Thread Mischa Sandberg
Andy Ballingall [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]

> On another thread (not in this mailing list), someone mentioned that there
> are a class of databases which, rather than caching bits of database file
> (be it in the OS buffer cache or the postmaster workspace), construct a
> well-indexed memory representation of the entire data in the postmaster
> workspace (or its equivalent), and this, remaining persistent, allows the DB
> to service backend queries far quicker than if the postmaster was working
> with the assumption that most of the data was on disk (even if, in practice,
> large amounts or perhaps even all of it resides in OS cache).

As a historical note, System R (grandaddy of all relational dbs) worked this
way.
And it worked under ridiculous memory constraints by modern standards.

Space-conscious MOLAP databases do this, FWIW.

Sybase 11 bitmap indexes pretty much amount to this, too.

I've built a SQL engine that used bitmap indexes within B-Tree indexes,
making it practical to index every field of every table (the purpose of the
engine).

You can also build special-purpose in-memory representations to test for
existence (of a key) when you expect a lot of failures. Google "superimposed
coding", e.g. http://www.dbcsoftware.com/dbcnews/NOV94.TXT



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Swapping in 7.4.3

2004-07-13 Thread Matthew T. O'Connor
Jim Ewert wrote:

> When I went to 7.4.3 (Slackware 9.1) w/ JDBC, the improvements are that it doesn't
> initially take much memory (I have 512M) and didn't swap. I ran a full vacuum and a
> cluster before installation; however, speed degraded to 1 *second* per update of one
> row in 150 rows of data, within a day! pg_autovacuum now gives excellent performance,
> however it is taking 66M of swap; only 270k cached.

Are you saying that your system stays fast now that you are using
pg_autovacuum, but pg_autovacuum is using 66M of memory?  Please
clarify, I'm not sure what question you want answered.

Matthew
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Insert are going slower ...

2004-07-13 Thread Hervé Piedvache
Josh,

On Tuesday 13 July 2004 at 19:10, Josh Berkus wrote:

> > What can I do to get better results? (configuration options, and/or a
> > hardware upgrade?)
> > What more information can I give you to help me?

> 1) What PostgreSQL version are you using?

v7.4.3

> 2) What's your VACUUM, ANALYZE, VACUUM FULL, REINDEX schedule?

VACUUM FULL VERBOSE ANALYZE;

Every day after the calculation I was talking about ...

> 3) Can you list the non-default settings in your PostgreSQL.conf?
> Particularly, shared_buffers, sort_mem, checkpoint_segments,
> effective_cache_size, and max_fsm_pages?

shared_buffers = 48828
sort_mem = 512000
vacuum_mem = 409600
max_fsm_pages = 5000
max_fsm_relations = 2000
max_files_per_process = 2000
wal_buffers = 1000
checkpoint_segments = 3
effective_cache_size = 500
random_page_cost = 3
default_statistics_target = 20
join_collapse_limit = 10
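
For reference, the values the running server is actually using can be
double-checked from psql with SHOW, e.g.:

    show shared_buffers;
    show sort_mem;
    show checkpoint_segments;
    show effective_cache_size;
    show max_fsm_pages;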

Regards,
-- 
Hervé Piedvache


---(end of broadcast)---
TIP 8: explain analyze is your friend