Re: [GENERAL] pgsql partitioning

2009-11-04 Thread Ow Mun Heng


-Original Message-
From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Vick Khera

> Here are two (not necessarily mutually exclusive) options for you:

>1) use triggers/rules on the master table and do all your inserts
>directed to it and have the trigger/rule do the right thing
>2) write your insert/update/delete statements to use the correct
>inherited table directly, and write a trigger on the master that
>denies any inserts.

#2 would have better performance than #1.

Use #1 only if you have no choice.
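
As a rough sketch of option #2 (all names hypothetical, just to illustrate the
idea of a deny-insert trigger on the master plus direct inserts into the child):

CREATE TABLE master (id int, created date);
CREATE TABLE master_y2009 (
    CHECK (created >= '2009-01-01' AND created < '2010-01-01')
) INHERITS (master);

CREATE OR REPLACE FUNCTION master_deny_insert() RETURNS trigger AS $$
BEGIN
    -- refuse anything aimed at the parent; rows must go to a child directly
    RAISE EXCEPTION 'insert into the proper child table, not into master';
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER master_no_insert
    BEFORE INSERT ON master
    FOR EACH ROW EXECUTE PROCEDURE master_deny_insert();

-- the application then targets the child table itself
INSERT INTO master_y2009 VALUES (1, '2009-11-04');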


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-21 Thread Ow Mun Heng


-Original Message-
From: Greg Smith [mailto:gsm...@gregsmith.com] 
On Wed, 21 Oct 2009, Scott Marlowe wrote:

>> Actually, later models of linux have a direct RAID-10 level built in.
>> I haven't used it.  Not sure how it would look in /proc/mdstat either.

>I think I actively block memory of that because the UI on it is so cryptic 
>and it's been historically much more buggy than the simpler RAID0/RAID1 
>implementations.  But you're right that it's completely possible Ow used 
>it.  Would explain not being able to figure out what's going on too.

You're right, newer Linux kernels support RAID10 directly by default and no
longer need the funky RAID1-then-RAID0 combination.

>There's a good example of what the result looks like with failed drives in 
>one of the many bug reports related to that feature at 
>https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/285156 and I 
>liked the discussion of some of the details here at 
>http://robbat2.livejournal.com/231207.html

I actually stumbled onto that (the 2nd link) and tried some of the methods,
but I think it's somewhat outdated by now.

> The other hint I forgot to mention is that you should try:

> mdadm --examine /dev/XXX

> For each of the drives that still works, to help figure out where they fit

> into the larger array.  That and --detail are what I find myself using 
> instead of /proc/mdstat , which provides an awful interface IMHO.

That's one of the problems; I'm not exactly sure.

sda1 = 1
sdb1 = 2
sdc1 = 3
sdd1 = 4

If they follow that sequence, and I'm losing sda1 and sdd1, then I should
theoretically be able to recover the array, but I'm not having much luck.

FYI, I've left the box as-is for now and have yet to hook it back up, so I
can't post the output of /proc/mdstat and --examine just yet.

But I will once I boot it up.



--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] PANIC : right sibling's left-link doesn't match

2009-10-20 Thread Ow Mun Heng
[resend w/ plain text only - Sorry]

right sibling's left-link doesn't match: block 121425 links to 124561
instead of expected 121828 in index
Oct 20 22:21:29 hmweb5 postgres[8795]: [3-2]  "d_trh_trr_water_eval_pkey"


WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because
WARNING:  terminating connection because of crash of another server process

What does the above mean?

This is causing the DB to misbehave.

Thanks.

Is this an index error? Should I drop and re-create the index?
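
If this does turn out to be btree corruption in that index, I'm guessing the
"drop and re-create" would amount to something like the following (assuming
the primary key named in the log belongs to a table of a similar name):

-- rebuild just the named index
REINDEX INDEX d_trh_trr_water_eval_pkey;

-- or rebuild every index on the table it belongs to
-- REINDEX TABLE d_trh_trr_water_eval;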




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-20 Thread Ow Mun Heng
Sorry guys, I know this is very off-topic for this list, but Google hasn't
been much help. This is the RAID array my PG data resides on.

I have a 4-disk RAID10 array running on Linux MD RAID:
sda / sdb / sdc / sdd

One fine day, 2 of the drives (sda and sdd) suddenly decided to die on me.

I've tried multiple methods to try to determine if I can get them back
online.

1) replace sda w/ fresh drive and resync - Failed
2) replace sdd w/ fresh drive and resync - Failed
3) replace sda w/ fresh drive but keeping existing sdd and resync - Failed
4) replace sdd w/ fresh drive but keeping existing sda and resync - Failed


RAID10 is supposed to be able to withstand up to 2 drive failures, provided the
failures are on different sides of the mirror.

Right now, I'm not sure which drive belongs to which mirror pair. How do I
determine that? Does it depend on the order of the drives in the output of
/proc/mdstat?

Thanks


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] hardware information

2009-09-16 Thread Ow Mun Heng
-Original Message-
From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
> "htop" is really nice too.    http://htop.sourceforge.net/
> (disclaimer - I did not write it)

I like atop better 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] trigger and returning the #of rows affected (partitioning)

2009-09-07 Thread Ow Mun Heng



-Original Message-
> From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- 
> Is there any way in which a trigger can return the # of rows affected by 
> the
> insert / delete ?
>
> Master
> ---> slave_1
> ---> slave_2
>
> Trigger is against master which will, based on the conditions re-direct 
> the
> data into the relevant slave_X partitions.
>
> I think this post basically is what I am seeing. 
> http://www.nabble.com/affected-rows-count-td21124000.html
>
> If there any methods in which I can get back the affected # or rows
> affected? Or will I have to use / do another trigger on the slave_X tables
> to do the counts? 


I tried something like this, but it doesn't seem to be working. The UPDATE is
not updating. Does this have anything to do with my returning NULL?

CREATE OR REPLACE FUNCTION test_insert_trigger() RETURNS trigger AS
$BODY$ 
BEGIN

IF NEW.code IN ('PROD') THEN 
  BEGIN 
INSERT INTO test_prod VALUES (NEW.*);
  UPDATE xmms.log_update 
 SET rows_inserted = rows_inserted + 1 
   WHERE job_name = 'mssql_2_pg' 
 AND table_name = 'test' 
 AND from_date = (SELECT last_sync FROM xmms.log_sync WHERE
table_name = 'test' AND db_name = 'test_tbl');

  END;
ELSEIF NEW.code not in ('PROD') THEN 
  BEGIN 
INSERT INTO test_eval VALUES (NEW.*);
  UPDATE xmms.log_update 
 SET rows_inserted = rows_inserted + 1 
   WHERE job_name = 'mssql_2_pg' 
 AND table_name = 'test' 
 AND from_date = (SELECT last_sync FROM xmms.log_sync WHERE
table_name = 'test' AND db_name = 'test_tbl');
  END;
ELSE   
INSERT INTO test_orphan VALUES (NEW.*);
RAISE NOTICE 'ORPHAN Found. Fix the function!';   
  
END IF;
RETURN NULL;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE COST 100;

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] trigger and returning the #of rows affected (partitioning)

2009-09-07 Thread Ow Mun Heng
Is there any way in which a trigger can return the # of rows affected by the
insert / delete ?

Master
→ slave_1
→ slave_2

Trigger is against master which will, based on the conditions re-direct the
data into the relevant slave_X partitions.

I think this post basically describes what I am seeing. 
http://www.nabble.com/affected-rows-count-td21124000.html

Are there any methods by which I can get back the affected # of rows? Or will
I have to put another trigger on the slave_X tables to do the counts?
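
One idea I'm toying with (a rough sketch only, with made-up table/column names)
is to have the redirect trigger bump a counter table itself, using GET
DIAGNOSTICS to pick up the row count of each insert into a slave:

CREATE TABLE insert_counts (table_name text PRIMARY KEY,
                            rows_inserted bigint DEFAULT 0);

CREATE OR REPLACE FUNCTION master_insert_trigger() RETURNS trigger AS $$
DECLARE
    n integer;
BEGIN
    IF NEW.some_col = 'A' THEN
        INSERT INTO slave_1 VALUES (NEW.*);
        GET DIAGNOSTICS n = ROW_COUNT;   -- rows inserted by the statement above
        UPDATE insert_counts SET rows_inserted = rows_inserted + n
         WHERE table_name = 'slave_1';
    ELSE
        INSERT INTO slave_2 VALUES (NEW.*);
        GET DIAGNOSTICS n = ROW_COUNT;
        UPDATE insert_counts SET rows_inserted = rows_inserted + n
         WHERE table_name = 'slave_2';
    END IF;
    RETURN NULL;   -- the row has already gone to a slave; skip the master insert
END;
$$ LANGUAGE plpgsql;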

[Sorry, re-send as plain text. Mail client defaults to HTML]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Truncating table doesn't bring back (ALL?) used space?

2009-09-06 Thread Ow Mun Heng
I've got a largish table which, according to pg_size_pretty, has an on-disk
size of ~22GB for the table and ~12GB for the indexes, approx 55 million rows.

When I truncate the table (I've moved the data somewhere else), I see that
I only gain back ~7GB of filesystem space.

What gives?
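
(For anyone reproducing this, a way to compare the heap size against the total
including indexes and TOAST; the table name here is hypothetical:)

SELECT pg_size_pretty(pg_relation_size('my_big_table'))       AS table_only,
       pg_size_pretty(pg_total_relation_size('my_big_table')) AS with_indexes_and_toast;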



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Connecting to Teradata via Postgresql

2009-08-30 Thread Ow Mun Heng
Hi All,

Does anyone here have a Teradata box? Are you able to connect to it from within
PostgreSQL?

I would like to pull 1 or 2 tables from the box (sync) and was wondering if
there's any way to do that w/o using dbi-link.

I'm actually trying dbi-link, but it seems it doesn't support DBD::Teradata.

Based on the error descriptions and some googling, it seems Teradata.pm does
not support the "column_info" DBI method for getting the column descriptions
and such.

Are there any other methods available?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] dbi-link freezing up DBs, needing reboot

2009-08-30 Thread Ow Mun Heng


-Original Message-
From: Andy Colson [mailto:a...@squeakycode.net] 
Ow Mun Heng wrote:
>> I was playing around with dbi-link, hoping to get it connected to a
>teradata
>> database. However, before I dive into that, I figured that I might as
>well
>> try it out first on a PG Database (on another server)
>> 
>> I did a select on a 30GB table and it froze the Originating database and
>it
>> ALSO froze the foreign database.
>> 
>That looks like it came from dmesg.  Did you look in the postgres log?
>
>"froze" is not a helpful description.  PG spawns off a client for each
>connection, and I doubt one client could freeze another.  So was the one
>connection froze, all PG clients froze, or the entire computer froze?
>
>You said you had to reboot, so I assume the entire computer.  
>
>On the foreign box, have you ever pushed a large amount of data over the
>network?  You might wanna try to copy some really big files a few times and
>see if you get the eth0 timeout error again.
>
>I assume you are using Linux and a new version of PG, right?


Sorry, I don't know how else to describe it because I didn't see much activity
over my ssh connections. Even top refused to work on the foreign box.

Yeah, the foreign box has handled large amounts of data before. I pushed out
over 300G of data while rsyncing the db to another slave.

CentOS 5.2 and PG 8.3.7 on the foreign box, and 8.3.12 on the originating box.

I was told that I shouldn't use the views directly. I believe libpq or
something just tried to push out all 30G of data all at once from the
foreign box to the originating box.

After I switched to the remote_select functions, all is better (for now).

Thanks


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] dbi-link freezing up DBs, needing reboot

2009-08-30 Thread Ow Mun Heng
I was playing around with dbi-link, hoping to get it connected to a Teradata
database. However, before I dive into that, I figured I might as well try it
out first on a PG database (on another server).

dbi-link installed fine.

I did a select on a 30GB table and it froze the Originating database and it
ALSO froze the foreign database.

Looking into the foreign box's logs, I see for some reason the network just
ceased to function? (can't be a coincidence?)

Aug 30 15:15:18 kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 30 15:15:18 kernel: r8169: eth0: link up


I then had to reboot the originating DB/box.

Once it got back up, I did a select on a very small table (<1MB) and the data
was returned properly and in a timely fashion.

Then I tried it on a slightly bigger table (~50MB) and it froze again:

select * from xmms_b4.log_update where record_update_date_time > '2009-08-30
10:00:00' and record_update_date_time < '2009-08-30 11:00:00' limit 10;
NOTICE:  SELECT dbi_link.cache_connection( 2 ) at line 12.

# \d log_update
         Column          |            Type             |       Modifiers
-------------------------+-----------------------------+------------------------
 job_name                | text                        | not null
 table_name              | text                        | not null
 from_date               | timestamp without time zone | not null
 to_date                 | timestamp without time zone |
 rows_deleted            | integer                     |
 delete_duration         | interval                    |
 rows_inserted           | integer                     |
 insert_duration         | interval                    |
 rows_updated            | integer                     |
 update_duration         | interval                    |
 record_update_date_time | timestamp without time zone | not null default now()

After this happening twice, I'm really worried that it will do other
nasty things.


Help?

PS: My ultimate goal with dbi-link is to see whether I can get it to connect to
Teradata to pull some data on a daily basis. I currently use dblink for
PG-to-PG connections.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Is there a function for Converting a Decimal into BINARY ?

2009-08-27 Thread Ow Mun Heng
Hi Guys,

Searching the net didn't give me many clues on how to convert a decimal
number into binary.


Eg:
I have a datatype in the DB which needs to be converted.

DEC = 192
BINARY = 11000000

DEC = 197
BINARY = 11000101

Which I then need to break down into pairs to do calculations on

11 : 00 : 00 : 00
11 : 00 : 01 : 01

Some of the solutions I've seen on the Internet are based on VB and mainly
userland apps; I would like to do the conversion within PG itself.
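
One thing I've been toying with (not sure if it's the proper way, and it
assumes the values always fit in 8 bits) is casting the integer to bit(8) and
slicing the text form into pairs:

SELECT 192::bit(8) AS binary_form;   -- 11000000
SELECT 197::bit(8) AS binary_form;   -- 11000101

SELECT substring(197::bit(8)::text from 1 for 2) AS pair1,
       substring(197::bit(8)::text from 3 for 2) AS pair2,
       substring(197::bit(8)::text from 5 for 2) AS pair3,
       substring(197::bit(8)::text from 7 for 2) AS pair4;
-- 11 | 00 | 01 | 01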

Thanks

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] mail alert

2009-08-13 Thread Ow Mun Heng

-Original Message-
From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Christophe Pettus

>On Aug 11, 2009, at 12:19 AM, Jan Verheyden wrote:
>> I was looking in what way it's possible to alert via mail when some  
>> conditions are true in a database.

>An external script that connects to the database, checks for the  
>condition, and sends the email if it exists is probably your best  
>solution.  You might want to take a look at check_postgres:

>   http://bucardo.org/check_postgres/

I would look forward to having such a feature in Postgres actually. Right
now, I'm using cron to do those checks.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best way to "mask" password in DBLINK

2009-08-12 Thread Ow Mun Heng
-Original Message-
>From: Magnus Hagander [mailto:mag...@hagander.net] 


>No, we're talking about operating system user here, not postgres user.
>So the owner of the database object is irrelevant - only the user that
>the backend process is executing as.

Got it.. Thanks for the tip.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best way to "mask" password in DBLINK

2009-08-12 Thread Ow Mun Heng


-Original Message-
From: Magnus Hagander [mailto:mag...@hagander.net] 
On Wed, Aug 12, 2009 at 09:30, Ow Mun Heng wrote:
>>
>> From: Tommy Gildseth [mailto:tommy.gilds...@usit.uio.no]
>>
>> Ow Mun Heng wrote:
>>>> I'm starting to use DBLink / DBI-Link and one of the "bad" things is
>>>that
>>>> the password is out in the clear.
>>>> What can I do to prevent it from being such? How do I protect it from
>>>> 'innocent' users?
>>
>>>If I'm not mistaken, it's possible to put your password in the .pgpass
>>>file in the postgres-users home folder, on the server where the postgres
>>>cluster is running.
>>
>> Isn't that how one connects using the CLI? Eg: via psql?

>You need to put it in the .pgpass file of the postgres user - the one
>that runs the server. .pgpass is dealt with by libpq, and DBLink and
>DBI-Link both use libpq to connect to the remote server.

The view is owned by the user "operator", not postgres. Does that make a
difference?

My understanding of your words is that it _does_ make a difference, and if I
put it into the .pgpass of the postgres user then all is fine.
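
So, to capture it for the archives: a line in the standard libpq format
(hostname:port:database:username:password) goes into the .pgpass of the OS
user that runs the postmaster (file mode 0600; the port 5432 below is an
assumption, and the other values are copied from my earlier example):

xxx.xxx.xxx.xxx:5432:hmxmms:yyy:zzz

-- the dblink call can then drop the password= part
SELECT aaa
  FROM dblink('dbname=hmxmms host=xxx.xxx.xxx.xxx user=yyy'::text,
              'SELECT * from tablename'::text) b4(aaa xxx );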

Thanks for confirmation


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best way to "mask" password in DBLINK

2009-08-12 Thread Ow Mun Heng


-Original Message-
From: Tommy Gildseth [mailto:tommy.gilds...@usit.uio.no] 

Ow Mun Heng wrote:
>> I'm starting to use DBLink / DBI-Link and one of the "bad" things is that
>> the password is out in the clear. 
>> What can I do to prevent it from being such? How do I protect it from
>> 'innocent' users?

>If I'm not mistaken, it's possible to put your password in the .pgpass 
>file in the postgres-users home folder, on the server where the postgres 
>cluster is running.

Isn't that how one connects using the CLI? Eg: via psql?

My connection string looks like this.

SELECT aaa
   FROM dblink('dbname=hmxmms host=xxx.xxx.xxx.xxx user=yyy
password=zzz'::text, 'SELECT * from tablename'::text) b4(aaa xxx );

I've placed the above in a view, hence the user/password is hardcoded, of
sorts.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Best way to "mask" password in DBLINK

2009-08-11 Thread Ow Mun Heng
I'm starting to use DBLink / DBI-Link, and one of the "bad" things is that
the password is in the clear.

What can I do to prevent that? How do I protect it from 'innocent' users?



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] xlog flush not satisfied

2009-08-10 Thread Ow Mun Heng

-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 

"Ow Mun Heng"  writes:
>> As background, this is a new box mirrored from a separate box via rsync.
>> I've basically copied/rsync the entire postgresql server and data files 
>>over
>> to create a mirror copy. After which, I've tried to do the vacuum and
gotten
>> into the above trouble.

>Did you shut down the old postmaster while mirroring its files?  I could
>believe seeing this type of problem as a consequence of getting
>out-of-sync copies of different parts of the database.

You've caught me. I'm actually planning to bring the master server down during
lunch to re-sync it.

Aside: I wonder how pgcluster does it then, because I thought it uses rsync to
do replication.

Thanks. Will report back in a couple of hours.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] xlog flush not satisfied

2009-08-10 Thread Ow Mun Heng
While doing

# VACUUM VERBOSE ANALYZE d_trr_dfh;

INFO:  vacuuming "xmms.d_trr_dfh"

ERROR:  xlog flush request 21F/9F57DF88 is not satisfied --- flushed only to
21F/924CE76C

CONTEXT:  writing block 2919652 of relation 17461/17462/17668

 

I see this in the logs

 

user= CONTEXT:  writing block 2919680 of relation 17461/17462/17668

user= WARNING:  could not write block 2919680 of 17461/17462/17668

DETAIL:  Multiple failures --- write error might be permanent.

user= LOG:  checkpoint starting: time

xlog flush request 21F/9F67DA80 is not satisfied --- flushed only to
21F/924CE76C

 

Does this mean I have disk issues?

 

As background, this is a new box mirrored from a separate box via rsync.

 

I've basically copied/rsynced the entire PostgreSQL server and data files over
to create a mirror copy. After that, I tried to do the vacuum and ran into the
above trouble.

 

Any ideas from the PG community? (or is rsync not the way to go for this?)

 

 



Re: [GENERAL] Row insertion w/ trigger to another table update causes row insertion to _not_ occur

2009-07-21 Thread Ow Mun Heng


> From: Sim Zacks [mailto:s...@compulab.co.il] 

>-Original Message-
>From: gsst...@gmail.com [mailto:gsst...@gmail.com] On Behalf Of Greg Stark
>  
>On Tue, Jul 21, 2009 at 11:25 PM, Ow Mun Heng wrote:
>   RETURN NULL;
>  
>Just make your trigger return NEW and it won't kill the insert to the
>child table.

>>I agree that is certainly one problem, but in any case an After Update
>>trigger would have worked.

I actually did try that, it didn't work. (I thought I mentioned that in the
original email. Oh well..)

Have yet to try the NEW method though

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Row insertion w/ trigger to another table update causes row insertion to _not_ occur

2009-07-21 Thread Ow Mun Heng
-Original Message-
From: gsst...@gmail.com [mailto:gsst...@gmail.com] On Behalf Of Greg Stark
>> On Tue, Jul 21, 2009 at 11:25 PM, Ow Mun Heng wrote:
>>    RETURN NULL;


> From the docs:
> "It can return NULL to skip the operation for the current row."
> -- http://www.postgresql.org/docs/current/static/trigger-definition.html

> Just make your trigger return NEW and it won't kill the insert to the
> child table.

Dang it.. I just re-read the online docs this morning and I missed it. 
Then again, since I'm new to triggers, I may have mis-interpreted that.

Thanks for it. I'll give it a go.
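
(For the archives, a sketch of what I understand the suggested change to be,
using the update_table_trigger from my original post; the only change is the
final RETURN:)

CREATE OR REPLACE FUNCTION update_table_trigger() RETURNS trigger AS
$BODY$
BEGIN
    INSERT INTO "update"
    SELECT NEW.uniq_id, NEW.start_date_time, NEW.week_id
    WHERE NOT EXISTS (SELECT 1 FROM "update"
                       WHERE uniq_id = NEW.uniq_id
                         AND start_date_time = NEW.start_date_time
                         AND week_id = NEW.week_id);
    RETURN NEW;   -- let the insert into the child table go ahead
END;
$BODY$
LANGUAGE plpgsql VOLATILE;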

 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Row insertion w/ trigger to another table update causes row insertion to _not_ occur

2009-07-21 Thread Ow Mun Heng

I think I'm doing this wrongly. Before I go and re-invent the wheel, I
thought I'd check w/ the list. (I previously got the idea from IRC.)

Table "Master" 
--> Table "Child1"
--> Table "Child2"

--> Table "Child2"
Table "Update" --> Table to update come key items from source table.


The Master table is populated with a huge amount of data every minute and,
based on date ranges, the data is diverted to the different child tables
(classic partitioning).

What I'm trying to do here is consolidate (a subset of) all the data inserted
into the individual child tables into a table called "update".

I tried it via a trigger function which is called "BEFORE INSERT" (I also
tried "AFTER INSERT")

But what happens is that once the insert into the child table is in
progress, the update_table_trigger is fired and the "update" table gets
updated, _but_ the child table is left untouched.


How do I go about this? As of right now, I've turned off the triggers.


CREATE OR REPLACE FUNCTION update_table_trigger()
  RETURNS trigger AS
$BODY$ 
BEGIN
INSERT INTO "update"
SELECT NEW.uniq_id, 
       NEW.start_date_time, 
       NEW.week_id 
WHERE NOT EXISTS (  SELECT 1 FROM "update" WHERE uniq_id=NEW.uniq_id
  AND start_date_time=NEW.start_date_time 
  AND week_id = NEW.week_id
  );
RETURN NULL;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100;
ALTER FUNCTION update_table_trigger() OWNER TO "operator";

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Evil Nested Loops

2009-06-03 Thread Ow Mun Heng
On Wed, 2009-06-03 at 01:28 -0600, Scott Marlowe wrote:
> On Wed, Jun 3, 2009 at 12:32 AM, Ow Mun Heng  wrote:
> > HashAggregate  (cost=8035443.21..8035445.17 rows=157 width=24)
> >  ->  Nested Loop  (cost=37680.95..7890528.72 rows=28982898 width=24) <<<<< 
> > suspect
> >Join Filter: ((a.test_run_start_date_time >= date.start_time) AND 
> > (a.test_run_start_date_time <= date.end_time))
> >->  Bitmap Heap Scan on d_trh_pbert a  (cost=37677.22..1369372.99 
> > rows=1661440 width=24)
> >  Recheck Cond: ((test_run_start_date_time >= '2009-05-08 
> > 07:00:00'::timestamp without time zone) AND (test_run_start_date_time <= 
> > '2009-05-15 06:59:59'::timestamp without time zone))
> >  ->  Bitmap Index Scan on idx_d_trh_pbert_sdate  
> > (cost=0.00..37261.86 rows=1661440 width=0)
> >Index Cond: ((test_run_start_date_time >= '2009-05-08 
> > 07:00:00'::timestamp without time zone) AND (test_run_start_date_time <= 
> > '2009-05-15 06:59:59'::timestamp without time zone))
> >->  Materialize  (cost=3.73..5.30 rows=157 width=24)
> >  ->  Seq Scan on lookup_ww_date2 date  (cost=0.00..3.57 
> > rows=157 width=24)

> OK, looking at your query and the plan, what you're doing is kind of this:
> 
> 157 Rows times 1661440 Rows (cross product) = 260M or so and then you
> filter out the 157 original rows and their matches.  Note that an
> explain ANALYZE might shed more light, but given the high cost in this
> query for the nested loop I'm guessing the only thing you can do is
> throw more work_mem at it.  But it's fundamentally flawed in design I
> think.

The explain analyze ran for >10 mins and then I just aborted it.

WW49 is basically between 5/8 and 5/15; unfortunately, it's not just the base
dates, it's also the time.

eg: 200949|5/8/2009 7:00:00am|5/15/2009 6:59:59AM

The definition of a WW, or a "day", actually runs between,
eg: 5/8 7am and 5/9 6:59:59am.


> If you're always working with dates maybe joining on
> date_trunc('day',test_run_start_date)=date_trunc('day',startdate')
> with an index on both terms will work?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Evil Nested Loops

2009-06-02 Thread Ow Mun Heng
What can I do about this plan?



HashAggregate  (cost=8035443.21..8035445.17 rows=157 width=24)
  ->  Nested Loop  (cost=37680.95..7890528.72 rows=28982898 width=24) < 
suspect
Join Filter: ((a.test_run_start_date_time >= date.start_time) AND 
(a.test_run_start_date_time <= date.end_time))
->  Bitmap Heap Scan on d_trh_pbert a  (cost=37677.22..1369372.99 
rows=1661440 width=24)
  Recheck Cond: ((test_run_start_date_time >= '2009-05-08 
07:00:00'::timestamp without time zone) AND (test_run_start_date_time <= 
'2009-05-15 06:59:59'::timestamp without time zone))
  ->  Bitmap Index Scan on idx_d_trh_pbert_sdate  
(cost=0.00..37261.86 rows=1661440 width=0)
Index Cond: ((test_run_start_date_time >= '2009-05-08 
07:00:00'::timestamp without time zone) AND (test_run_start_date_time <= 
'2009-05-15 06:59:59'::timestamp without time zone))
->  Materialize  (cost=3.73..5.30 rows=157 width=24)
  ->  Seq Scan on lookup_ww_date2 date  (cost=0.00..3.57 rows=157 
width=24)


The query
select week_id,count(serial_number) 
from d_trh a
inner join lookup_ww_date2 date
on ( a.test_run_start_date_time between start_time and end_time)
where a.test_run_start_date_time between '5/8/2009 7:00:00 AM' and '5/15/2009 
6:59:59 AM'
group by 
week_id

the lookup_ww_date2 table looks like this

( week_id bigint
  start_time timestamp
  end_time timestamp
)

eg: 200949|5/8/2009 7:00:00am|5/15/2009 6:59:59AM

The whole aim of this exercise is to look at which WW the particular date falls 
into.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] changing datatype from int to bigint quickly

2009-05-23 Thread Ow Mun Heng
Is there a method to do this without traversing the whole 20GB table?

What about manipulating the pg_attribute table and changing atttypid just
like we can manipulate atttypmod to change from varchar(4) to varchar(8)?

Thanks

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Putting many related fields as an array

2009-05-12 Thread Ow Mun Heng
-Original Message-
From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
On Tue, May 12, 2009 at 01:23:14PM +0800, Ow Mun Heng wrote:
>> | sum of count | sum_of_count_squared | qty | qty < 100 | qty < 500 |
>> 
>> 
>> I'm thinking of lumping them into 1 column via an array instead of into
>> 5 different columns. Not sure how to go about this, hence the email to
>> the list.

>The normal array constructor should work:
>
>  SELECT ARRAY[MIN(v),MAX(v),AVG(v),STDEV(v)]
>  FROM (VALUES (1),(3),(4)) x(v);
>
>Not sure why this is better than using separate columns though.  Maybe a
>new datatype and a custom aggregate would be easier to work with?

The issue here is the # of columns needed to populate the table.

The table I'm summarizing has between 50 and 100+ columns; if the 1:5 growth is
used as a yardstick, the table will get awfully wide very quickly.

I need to know how to do it first, then test accordingly for performance and
corner cases.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Putting many related fields as an array

2009-05-11 Thread Ow Mun Heng
Hi,

I'm currently building some aggregate tables for some data. These data will be
used for slice/dice activity, and we want to be able to play with/manipulate
the data such that I can get means and stddev.

Eg: For each Original Column eg: 

population_in_town : (I get derivatives)
- mean # of ppl in each town
- stddev # of ppl in each town (stdev calc already uses 2 extra columns
for # of ppl squared and qty of ppl)
- count of ppl
- count of # of ppl is < 100 (to get a percentage of population)
- count of # of ppl is < 500

Hence, I'm seeing a 1:5 column growth here if I store them as separate
columns.

eg:
| sum of count | sum_of_count_squared | qty | qty < 100 | qty < 500 |


I'm thinking of lumping them into 1 column via an array instead of into
5 different columns. Not sure how to go about this, hence the email to
the list.

something like {244,455,1234,43,23}

queries can then be done like:

sum_of_count / qty = Ave
sqrt( (qty * sum_of_count_squared - sum_of_count^2) / (qty * (qty-1)) ) = STDEV
(sum_qty<100 / sum_qty) = % < 100
(sum_qty<500 / sum_qty) = % < 500
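
A rough sketch of what I mean (table/column names made up), with the derived
values computed back out of the array by position, assuming the layout
{sum_count, sum_count_sq, qty, qty_lt_100, qty_lt_500}:

CREATE TABLE town_stats (town text, stats numeric[]);

INSERT INTO town_stats VALUES ('some_town', ARRAY[244, 455, 1234, 43, 23]);

SELECT town,
       stats[1] / stats[3]                          AS mean,
       sqrt( (stats[3] * stats[2] - stats[1]^2)
             / (stats[3] * (stats[3] - 1)) )        AS stdev,
       stats[4] / stats[3]                          AS pct_lt_100,
       stats[5] / stats[3]                          AS pct_lt_500
  FROM town_stats;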


Then there's the issue of speed/responsiveness on doing it.


Help would be appreciated in this.







-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_reorg -> Anyone has any experience with it?

2009-04-15 Thread Ow Mun Heng
I was trying to clean up my database and after 2 days of vacuum full on
a 20GB table, I gave up and used pg_reorg which is seriously fast.

However, now I'm not sure if it is the cause of my unstable DB (8.2.13)

I can connect, initiate a simple query, and the DB will go down.


logs says : 
The postmaster has commanded this server process to roll back the current 
transaction and exit, because
another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat 
your command.
all server processes terminated; reinitializing database system was interrupted 
at 2009-04-15 15:36:23 MYT
checkpoint record is at 342/CC8E6148 redo record is at 342/CC8E6148; undo 
record is at 0/0; shutdown TRUE
next transaction ID: 0/460828465; next OID: 158897211 next MultiXactId: 1; next 
MultiXactOffset: 0
database system was not properly shut down; 

automatic recovery in progress redo starts at 342/CC8E6190
FATAL:  the database system is starting up


Something like

select * from table where condition = 'AX'

It doesn't matter which table I query; it will periodically work and then
periodically die at some other time.

I've tried lowering and raising shared memory from 128MB to 200MB (the
original) to 400MB, and it still crashes.


I've no clue how to fix or start to diagnose the issue.

Appreciate any help.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] partial TEXT search on an index

2009-03-30 Thread Ow Mun Heng
I don't think I understand how PG implements full-text searching, or whether my
search actually needs to use full-text search.

Basically, I have a btree index on a SERIAL_NUMBER column with values like
ABC12345 or AAA123434 or AAB131441.

I would like to search on a prefix of the SERIAL_NUMBER,

eg:
select * from table where serial_number LIKE 'AAA%'

but that does not use the index. What do I need to do to have it use the index
for this kind of "partial" match (is that what it's called)?

Thanks

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Parallel DB architechture

2009-03-30 Thread Ow Mun Heng
>>On Behalf Of Asko Oja
>>Hello

>>We use plProxy (RUN ON ALL) to run queries in parallel. 
>>We split our database into 16 shards and distributed it over 4 servers.
>>So now we are running queries on 16 cpu's in parallel :)

Wow... query time improved how many fold? Any idea?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Smartest way to resize a column?

2009-03-04 Thread Ow Mun Heng
On Mon, 2009-01-12 at 14:42 +0800, Phoenix Kiula wrote:
> > ALTER COLUMN TYPE is intended for cases where actual transformation of
> > the data is involved.  Obviously varchar(20) to varchar(35) doesn't
> > really require any per-row effort, but there's no operation in the
> > system that handles that case.  But if you're brave, you can do it
> > via manipulation of the system catalogs.  Observe:
> 
> ...snip
> 
> 
> Thanks Tom. Yes, no data change is required. But I don't think I have
> the understanding to try what you suggested. Running the alter table
> on my home database (almost same data) took an hour and it wasn't
> complete. I had even dropped the index. So I guess altering it that
> way is a pain in PGSQL. Bummer.
> 

Actually, I did the same thing on a live database and it worked!
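
(For the archives, what I did was essentially the catalog tweak Tom described;
roughly this, with a hypothetical table/column. For varchar, the stored
atttypmod is the declared length plus 4:)

-- widen my_table.my_col from varchar(20) to varchar(35) without rewriting rows
UPDATE pg_attribute
   SET atttypmod = 35 + 4
 WHERE attrelid = 'my_table'::regclass
   AND attname  = 'my_col';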

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioning : replicate_partition doesn't seem to be working

2008-10-18 Thread Ow Mun Heng
Hmm... seems like a SNAFU on my part.

I didn't know that I need to execute it using EXECUTE SCRIPT via slonik.

I was doing it via pgadmin & psql.

Reading this :
http://lists.slony.info/pipermail/slony1-general/2008-October/008929.html

and trying it found the issue.

Thanks and sorry for the noise.

PS: I can't count how many times I've banged my head against the wall on these
sorts of things, and as soon as I decide to send an email to the list asking
for help, I solve it. :-)





-Original Message-
From: Ow Mun Heng 
Sent: Saturday, October 18, 2008 4:07 PM
To: 'pgsql-general@postgresql.org'
Subject: partitioning : replicate_partition doesn't seem to be working

Quoted from 
http://www.nabble.com/Group-Sets-For-Replication-w--Partition-d19369646.html

> Basically, what you would need to do, weekly, is to run a slonik 
> "execute script" script where the SQL script consists of something 
> similar to what's in tests/testpartition/gen_ddl_sql.sh: 

> -> create the new table, with indication of inheritace from the parent; 

> -> create a rule to indicate when data gets inserted into the new 
   table, assuming you're using rules to control this; 

> -> run replicate_partition() on this newly-created table. 

I've created the 2 tables w/ inheritance info into both the master and the
slave.

Then I've run the following 

/*select
_hmxmms_my_cluster.replicate_partition(200810,'xmms'::text,'2replicate_test'
::text,NULL::text,'Replicate test partition 2'::text)*/

select
_hmxmms_my_cluster.add_empty_table_to_replication(10,200810,'xmms'::text,'2r
eplicate_test'::text,NULL::text,'Replicate test partition 2'::text) 

This successfully adds the table into the set #10, put it as a tab_id of
200810 in sl_table. However, in sl_table, the tab_altered column/state will
still be marked as false.

In addition to that, the triggers are not even created.

If I do execute this directly 

select
_hmxmms_my_cluster.setaddtable_int(10,200810,'xmms.2replicate_test','2replic
ate_test_pkey','')

Then the tab_altered state/column will be marked as true and the trigger
will appear

select * from pg_catalog.pg_trigger where tgname =
'_hmxmms_my_cluster_logtrigger_200810'


I've tried to debug the scripts but thus far, has not been able to pinpoint
the location in which it is failing.

Thanks



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] partitioning : replicate_partition doesn't seem to be working

2008-10-18 Thread Ow Mun Heng
Quoted from 
http://www.nabble.com/Group-Sets-For-Replication-w--Partition-d19369646.html

> Basically, what you would need to do, weekly, is to run a slonik 
> "execute script" script where the SQL script consists of something 
> similar to what's in tests/testpartition/gen_ddl_sql.sh: 

> -> create the new table, with indication of inheritace from the parent; 

> -> create a rule to indicate when data gets inserted into the new 
   table, assuming you're using rules to control this; 

> -> run replicate_partition() on this newly-created table. 

I've created the 2 tables w/ inheritance info into both the master and the
slave.

Then I've run the following 

/*select
_hmxmms_my_cluster.replicate_partition(200810,'xmms'::text,'2replicate_test'
::text,NULL::text,'Replicate test partition 2'::text)*/

select
_hmxmms_my_cluster.add_empty_table_to_replication(10,200810,'xmms'::text,'2r
eplicate_test'::text,NULL::text,'Replicate test partition 2'::text) 

This successfully adds the table into set #10, putting it into sl_table with a
tab_id of 200810. However, the tab_altered column/state in sl_table is still
marked as false.

In addition to that, the triggers are not even created.

If I do execute this directly 

select
_hmxmms_my_cluster.setaddtable_int(10,200810,'xmms.2replicate_test','2replic
ate_test_pkey','')

Then the tab_altered state/column will be marked as true and the trigger
will appear

select * from pg_catalog.pg_trigger where tgname =
'_hmxmms_my_cluster_logtrigger_200810'


I've tried to debug the scripts but thus far have not been able to pinpoint
where it is failing.

Thanks



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] varchar vs Text & TOAST

2008-09-07 Thread Ow Mun Heng
Hi,

I'm (more than) a bit confused about the difference between the TEXT and
varchar data types.

AFAIK, varchar has a maximum character limit of, if I'm not mistaken, ~65k?
But TEXT is more like a BLOB and is supposed to have no limit?

Anyhow, searching the archives (in my mail client - no internet at the moment),
I see references saying that when I use TEXT, TOAST tables get created, which
keep the values _outside_ of my main data table.

I'm not sure what the implications are. I'm wondering if I should change my
current usage of TEXT to varchar.

the columns will have data like

NULL:NULL:FLE:NULL:FLE:NULL:FLE:FLE

I initially chose TEXT (for various reasons), and now that I've found out about
the TOAST tables (I didn't like that they are created _not_ in my usual RAID
space but on my OS drive), I'm trying to determine whether I need to change the
data type.

Comments? Advice?
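
(In case it's useful, one way I know of to see whether a table even has a
TOAST table is to look it up in pg_class; the table name here is hypothetical:)

SELECT relname, reltoastrelid::regclass AS toast_table
  FROM pg_class
 WHERE relname = 'my_table';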

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] max_stack_depth Exceeded

2008-09-05 Thread Ow Mun Heng


-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Friday, September 05, 2008 11:22 PM
To: Magnus Hagander
Cc: Ow Mun Heng; pgsql-general@postgresql.org
Subject: Re: [GENERAL] max_stack_depth Exceeded 

Magnus Hagander <[EMAIL PROTECTED]> writes:
> Ow Mun Heng wrote:
>> Am I doing something wrong?

> If your trigger is defined on the head_raw_all_test_2 table,  then yes.
> Because it will do a new insert there, and the new insert will fire the
> trigger again, which will do a new insert, which wil lfire the trigger
etc.

>>Of course, the way to have the row be inserted into the parent table is
>>to just let the trigger return it, instead of returning null.

Er... how would I do that? I'm new to this, so the syntax would be appreciated.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] max_stack_depth Exceeded

2008-09-05 Thread Ow Mun Heng
On Fri, 2008-09-05 at 10:35 +0200, Magnus Hagander wrote:
> Ow Mun Heng wrote:
> > Hi,
> > 
> > I'm playing around with triggers to implement partitioning. 
> > I hit something which I don't know what and I don't have internet here
> > at work to find out what is the cause.
> > 
> > 
> > ERROR : stack depth limit exceeded
> > 
> > I see that this is one of the options in postgresql.conf but I don't
> > know exactly what it is.
> 
> Sounds like you may have created a situation with infinite recursion.
> 
> Like in some branch your trigger is inserting back into the parent
> table, thus firing the trigger again in an endless loop, instead of
> inserting it into the proper child table.

This seems simple enough.

CREATE OR REPLACE FUNCTION head_raw_all_test_2_insert_trigger()
  RETURNS "trigger" AS
$BODY$
BEGIN

IF ( NEW.test_run_start_date_time >= '2008-08-18' and
NEW.test_run_start_date_time < '2008-08-19' ) THEN
   INSERT INTO head_raw_all_test_2_prod_8_18 VALUES (NEW.*);
ELSEIF ( NEW.test_run_start_date_time >= '2008-08-19' and
NEW.test_run_start_date_time < '2008-08-20' ) THEN
   INSERT INTO head_raw_all_test_2_prod_8_19 VALUES (NEW.*);
ELSE
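  -- note: head_raw_all_test_2 is (presumably) the parent table this trigger is
  -- attached to, so the INSERT below goes back into the parent and fires this
  -- trigger again, recursing (as described in the reply quoted above)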
  INSERT INTO head_raw_all_test_2 VALUES (NEW.*);
END IF;
RETURN NULL;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;


This one, though, works.

CREATE OR REPLACE FUNCTION head_raw_all_test_insert_trigger()
  RETURNS "trigger" AS
$BODY$
BEGIN

IF ( NEW.dcm_evaluation_code = 'PROD' OR NEW.dcm_evaluation_code is
null) THEN
   INSERT INTO head_raw_all_test_prod VALUES (NEW.*);
ELSEIF ( NEW.dcm_evaluation_code <> 'PROD' ) THEN
   INSERT INTO head_raw_all_test_eval VALUES (NEW.*);
ELSE
  INSERT INTO head_raw_all_test VALUES (NEW.*);
END IF;
RETURN NULL;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION head_raw_all_test_insert_trigger() OWNER TO "operator";


Am I doing something wrong? Is it the ELSE branch that is making it
recurse further and further?


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] max_stack_depth Exceeded

2008-09-05 Thread Ow Mun Heng
Hi,

I'm playing around with triggers to implement partitioning. 
I hit something which I don't know what and I don't have internet here
at work to find out what is the cause.


ERROR : stack depth limit exceeded

I see that this is one of the options in postgresql.conf but I don't
know exactly what it is.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Range Partititioning & Constraint Exclusion Oddities

2008-09-05 Thread Ow Mun Heng
Hi, I'd appreciate it if someone could help shed some light on what I may be
doing wrong.

I know there are caveats on using constraint exclusion to reduce the # of
partitions scanned.

pg:8.2.9

create table test (
code varchar,
dummy_col1 int,
dummy_col2 int
)

create table test_experimental_code (
code varchar,
dummy_col1 int,
dummy_col2 int
) inherits(test)

create table test_prod_code (
code varchar,
dummy_col1 int,
dummy_col2 int
) inherits(test)


alter table test_experimental_code add check (code not in ('P000','000','0'))
alter table test_prod_code add check (code in ('P000','000','0'))

insert into test_prod_code(code, dummy_col1, dummy_col2) values 
('P000',1,1),('000',2,2),('0',3,3),('P000',44,44)
insert into test_experimental_code(code, dummy_col1, dummy_col2) values 
('AAA',1,1),('BBB',2,2),('BBC',3,3),('DAD',44,44)

set constraint_exclusion = on
select count(*) from test [Expected]
"Aggregate  (cost=71.25..71.26 rows=1 width=0)"
"  ->  Append  (cost=0.00..63.00 rows=3300 width=0)"
"->  Seq Scan on test  (cost=0.00..21.00 rows=1100 width=0)"
"->  Seq Scan on test_prod_code test  (cost=0.00..21.00 rows=1100 
width=0)"
"->  Seq Scan on test_experimental_code test  (cost=0.00..21.00 
rows=1100 width=0)"

select count(*) from test where code = 'AAA' <--[NOT expected result]
"Aggregate  (cost=71.30..71.31 rows=1 width=0)"
"  ->  Append  (cost=0.00..71.25 rows=18 width=0)"
"->  Seq Scan on test  (cost=0.00..23.75 rows=6 width=0)"
"  Filter: ((code)::text = 'AAA'::text)"
"->  Seq Scan on test_prod_code test  (cost=0.00..23.75 rows=6 width=0)"
"  Filter: ((code)::text = 'AAA'::text)"
"->  Seq Scan on test_experimental_code test  (cost=0.00..23.75 rows=6 
width=0)"
"  Filter: ((code)::text = 'AAA'::text)"

select count(*) from test where code = 'AAA' and code not in ('P000','000','0') 
<--[I thought this would help]
"Aggregate  (cost=91.92..91.92 rows=1 width=0)"
"  ->  Append  (cost=0.00..91.88 rows=15 width=0)"
"->  Seq Scan on test  (cost=0.00..30.62 rows=5 width=0)"
"  Filter: (((code)::text = 'AAA'::text) AND ((code)::text <> ALL 
(('{P000,000,0}'::character varying[])::text[])))"
"->  Seq Scan on test_prod_code test  (cost=0.00..30.62 rows=5 width=0)"
"  Filter: (((code)::text = 'AAA'::text) AND ((code)::text <> ALL 
(('{P000,000,0}'::character varying[])::text[])))"
"->  Seq Scan on test_experimental_code test  (cost=0.00..30.62 rows=5 
width=0)"
"  Filter: (((code)::text = 'AAA'::text) AND ((code)::text <> ALL 
(('{P000,000,0}'::character varying[])::text[])))"

select count(*) from test where code in ('P000','000','0') <--[NOT Expected 
result]
"Aggregate  (cost=83.75..83.76 rows=1 width=0)"
"  ->  Append  (cost=0.00..83.62 rows=48 width=0)"
"->  Seq Scan on test  (cost=0.00..27.88 rows=16 width=0)"
"  Filter: ((code)::text = ANY (('{P000,000,0}'::character 
varying[])::text[]))"
"->  Seq Scan on test_prod_code test  (cost=0.00..27.88 rows=16 
width=0)"
"  Filter: ((code)::text = ANY (('{P000,000,0}'::character 
varying[])::text[]))"
"->  Seq Scan on test_experimental_code test  (cost=0.00..27.88 rows=16 
width=0)"
"  Filter: ((code)::text = ANY (('{P000,000,0}'::character 
varying[])::text[]))"

ALTER TABLE test_prod_code DROP CONSTRAINT test_prod_code_code_check;
ALTER TABLE test_experimental_code DROP CONSTRAINT 
test_experimental_code_code_check;
alter table test_prod_code add check (code = 'PROD')
alter table test_experimental_code add check (code <> 'PROD')
update test_prod_code set code ='PROD'

select count(*) from test where code = 'AAA' <<-- Expected Result
"Aggregate  (cost=47.53..47.54 rows=1 width=0)"
"  ->  Append  (cost=0.00..47.50 rows=12 width=0)"
"->  Seq Scan on test  (cost=0.00..23.75 rows=6 width=0)"
"  Filter: ((code)::text = 'AAA'::text)"
"->  Seq Scan on test_experimental_code test  (cost=0.00..23.75 rows=6 
width=0)"
"  Filter: ((code)::text = 'AAA'::text)"


select count(*) from test where code::text in ('AAA'::character 
varying,'BBB'::character varying) <<-- Explicit data-type 
"Aggregate  (cost=47.56..47.57 rows=1 width=0)"
"  ->  Append  (cost=0.00..47.50 rows=22 width=0)"
"->  Seq Scan on test  (cost=0.00..23.75 rows=11 width=0)"
"  Filter: ((code)::text = ANY ('{AAA,BBB}'::text[]))"
"->  Seq Scan on test_experimental_code test  (cost=0.00..23.75 rows=11 
width=0)"
"  Filter: ((code)::text = ANY ('{AAA,BBB}'::text[]))"


select count(*) from test where code in ('AAA','BBB') <-- w/o the explicit cast 
it will query all partitions
"Aggregate  (cost=79.58..79.59 rows=1 width=0)"
"  ->  Append  (cost=0.00..79.50 rows=33 width=0)"
"->  Seq Scan on test  (cost=0.00..26.50 rows=11 width=0)"
"  Filter: ((code)::text = ANY (('{AAA,BBB}'::character 
varying[])::text[]))"
"->  Seq Scan on test_prod_code test  (cost=0.00..26.50 rows=11 
width=0)"
"  Filter: ((code)::te

[GENERAL] Large Selects and cursors..

2008-09-04 Thread Ow Mun Heng
Hi,
 I frequently query PG for between 10k and 65k rows of data and was
wondering if I should be considering the use of cursors.

I'm not too well versed in their purpose, but based on the docs, cursors are
supposed to be more efficient and also give the impression of responsiveness.

Currently, when I do these large selects, there are a few minutes
(to tens of minutes) of unresponsiveness before the results come out.
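
From what I gather, the cursor version would look roughly like this (a sketch,
table and condition hypothetical); rows are then consumed in batches instead of
waiting for the whole result set at once:

BEGIN;

DECLARE big_cur CURSOR FOR
    SELECT * FROM my_big_table WHERE some_condition = 'X';

FETCH FORWARD 5000 FROM big_cur;   -- first batch
FETCH FORWARD 5000 FROM big_cur;   -- repeat until no rows are returned

CLOSE big_cur;
COMMIT;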


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Oracle and Postgresql

2008-09-02 Thread Ow Mun Heng
On Tue, 2008-09-02 at 22:56 -0400, Robert Treat wrote:
> On Tuesday 02 September 2008 17:21:12 Asko Oja wrote:
> > On Tue, Sep 2, 2008 at 2:09 AM, Michael Nolan <[EMAIL PROTECTED]> wrote:
> > > Oracle handles connecting to multiple databases (even on multiple/remote
> > > computers) fairly seamlessly, PG does not (yet.)
> >
> > Stuff we do with plProxy on PostgreSQL is in some respects more advanced
> > than anything Oracle has to offer :) 

> oracle's dblink, which comes pre-installed, is 
> simple to set-up, and has a much more straight-forward syntax for use in day 
> to day query work. 
> 

I've not used Oracle's dblink before, but based on PG's dblink, one thing I
didn't especially like is the need to pre-cast the data types. Then again, how
else would it know what data type is being pulled in?

For simple queries, knowing the data type means an additional lookup to
determine the data-type casting needed for the query.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioned Tables - How/Can does slony handle it?

2008-08-28 Thread Ow Mun Heng
I posed this question to the Slony List as well, but no response yet.

I'll post it here as well, to elicit some responses, as there's a larger
community of people using PG+Slony who may also be on slony-list.

My question is whether it's possible to do table partitioning and still use
Slony to replicate the table to the slave.

First, I'm not sure if it can be done at all; if it can, how does it work?
Is Slony able to automatically pick up a newly added partition table, or will
it need to be added manually through slonik?

If anyone has any experience with this, it would be good for my knowledge, as
I'm trying to optimise the DB I have. The individual tables are close to
60 million rows, occupying ~15G (+15G indexes).


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Re: Partial Indexes Not being Used [WAS]Re: Partial_indexes (Immutable?)

2008-08-25 Thread Ow Mun Heng
On Tue, 2008-08-26 at 00:58 -0400, Tom Lane wrote:
> Ow Mun Heng <[EMAIL PROTECTED]> writes:
> > On Mon, 2008-08-25 at 10:18 -0400, Tom Lane wrote: 
> >> I suppose code_id is varchar or some such?
> 
> > Yep 
> > After a few more investigation on the usefulness of the partial indexes,
> > I found that, it really isn't all that useful, perhaps some experts can
> > shed some light.
> 
> I poked at that example a bit more earlier today, and found that 8.3
> has a problem that's interfering with optimizing x IN ('y','z') type
> clauses when x is varchar.  If you don't mind building a local copy,
> see if this patch helps you any:
> http://archives.postgresql.org/pgsql-committers/2008-08/msg00254.php

I wouldn't mind if I had a box with a compiler installed, or a non-production
box at all.

:-(



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partial Indexes Not being Used [WAS]Re: Partial_indexes (Immutable?)

2008-08-25 Thread Ow Mun Heng
On Mon, 2008-08-25 at 10:18 -0400, Tom Lane wrote: 
> Ow Mun Heng <[EMAIL PROTECTED]> writes:
> > On Mon, 2008-08-25 at 12:23 +0800, Ow Mun Heng wrote:
> >> CREATE INDEX idx_d_trh_code_id_partial
> >> ON xmms.d_trh_table
> >> USING btree
> >> (code_id) where code_id not in ('P000','000') and code_id is not null;
> >> ERROR:  functions in index predicate must be marked IMMUTABLE
> 
> > BTW, this is on 8.2.9 Seems to work OK on 8.3.3.
> 
> I suppose code_id is varchar or some such?

Yep 
> 
> Try "where code_id::text not in ...".  There's an array type coercion
> underlying the right-hand side of the NOT IN, and 8.2 had some problems
> with correctly identifying the volatility of such coercions.

This now works.

Prior to that, I was trying
WHERE code_id::text <> 'P000'::text OR code_id::text <> '000'::text
which is basically a variant of the above (only I didn't realise it!).


After a bit more investigation into the usefulness of the partial indexes, I
found that they really aren't all that useful; perhaps some experts can shed
some light.

explain analyse select * from d_trh_pbert where code_id IN( 'HAMA2');
  QUERY PLAN
--
 Index Scan using idx_d_trh_pbert_eval on d_trh_pbert  (cost=0.00..26669.96 
rows=7125 width=216) (actual time=0.066..2.491 rows=1840 loops=1)
   Index Cond: ((code_id)::text = 'HAMA2'::text)
 Total runtime: 4.018 ms


explain analyse select * from d_trh_pbert where code_id IN( 'HAMA2','HAMA3');
  QUERY PLAN
--
 Bitmap Heap Scan on d_trh_pbert  (cost=262.02..53641.68 rows=14249 width=216) 
(actual time=0.926..4.858 rows=3556 loops=1)
   Recheck Cond: ((code_id)::text = ANY (('{HAMA2,HAMA3}'::character 
varying[])::text[]))
   ->  Bitmap Index Scan on idx_d_trh_pbert_eval  (cost=0.00..258.45 rows=14249 
width=0) (actual time=0.853..0.853 rows=3556 loops=1)
 Index Cond: ((code_id)::text = ANY (('{HAMA2,HAMA3}'::character 
varying[])::text[]))
 Total runtime: 7.809 ms

It doesn't even hit the partial indexes.


explain analyse select * from d_trh_pbert where code_id IN( 'HAMA2','HAMA3') 
and code_id not in ('P000','000') and code_id is not null;
  QUERY PLAN
--
 Bitmap Heap Scan on d_trh_pbert  (cost=259.90..53675.18 rows=5788 width=216) 
(actual time=0.916..7.477 rows=3556 loops=1)
   Recheck Cond: ((code_id)::text = ANY (('{HAMA2,HAMA3}'::character 
varying[])::text[]))
   Filter: ((code_id IS NOT NULL) AND ((code_id)::text <> ALL 
(('{P000,000}'::character varying[])::text[])))
   ->  Bitmap Index Scan on idx_d_trh_pbert_eval  (cost=0.00..258.45 rows=14249 
width=0) (actual time=0.835..0.835 rows=3556 loops=1)
 Index Cond: ((code_id)::text = ANY (('{HAMA2,HAMA3}'::character 
varying[])::text[]))
 Total runtime: 10.510 ms

hmxmms=# explain analyse select * from d_trh_pbert where code_id IN( 'HAMA3') 
and code_id not in ('P000','000') and code_id is not null;
  QUERY PLAN
--
 Index Scan using idx_d_trh_pbert_eval on d_trh_pbert  (cost=0.00..26687.77 
rows=2894 width=216) (actual time=0.077..3.506 rows=1716 loops=1)
   Index Cond: ((code_id)::text = 'HAMA3'::text)
   Filter: ((code_id)::text <> ALL (('{P000,000}'::character 
varying[])::text[]))
 Total runtime: 5.025 ms


The 2 indexes.

CREATE INDEX idx_d_trh_pbert_eval_partial2
  ON xmms.d_trh_pbert
  USING btree
  (code_id)
  WHERE (code_id::text <> ALL (ARRAY['P000'::text, '000'::text])) AND code_id 
IS NOT NULL; (size ~500MB)

CREATE INDEX idx_d_trh_pbert_eval
  ON xmms.d_trh_pbert
  USING btree
  (code_id); (size ~1.5G)

This table has approx 73 million rows and is 35 columns wide.
Stats on the code_id column is at 200 and there's ~1k of distinct values in it.


code_id is varchar(5)

I was hoping that the partial index would make things faster, since ~70-80%
of the time the value is ('P000','000').
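
For reference, a quick way to compare the on-disk sizes of the full and the
partial index (a sketch; the index names are the ones defined above):

select relname, pg_size_pretty(pg_relation_size(oid)) as size
from pg_class
where relname in ('idx_d_trh_pbert_eval', 'idx_d_trh_pbert_eval_partial2');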





Re: [GENERAL] Issue with creation of Partial_indexes (Immutable?)

2008-08-24 Thread Ow Mun Heng
On Mon, 2008-08-25 at 12:23 +0800, Ow Mun Heng wrote:
> CREATE INDEX idx_d_trh_code_id_partial
>   ON xmms.d_trh_table
>   USING btree
>   (code_id) where code_id not in ('P000','000') and code_id is not null;
> ERROR:  functions in index predicate must be marked IMMUTABLE
> 
> Just trying something new.
> 
> I want to create partial indexes on code_id which are not null and not
> P000/000
> 
> the ones I want are like HMD11 or UE935 or OIOR11 etc.
> 
> not sure where the IMMUTABLE part is coming from..
> 

BTW, this is on 8.2.9 Seems to work OK on 8.3.3.

However, got a question on usage of this index.

I've got 2 indexes.

case #1
> CREATE INDEX idx_d_trh_code_id
>   ON xmms.d_trh_table
>   USING btree
>   (code_id)

case#2

> CREATE INDEX idx_d_trh_code_id_partial
>   ON xmms.d_trh_table
>   USING btree
>   (code_id)where code_id not in ('P000','000') and code_id is not null;

when I do a 

select * from d_trh_table where code_id = 'UAH11'

it will still use the full index which is idx_d_trh_code_id instead of the 
partial index.

it is only when I do a 

select * from d_trh_table where code_id = 'UAH11' and code_id not in
('P000','000') will it use the partial index. 

I would _think_ that this is expected based on the documentation I'm
reading. Is it?
(need confirmation)




[GENERAL] Issue with creation of Partial_indexes (Immutable?)

2008-08-24 Thread Ow Mun Heng
CREATE INDEX idx_d_trh_code_id_partial
  ON xmms.d_trh_table
  USING btree
  (code_id) where code_id not in ('P000','000') and code_id is not null;
ERROR:  functions in index predicate must be marked IMMUTABLE

Just trying something new.

I want to create partial indexes on code_id which are not null and not
P000/000

the ones I want are like HMD11 or UE935 or OIOR11 etc.

not sure where the IMMUTABLE part is coming from..

Clue?







Re: [GENERAL] Interval Formatting -> Convert to timestamp

2008-08-21 Thread Ow Mun Heng
On Thu, 2008-08-21 at 11:53 +0200, Tomasz Ostrowski wrote:
> On 2008-08-21 11:09, Ow Mun Heng wrote:
> 
> > I want to find out if there's a method to change this
> > select to_char('1 day 09:18:42.37996'::interval,'HH24:MI:SS')
> > to something like
> > 24+9(hours) = 33:18:42 instead of returning It as 09:19:42
> 
> That's because 1 day doesn't always have 24 hours, because there are
> daylight saving times. 1 month can also have various number of days or
> hours.
> 
> If you want 1 day to be always 24 hours then:
> select date_part('day', t)*'24h'::interval + t-date_trunc('day',t)
> from (select '1 day 09:18:42.37996'::interval as t) as s;
> 
> It will not work for intervals with months. I'm not sure for negative ones.
> 

Thanks for all the suggestions, but in the end I went back to an old-school
solution: choose the lowest common denominator, which is the epoch in
seconds.

extract('epoch' from (max(a.delta))/3600)

where a.delta = '1 day 09:18:42.37996'

which returns me something in the form of X.YZ (e.g. 14.11) or something
like that..
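
If the hours:minutes:seconds form is wanted instead, a sketch that builds it
from the same epoch value:

select floor(extract(epoch from i) / 3600)::int || to_char(i, ':MI:SS') as total_hms
from (select '1 day 09:18:42.37996'::interval as i) as s;
-- gives 33:18:42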





[GENERAL] Interval Formatting -> Convert to timestamp

2008-08-21 Thread Ow Mun Heng
Hi,

I want to find out if there's a method to change this

select to_char('1 day 09:18:42.37996'::interval,'HH24:MI:SS')

to something like

24+9 (hours) = 33:18:42 instead of returning it as 09:18:42

I've not found a way to do this (yet)




Re: [GENERAL] schema name in SQL statement.

2008-08-19 Thread Ow Mun Heng
-Original Message-
From: johnf <[EMAIL PROTECTED]>
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] schema name in SQL statement.
Date: Tue, 19 Aug 2008 22:25:14 -0700

On Tuesday 19 August 2008 10:06:55 pm Scott Marlowe wrote:
> On Tue, Aug 19, 2008 at 10:53 PM, johnf <[EMAIL PROTECTED]> wrote:
> > On Tuesday 19 August 2008 04:01:55 pm Adrian Klaver wrote:
> >> From: "Masis, Alexander (US SSA)" <[EMAIL PROTECTED]>
> >>
> >> > I have to explicitly specify the schema name to make SQL statement to
> >> > work.
> >> > Can I set the schema before the query, or set a default schema?
> >> > My current statement:
> >> > "SELECT col FROM schema.table"
> >> > I like to be able to use generic SQL statement like:
> >> > "SELECT col FROM table"
> >>
> >> See search_path in:
> >> http://www.postgresql.org/docs/8.3/interactive/runtime-config-client.htm
> >>l
> >
> > I see answers like this all the time.  When I review the doc's I still
> > don't know how to set the "search_path" because there is no example in
> > the doc's. Do I do something like this:
> > "select search_path=(public)"  or "select search_path="public" .  So how
> > is the search_path set?


set search_path = 'xmxmxmxmxmxm'
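
A few more variants, in case it helps (a sketch; the schema/role/db names
are made up):

SET search_path TO myschema, public;                     -- current session only
ALTER ROLE myuser SET search_path = myschema, public;    -- default for a user
ALTER DATABASE mydb SET search_path = myschema, public;  -- default for a database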





Re: [GENERAL] What's size of your PostgreSQL Database?

2008-08-19 Thread Ow Mun Heng
On Tue, 2008-08-19 at 02:28 -0400, David Wilson wrote:
> On Fri, Aug 15, 2008 at 11:42 PM, Amber <[EMAIL PROTECTED]> wrote:
> > Dear all:
> >We are currently considering using PostgreSQL to host a read only 
> > warehouse, 
> we would like to get some experiences, best practices and performance metrics 
> from the 
> user community, following is the question list:

I didn't see the initial questions in this thread, and since I'm too lazy to
look for the original mail, I'll just put in my 2 cents' worth.

DB is a DSS type store instead of OLTP type. Heavily denormalised data.

Master is a celeron 1.7Ghz, 768MB ram, 2x500GB 7200rpm IDE RAID1(data)+
1 spare, 1x80GB (system).
Slave is a celeron 1.7Ghz, 1.5GB RAM, 3x160GB 7200rpm IDE RAID1(data),
1x160GB system

Max columns ~120
DB size is ~200+GB ~600+M (denormalised) rows in ~60+ tables
(partitioned and otherwise)

vacuum is done nightly in addition to turning on autovacuum.

I'm both IO and CPU constrainted. :-)

Denormalisation/ETL process is done on the master and only the final
product is shipped to the slave for read-only via slony.

I've got close to 8 indexes on each table (for bitmap scanning) 

Due to the denormalisation, getting to the data is very snappy even on
such a "small" server. (Adding RAM to the slave gave a drastic
performance improvement over the initial 512MB.)

Currently looking for a FOSS implementation of a slice-and-dice kind of
drilldown for reporting purposes. Tried a variety including Pentaho, but
have never been able to get it set up.




Re: [GENERAL] What's size of your PostgreSQL Database?

2008-08-18 Thread Ow Mun Heng
On Mon, 2008-08-18 at 11:01 -0400, justin wrote:
> Ow Mun Heng wrote: 
> > -Original Message-
> > From: Scott Marlowe <[EMAIL PROTECTED]>
> >   
> > > If you're looking at read only / read
> > > mostly, then RAID5 or 6 might be a better choice than RAID-10.  But
> > > RAID 10 is my default choice unless testing shows RAID-5/6 can beat
> > > it.
> > > 
> > 
> > I'm loading my slave server with RAID-0 based on 3 IDE 7200 Drives.
> > Is this worst off than a RAID 5 implementation?
> > 
> > 
> >   
> I see no problem using Raid-0 on a purely read only database where
> there is a copy of the data somewhere else. RAID 0 gives performance.
> If one of the 3 drives dies it takes the server down and lost of data
> will happen.  The idea behind RAID 1/5/6/10  is  if a drive does fail
> the system can keep going.Giving you time to shut down and replace
> the bad disk or if you have hot swappable just pull and replace.

I'm looking purely for read-only performance, and since I didn't have the
bandwidth to do extensive testing, I didn't know whether RAID1 or RAID0
would do the better job. In the end, I decided to go with RAID0, and now
I'm wondering whether RAID1 would have done better.

>  



Re: [GENERAL] What's size of your PostgreSQL Database?

2008-08-18 Thread Ow Mun Heng
-Original Message-
From: Scott Marlowe <[EMAIL PROTECTED]>
>If you throw enough drives on a quality RAID controller at it you can
>get very good throughput.  If you're looking at read only / read
>mostly, then RAID5 or 6 might be a better choice than RAID-10.  But
>RAID 10 is my default choice unless testing shows RAID-5/6 can beat
>it.

I'm loading my slave server with RAID-0 based on 3 IDE 7200 Drives.
Is this worse off than a RAID 5 implementation?




Re: [GENERAL] automatic REINDEX-ing

2008-08-12 Thread Ow Mun Heng
On Tue, 2008-08-12 at 08:38 -0700, Lennin Caro wrote:
> you can use a cron job
> 

I have my cron set up to do database-wide vacuums each night, and it
usually takes between ~4-6 hours at ~200GB DB size.

On days where there is huge activity, it can drag on for like 15+ hours.

I've recently dropped all my indexes and started to only rebuild _some_
needed ones.


What's the method for looking at index bloats anyway?



[GENERAL] test message --> Is this post getting to the list?

2008-08-12 Thread Ow Mun Heng




Re: [GENERAL] Quick way to alter a column type?

2008-07-08 Thread Ow Mun Heng
On Mon, 2008-07-07 at 02:10 -0400, Lew wrote:
> Ow Mun Heng wrote:
> >> I want to change a column type from varchar(4) to varchar(5) or should I
> >> just use text instead.

> The choice of TEXT for the column would seem to be supported in the PG 
> manual, 
> which stresses that TEXT and VARCHAR are quite close in performance, if not 
> identical.  I recommend to constrain the length if it's proper for the data 
> domain.  That is, if you are 100% absolutely certifiably certain that the 
> length will never change again once you set it to 5, that is, if the data 
> domain is a set of values that must be no more than 5 characters long, then 
> VARCHAR(5) is a good choice.  It accurately represents the data.

It was varchar(4) for a _long_ time, until a need recently came up to move
it to 5 characters.
Hence the issue; I've since changed it to an unconstrained varchar instead,
to make it more open (and less problematic in the future).





[GENERAL] Altering a column type w/o dropping views

2008-07-07 Thread Ow Mun Heng
I'm going to alter a bunch a tables columns's data type and I'm being
forced to drop a view which depends on the the colum.

eg: ALTER TABLE xs.d_trh ALTER m_dcm TYPE character varying; 
ERROR:  cannot alter type of a column used by a view or rule
DETAIL:  rule _RETURN on view v_hpp depends on column "m_dcm"

Is there an alternative method of doing this w/o dropping the existing
view?
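
As far as I can tell there isn't; the usual workaround is to drop and
recreate the view around the ALTER inside one transaction, roughly like
this (the view body is whatever your original definition was):

BEGIN;
DROP VIEW v_hpp;
ALTER TABLE xs.d_trh ALTER m_dcm TYPE character varying;
CREATE VIEW v_hpp AS
  SELECT ... ;  -- original view definition goes here
COMMIT;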




[GENERAL] Quick way to alter a column type?

2008-07-06 Thread Ow Mun Heng
Is there any quick hacks to do this quickly? There's around 20-30million
rows of data.

I want to change a column type from varchar(4) to varchar(5) or should I
just use text instead.
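
One frequently mentioned catalog hack for *widening* a varchar without
rewriting the table is to bump atttypmod directly (for varchar(n) it is
stored as n + 4). It bypasses the normal checks, is only sane for widening,
and I'd test it on a throwaway copy first -- so treat this strictly as a
sketch with made-up names:

UPDATE pg_attribute
SET    atttypmod = 5 + 4
WHERE  attrelid = 'my_table'::regclass
  AND  attname  = 'my_column';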



Re: [GENERAL] LIKE not using indexes (due to locale issue?)

2008-06-25 Thread Ow Mun Heng
On Wed, 2008-06-25 at 17:00 +1000, Klint Gore wrote:
> Ow Mun Heng wrote:
> > On Wed, 2008-06-25 at 14:58 +1000, Klint Gore wrote:
> > > Ow Mun Heng wrote:
> > > > explain select * from d_trr where revision like '^B2.%.SX'
> > > > --where ast_revision  = 'B2.M.SX'
> > > >
> > > > Seq Scan on d_trr  (cost=0.00..2268460.98 rows=1 width=16)
> > > >   Filter: ((revision)::text ~~ '^B2.%.SX'::text)
> > > >
> > > > show lc_collate;
> > > > en_US.UTF-8
> > > >
> > > > Is it that this is handled by tsearch2? Or I need to do the locale to
> > > > "C" for this to function?
> > > >   
> > > See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html. 
> > > It tells you how to create an index that like might use in non-C locales.
> >
> > Just more information. This columns is created with the varchar type.
> >
> > original index is created using
> >
> > CREATE INDEX idx_d_ast
> >   ON xmms.d_trh
> >   USING btree
> >   (revision varchar_pattern_ops);
> >
> >
> > CREATE INDEX idx_d_ast2
> >   ON xmms.d_trh
> >   USING btree
> >   (revision);
> >
> > after creating it, seems like it is still doing the seq_scan.
> > So what gives? Can I get more clues here?
> >   
> Post your query and the explain analyze of it and how many rows are in 
> the table.


explain analyse select count(*) from d_trr_iw
--where ast_revision like '^B2.%.SX'
where ast_revision  = 'B2.P.SX'

QUERY PLAN
Aggregate  (cost=353955.35..353955.36 rows=1 width=0) (actual 
time=54.565..54.566 rows=1 loops=1)
  ->  Bitmap Heap Scan on d_trr_iw  (cost=3150.63..353593.31 rows=144813 
width=0) (actual time=54.557..54.557 rows=0 loops=1)
Recheck Cond: ((ast_revision)::text = 'B2.P.SX'::text)
->  Bitmap Index Scan on idx_d_trr_iw_ast  (cost=0.00..3114.42 
rows=144813 width=0) (actual time=54.520..54.520 rows=0 loops=1)
  Index Cond: ((ast_revision)::text = 'B2.P.SX'::text)
Total runtime: 54.662 ms


> 
> In my database, there's 7200 rows in items and I know that none of the 
> identifiers for them start with 'xb'.  As you can see below, the 1st 
> query is sequential and the 2nd one is using the new index.  (v8.3.0)
> 
There's approx 29million rows in there and using the LIKE condition will
churn it for a good 20-30min I suppose. (didn't try - live database)

> postgres=# show lc_collate;
>lc_collate
> 
>  English_Australia.1252
> (1 row)
> 
> postgres=# explain analyse select * from items where identifier like 'xb%';
>  QUERY PLAN
> -
>  Seq Scan on items  (cost=0.00..160.18 rows=1 width=113) (actual 
> time=4.966..4.966 rows=0 loops=1)
>Filter: ((identifier)::text ~~ 'xb%'::text)
>  Total runtime: 5.029 ms
> (3 rows)
> 
> postgres=# create index anindex on items(identifier varchar_pattern_ops);
> CREATE INDEX
> postgres=# explain analyse select * from items where identifier like 'xb%';
> QUERY PLAN
> ---
>  Index Scan using anindex on items  (cost=0.00..8.27 rows=1 width=113) 
> (actual time=0.165..0.165 rows=0 loops=1)
>Index Cond: (((identifier)::text ~>=~ 'xb'::text) AND 
> ((identifier)::text ~<~ 'xc'::text))
>Filter: ((identifier)::text ~~ 'xb%'::text)
>  Total runtime: 0.255 ms
> (4 rows)
> 
> 

Could it be that it's not able to determine the B2.%.SX in there?

explain select count(*) from d_trr_iw where ast_revision like 'B2.P.SX'

even this will result in a seq_scan.
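
One thing worth noting: '^' is regex syntax (for ~ or SIMILAR TO), not LIKE
syntax, so LIKE '^B2.%.SX' looks for values that literally start with '^B2.'
and almost certainly matches nothing here. With a varchar_pattern_ops index
on ast_revision (like the one created on d_trh above), a left-anchored LIKE
pattern should be able to use the index -- a sketch:

explain select count(*) from d_trr_iw where ast_revision like 'B2.%.SX';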




Re: [GENERAL] LIKE not using indexes (due to locale issue?)

2008-06-24 Thread Ow Mun Heng
On Wed, 2008-06-25 at 14:58 +1000, Klint Gore wrote:
> Ow Mun Heng wrote:
> > explain select * from d_trr where revision like '^B2.%.SX'
> > --where ast_revision  = 'B2.M.SX'
> >
> > Seq Scan on d_trr  (cost=0.00..2268460.98 rows=1 width=16)
> >   Filter: ((revision)::text ~~ '^B2.%.SX'::text)
> >
> > show lc_collate;
> > en_US.UTF-8
> >
> > Is it that this is handled by tsearch2? Or I need to do the locale to
> > "C" for this to function?
> >   
> See http://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html. 
> It tells you how to create an index that like might use in non-C locales.

Just more information. This columns is created with the varchar type.

original index is created using

CREATE INDEX idx_d_ast
  ON xmms.d_trh
  USING btree
  (revision varchar_pattern_ops);


CREATE INDEX idx_d_ast2
  ON xmms.d_trh
  USING btree
  (revision);

after creating it, seems like it is still doing the seq_scan.
So what gives? Can I get more clues here?







[GENERAL] LIKE not using indexes (due to locale issue?)

2008-06-24 Thread Ow Mun Heng
explain select * from d_trr where revision like '^B2.%.SX'
--where ast_revision  = 'B2.M.SX'

Seq Scan on d_trr  (cost=0.00..2268460.98 rows=1 width=16)
  Filter: ((revision)::text ~~ '^B2.%.SX'::text)

show lc_collate;
en_US.UTF-8

Is it that this is handled by tsearch2? Or I need to do the locale to
"C" for this to function?

the revision is indexed.

Bitmap Heap Scan on d_trr  (cost=4492.43..547709.26 rows=193453
width=16)
  Recheck Cond: ((revision)::text = 'B2.M.SX'::text)
  ->  Bitmap Index Scan on idx_d_trr_iw_ast  (cost=0.00...06
rows=193453 width=0)
Index Cond: ((revision)::text = 'B2.M.SX'::text)



[GENERAL] PG Yum Repo - can't Find Slony1

2008-04-23 Thread Ow Mun Heng
This question, I think is directed at Devrim, but if anyone else can
answer it would be great as well.

I saw from the site that states that slony1 packages are available.
However, I can't find it from the yum archives. This is for Centos 5.

Does anyone know?

muchos gracias.



Re: [GENERAL] forcing use of more indexes (bitmap AND)

2008-03-14 Thread Ow Mun Heng

On Fri, 2008-03-14 at 08:26 +0100, A. Kretschmer wrote:
> am  Fri, dem 14.03.2008, um 15:06:56 +0800 mailte Ow Mun Heng folgendes:
> > 
> > On Fri, 2008-03-14 at 07:53 +0100, A. Kretschmer wrote:
> > > am  Fri, dem 14.03.2008, um 14:28:15 +0800 mailte Ow Mun Heng folgendes:
> > > > query is something like this
> > > > 
> > > > Select *
> > > > from v_test
> > > > where acode Like 'PC%' 
> > > > and rev = '0Q'
> > > > and hcm = '1'
> > > > and mcm = 'K'
> > > > 
> > > > where acode, rev, hcm, mcm are all indexes.
> > 
> > 
> > pg_version 8.2.5
> > 
> > "Nested Loop Left Join  (cost=6482.71..9605.00 rows=4 width=347)"
> > "  ->  Nested Loop Left Join  (cost=6482.71..9536.68 rows=4 width=305)"
> > "->  Hash Left Join  (cost=6482.71..9461.26 rows=4 width=297)"
> > "  Hash Cond: (((famid.product_family)::text = 
> > dcm.product_family) AND ((a.preamp_dcm)::text = (dcm.preamp_dcm)::text))"
> > "  ->  Nested Loop Left Join  (cost=6474.69..9453.09 rows=4 
> > width=242)"
> > "->  Bitmap Heap Scan on d_trh_pbert a  
> > (cost=6474.69..9419.97 rows=4 width=237)"
> > "  Recheck Cond: (((mecm)::text = 'K'::text) AND 
> > ((rev)::text = '0Q'::text))"
> > "  Filter: (((acode)::text ~~ 'PC%'::text) AND 
> > ((hcm)::text = '1'::text))"
> > "  ->  BitmapAnd  (cost=6474.69..6474.69 rows=747 
> > width=0)"
> > "->  Bitmap Index Scan on 
> > idx_d_trh_pbert_mediadcm  (cost=0.00..3210.50 rows=164046 width=0)"
> > "  Index Cond: ((media_dcm)::text = 
> > 'MK-0'::text)"
> > "->  Bitmap Index Scan on 
> > idx_d_trh_pbert_ast  (cost=0.00..3263.93 rows=148130 width=0)"
> > "  Index Cond: ((rev)::text = 
> > '0Q'::text)"
> > "->  Index Scan using 
> > driv_family_identifier_lookup_pkey on driv_family_identifier_lookup famid  
> > (cost=0.00..8.27 rows=1 width=17)"
> > "  Index Cond: ((famid.family_identifier)::text = 
> > (a.family_identifier)::text)"
> > "  ->  Hash  (cost=5.61..5.61 rows=161 width=76)"
> > "->  Seq Scan on lookup_preamp_dcm dcm  
> > (cost=0.00..5.61 rows=161 width=76)"
> > "->  Index Scan using d_trr_iw_pkey on d_trr_iw b  
> > (cost=0.00..18.83 rows=1 width=38)"
> > "  Index Cond: (((a.serial_number)::text = 
> > (b.serial_number)::text) AND (a.head_id = b.head_id) AND 
> > (a.test_run_start_date_time = b.test_run_start_date_time) AND (a.test_type 
> > = b.test_type) AND (a.test_phase_id = b.test_phase_id))"
> > "  ->  Index Scan using d_trr_dfh_pkey on d_trr_dfh c  (cost=0.00..16.87 
> > rows=1 width=72)"
> > "Index Cond: (((a.serial_number)::text = (c.serial_number)::text) 
> > AND (a.head_id = c.head_id) AND (a.test_run_start_date_time = 
> > c.test_run_start_date_time) AND (a.test_type = c.test_type) AND 
> > (a.test_phase_id = c.test_phase_id))"
> > 
> > 
> 
> This plan doesn't match with the query above...

the query is based on a view

Regardless, the question is: is there a method to force it to use more
indexes to satisfy a query? The most I've seen the planner use is 2
indexes, even though the WHERE clause touches up to 6 indexed columns.
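
A related sketch: rather than hoping for a wider BitmapAnd, a single
multicolumn index over the two equality columns the plan already uses can
often do the job on its own (column names are taken from the plan above,
so adjust as needed):

CREATE INDEX idx_d_trh_pbert_rev_media
  ON xmms.d_trh_pbert (rev, media_dcm);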




Re: [GENERAL] forcing use of more indexes (bitmap AND)

2008-03-14 Thread Ow Mun Heng

On Fri, 2008-03-14 at 07:53 +0100, A. Kretschmer wrote:
> am  Fri, dem 14.03.2008, um 14:28:15 +0800 mailte Ow Mun Heng folgendes:
> > query is something like this
> > 
> > Select *
> > from v_test
> > where acode Like 'PC%' 
> > and rev = '0Q'
> > and hcm = '1'
> > and mcm = 'K'
> > 
> > where acode, rev, hcm, mcm are all indexes.
> > 
> > Currently this query is only using the rev and mcm for the bitmapAND.
> > it then does a bitmap heap scan using the acode and the hcm indexes.
> 
> 
> Please show us the output generated from 'explain analyse select ...'
> Please tell us your PG-Version.
> 
> 
> > I would like to try to see if forcing the planner to favour heavier usage 
> > of the indexes would yield faster results.
> > 
> > I've tried lowering random_page_cost(default 4) down to 2 with no change in 
> > planner.
> 
> The planner will use the index only if he assume it make sense.
> For instance, it make no sense to use the index if almost all rows 'hcm'
> contains '1'


pg_version 8.2.5

"Nested Loop Left Join  (cost=6482.71..9605.00 rows=4 width=347)"
"  ->  Nested Loop Left Join  (cost=6482.71..9536.68 rows=4 width=305)"
"->  Hash Left Join  (cost=6482.71..9461.26 rows=4 width=297)"
"  Hash Cond: (((famid.product_family)::text = dcm.product_family) 
AND ((a.preamp_dcm)::text = (dcm.preamp_dcm)::text))"
"  ->  Nested Loop Left Join  (cost=6474.69..9453.09 rows=4 
width=242)"
"->  Bitmap Heap Scan on d_trh_pbert a  
(cost=6474.69..9419.97 rows=4 width=237)"
"  Recheck Cond: (((mecm)::text = 'K'::text) AND 
((rev)::text = '0Q'::text))"
"  Filter: (((acode)::text ~~ 'PC%'::text) AND 
((hcm)::text = '1'::text))"
"  ->  BitmapAnd  (cost=6474.69..6474.69 rows=747 
width=0)"
"->  Bitmap Index Scan on 
idx_d_trh_pbert_mediadcm  (cost=0.00..3210.50 rows=164046 width=0)"
"  Index Cond: ((media_dcm)::text = 
'MK-0'::text)"
"->  Bitmap Index Scan on idx_d_trh_pbert_ast  
(cost=0.00..3263.93 rows=148130 width=0)"
"  Index Cond: ((rev)::text = '0Q'::text)"
"->  Index Scan using driv_family_identifier_lookup_pkey on 
driv_family_identifier_lookup famid  (cost=0.00..8.27 rows=1 width=17)"
"  Index Cond: ((famid.family_identifier)::text = 
(a.family_identifier)::text)"
"  ->  Hash  (cost=5.61..5.61 rows=161 width=76)"
"->  Seq Scan on lookup_preamp_dcm dcm  (cost=0.00..5.61 
rows=161 width=76)"
"->  Index Scan using d_trr_iw_pkey on d_trr_iw b  (cost=0.00..18.83 
rows=1 width=38)"
"  Index Cond: (((a.serial_number)::text = (b.serial_number)::text) 
AND (a.head_id = b.head_id) AND (a.test_run_start_date_time = 
b.test_run_start_date_time) AND (a.test_type = b.test_type) AND 
(a.test_phase_id = b.test_phase_id))"
"  ->  Index Scan using d_trr_dfh_pkey on d_trr_dfh c  (cost=0.00..16.87 rows=1 
width=72)"
"Index Cond: (((a.serial_number)::text = (c.serial_number)::text) AND 
(a.head_id = c.head_id) AND (a.test_run_start_date_time = 
c.test_run_start_date_time) AND (a.test_type = c.test_type) AND 
(a.test_phase_id = c.test_phase_id))"




Re: [GENERAL] forcing use of more indexes (bitmap AND)

2008-03-14 Thread Ow Mun Heng

On Fri, 2008-03-14 at 00:50 -0600, Scott Marlowe wrote:
> On Fri, Mar 14, 2008 at 12:28 AM, Ow Mun Heng <[EMAIL PROTECTED]> wrote:
> > query is something like this
> >
> > Select *
> > from v_test
> > where acode Like 'PC%'
> > and rev = '0Q'
> > and hcm = '1'
> > and mcm = 'K'
> >
> >  where acode, rev, hcm, mcm are all indexes.
> >
> >  Currently this query is only using the rev and mcm for the bitmapAND.
> >  it then does a bitmap heap scan using the acode and the hcm indexes.
> >
> >  I would like to try to see if forcing the planner to favour heavier usage 
> > of the indexes would yield faster results.
> >
> >  I've tried lowering
> 
> Would setting enable_bitmapscan=off do that?  I'm not being sarcastic,
> I really don't know.

This is even worse, as the planner would then only use 1 index.




[GENERAL] forcing use of more indexes (bitmap AND)

2008-03-13 Thread Ow Mun Heng
query is something like this

Select *
from v_test
where acode Like 'PC%' 
and rev = '0Q'
and hcm = '1'
and mcm = 'K'

where acode, rev, hcm, mcm are all indexes.

Currently this query is only using the rev and mcm for the bitmapAND.
it then does a bitmap heap scan using the acode and the hcm indexes.

I would like to try to see if forcing the planner to favour heavier usage of 
the indexes would yield faster results.

I've tried lowering random_page_cost(default 4) down to 2 with no change in 
planner.




Re: [GENERAL] Column Statistics - How to dertermine for whole database

2008-03-12 Thread Ow Mun Heng

On Wed, 2008-03-12 at 21:40 -0700, Scott Marlowe wrote:
> On Wed, Mar 12, 2008 at 8:45 PM, Ow Mun Heng <[EMAIL PROTECTED]> wrote:
> > On Wed, 2008-03-12 at 21:33 -0500, Adam Rich wrote:
> >  > select c.relname, a.attname, attstattarget
> >  > from pg_catalog.pg_attribute a, pg_catalog.pg_class c,
> >  > pg_catalog.pg_namespace n
> >  > where a.attrelid = c.oid and c.relnamespace=n.oid
> >  > and n.nspname = 'public' and a.attnum > 0
> >
> >  Funny, that does not work.
> >
> >  note : I did change the nspace value to reflect the DB I'm
> >  using/querying
> >
> >  I've even changed the stat level to 200 (default is 100 or -1)
> 
> The nspname setting setting is for schema, not db name.


select c.relname, a.attname, attstattarget
from pg_catalog.pg_attribute a, pg_catalog.pg_class c,
pg_catalog.pg_namespace n
where a.attrelid = c.oid and c.relnamespace=n.oid
and n.nspname = 'xmms' and a.attnum > 0 and attstattarget <> -1;

This works now. 
The clarification on it being the schema name was useful.
Thanks.
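
Related to this: to put a column back to the default target, the one-liner
below works (table/column names here are just placeholders):

ALTER TABLE xmms.some_table ALTER COLUMN some_column SET STATISTICS -1;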




Re: [GENERAL] Column Statistics - How to dertermine for whole database

2008-03-12 Thread Ow Mun Heng
On Wed, 2008-03-12 at 21:33 -0500, Adam Rich wrote:
> select c.relname, a.attname, attstattarget
> from pg_catalog.pg_attribute a, pg_catalog.pg_class c,
> pg_catalog.pg_namespace n
> where a.attrelid = c.oid and c.relnamespace=n.oid
> and n.nspname = 'public' and a.attnum > 0

Funny, that does not work.

Note: I did change the nspname value to reflect the DB I'm
using/querying.

I've even changed the stat level to 200 (default is 100 or -1)




[GENERAL] Column Statistics - How to dertermine for whole database

2008-03-12 Thread Ow Mun Heng
Hi,

I finally figured out why (I think) analyzing some specific tables is
taking so freaking long: 12 million rows, ~11GB table.

I had some of the columns with the stat level set up to 1000 (this was
previously because I was trying to optimise some things to make queries
faster). When the table was small, the analyse of that table went by
pretty fast; now that it has grown bigger, it's a headache.

So, my investigation found that it was due to the stats level I put on
that column (normal stat level = 100).

Is there a query to pg_catalog tables to find out which table/column has
the stat level not at default in 1 sweep?

Appreciate any pointers.



Re: [GENERAL] MySQL [WAS: postgresql book...]

2008-01-30 Thread Ow Mun Heng

On Wed, 2008-01-30 at 20:14 -0600, Josh Trutwin wrote:
> On Wed, 30 Jan 2008 13:20:58 -0500
> Tom Hart <[EMAIL PROTECTED]> wrote:
> 
> > I have 4 years of mySQL experience (I know, I'm sorry)
> 
> Why is this something to apologize for?  I used to use MySQL for
> everything and now use PostgreSQL for the majority of my DB needs.  I
> certainly advocate PG now to anyone who will listen, but I don't
> think it helps to portray MySQL as a POS or regret its existence.
> It's a very useful tool if used correctly.  IMO, people who are able
> to effectively use PG/MySQL/Oracle/XYZ appropriately are more valuable
> than those that blindly use the same one for every single task.

Best tool for the JOB in my books.
Heck, I'm even considering running MySQL with its MyISAM backend as
a slave to my PG master.

Should (emphasis!) be plenty fast.



Re: [GENERAL] postgresql book - practical or something newer?

2008-01-30 Thread Ow Mun Heng

On Wed, 2008-01-30 at 20:47 +0900, Jason Topaz wrote:

> I don't disagree with your point that it's not robust with examples of
> "exactly how a particular problem can be solved".  But I think there are
> enough, and more importantly, I don't think problem-solving is an
> important focus for a manual (that's why 3rd party books exist). 

Which is also the cause of the original rant. There is very few 3rd
party books.



Re: [GENERAL] Get the number of records of a result set

2008-01-30 Thread Ow Mun Heng

On Wed, 2008-01-30 at 09:14 +0100, Eugenio Tacchini wrote:
> Hello,
> I'm writing a function in PL/pgSQL and I would like to know if there 
> is a method to get the number of records in a result set, after a 
> select query, without executing the same query using COUNT(*).


not sure what exactly you mean, but perhaps this could help?

del_stime := timeofday();
execute del_qry;
del_etime := timeofday();

GET DIAGNOSTICS del_rows = ROW_COUNT;

This would get you the # of rows inserted into the destination table
etc.
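
For completeness, a minimal self-contained sketch of the same idea (table
and column names are made up):

CREATE OR REPLACE FUNCTION purge_old_rows() RETURNS integer AS $$
DECLARE
    del_rows integer;
BEGIN
    DELETE FROM some_table WHERE created_at < now() - interval '30 days';
    GET DIAGNOSTICS del_rows = ROW_COUNT;  -- rows affected by the last statement
    RETURN del_rows;
END;
$$ LANGUAGE plpgsql;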




Re: [GENERAL] postgresql book - practical or something newer?

2008-01-29 Thread Ow Mun Heng

On Tue, 2008-01-29 at 19:16 +, Dave Page wrote:
> On Jan 29, 2008 6:16 PM, Joshua D. Drake <[EMAIL PROTECTED]> wrote:
> 
> > I try to be reasonable (no laughing people :)).
> 
> Oh it's hard, so very, very hard!
> 

But seriously, I've ranted on this some time ago( and you can tell that
I'm about to start again)


One of the worst aspects of PG is the documentation, or rather the lack of it
from a "traditional" publishing house. The manual is fine and all, but in most
cases what I find it lacks is examples: examples that show what a particular
field/query means, and also examples that show exactly how a particular
problem can be solved.

When I played with both MSSQL and MySQL, I had loads of books (and I
bought a few of them too; didn't bother subscribing to Safari, it just
ain't a book!) to be used as reference and whatnot.

In PG, all there is is the manual, a book by Robert Treat, the book from
Joshua, 1 or 2 other books authored by someone I can't remember, etc.,
and that's about it.

Then I would have to go hunt (via Google) for any bit of blog posts or
presentation slides from a meetup/talk etc. to find out how to do a
particular thing (thanks Bruce M; thanks Robert T - excellent partitioning
talk!; thanks PgCon!) and pore over those.

Other than that, it's more or less, "Bang you head here" and "send email
to the list and hope someone answers"

I hang on to my O'reilly "SQL Hacks" book tightly as it gives me
examples on how to solve a problem and even how other DBs solve it.

I wish there was a book like MySQL Cookbook (which I have a copy)




[GENERAL] [OT] Slony + Alter table using pgadmin

2008-01-29 Thread Ow Mun Heng
This is OT for this list and I don't have access to I-net (only email)
and I'm not subscribed to the Slony list.

I need to add a few additional columns to an existing replicated
set/table. I know that I can't just add the columns normally but have to
go through slonik's EXECUTE SCRIPT (/usr/bin/slonik_execute_script), but
I've never done this before.

Additionally, there's an option in Pgadmin, which I _think_ also does
this automatically using built-in slonik_execute-script equivalent.

Has anyone done this before? (it outputs this in the SQL. Can I add more
than 1 column per instance?)

-- Execute replicated using cluster "_my_cluster", set 4
ALTER TABLE xmms.d_product_e
   ADD COLUMN  integer;


PS : I can't test this out since my test_db virtual machine is at home.

many thanks..
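
For anyone searching the archives later: the slonik form is roughly the
below (a sketch only -- cluster name and conninfo values are made up):

cluster name = my_cluster;
node 1 admin conninfo = 'host=master dbname=mydb user=slony';
node 2 admin conninfo = 'host=slave dbname=mydb user=slony';

EXECUTE SCRIPT (
    SET ID = 4,
    FILENAME = '/tmp/add_columns.sql',
    EVENT NODE = 1
);

where /tmp/add_columns.sql can hold more than one statement, and a single
ALTER TABLE can add several columns, e.g.:

ALTER TABLE xmms.d_product_e
    ADD COLUMN col_a integer,
    ADD COLUMN col_b integer;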



Re: [OT] Re: [GENERAL] enabling autovacuum

2008-01-28 Thread Ow Mun Heng

On Mon, 2008-01-28 at 20:57 -0500, Greg Smith wrote:
> On Tue, 29 Jan 2008, Ow Mun Heng wrote:
> 
> > Can you let me know what is the sql used to generate such a nice summary
> > of the tables?
> 
> Might as well dupe the old text; this went out to the performance list:
> 
> Greg Sabino Mullane released a Nagios plug-in for PostgreSQL that you can 
> grab at http://bucardo.org/nagios_postgres/ , and while that is itself 
> nice the thing I found most remarkable is the bloat check. The majority of 
> that code is an impressive bit of SQL that anyone could use even if you 
> have no interest in Nagios, which is why I point it out for broader 
> attention. Look in check_postgres.pl for the "check_bloat" routine and the 
> big statement starting at the aptly labled "This was fun to write" 
> section. If you pull that out of there and replace $MINPAGES and 
> $MINIPAGES near the end with real values, you can pop that into a 
> standalone query and execute it directly.

I'm subscribed to perf list and I _did_ take a look at the tool
previously. However, something happened and I didn't managed to look at
it throughly or something.

I'll take another look at it and thanks for the pointers..



[OT] Re: [GENERAL] enabling autovacuum

2008-01-28 Thread Ow Mun Heng

On Mon, 2008-01-28 at 22:17 +, Jeremy Harris wrote:
> We have one problematic table, which has a steady stream of entries
> and a weekly mass-delete of ancient history.  The "bloat" query from
> Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns:
> 
>  schemaname | tablename  | reltuples | relpages | otta | tbloat | wastedpages 
> | wastedbytes | wastedsize |iname| ituples | ipages | 
> iotta | ibloat | wastedipages | wastedibytes | wastedisize 
> ++---+--+--++-+-++-+-++---++--+--+-
>  public | rcpt_audit |   1300300 |   152149 | 6365 |   23.9 |  145784 
> |  1194262528 | 1139 MB| rcpt_audit_msg_audit_id_idx | 1300300 |   6798 | 
>  3819 |1.8 | 2979 | 24403968 | 23 MB
>  public | rcpt_audit |   1300300 |   152149 | 6365 |   23.9 |  145784 
> |  1194262528 | 1139 MB| rcpt_audit_id_idx   | 1300300 |   4727 | 
>  3819 |1.2 |  908 |  7438336 | 7264 kB
> 
Can you let me know what is the sql used to generate such a nice summary
of the tables?




Re: [GENERAL] [OT] Slony Triggers pulling down performance?

2008-01-27 Thread Ow Mun Heng

On Fri, 2008-01-18 at 14:57 -0500, Chris Browne wrote:
> [EMAIL PROTECTED] (Ow Mun Heng) writes:
> > Just wondering if my 'Perceived' feeling that since implementing slony
> > for master/slave replication of select tables, my master database
> > performance is getting slower.
> >
> > I'm constantly seeing a very high amount of IO wait. ~40-80 according to
> > vmstat 1
> >
> > and according to atop. (hdb/hdc = raid1 mirror)
> > DSK | hdb | busy 83% | read1052 | write 50 | avio7 
> > ms |
> > DSK | hdc | busy 81% | read1248 | write 49 | avio6 
> > ms |
> 
> The triggers generate some extra I/O, as they go off and write tuples
> into sl_log_1/sl_log_2, so there's certainly a cost, there.
> 
> When you pull data from sl_log_1/sl_log_2, that will have a cost, too.
> 
> Replication does not come at zero cost...


I've been battling with this issue for the past week, and it prompted a few
changes in how I pull the data and in where I store it. I ended up
implementing partitioning on the 2 largest (problematic) tables, put them
into weekly rotation, broke the spare disk out of the 3-disk RAID1 (1 spare),
and used that disk as the Slony-I sl_log_1/sl_log_2 tablespace.

Now, everything is back to normal. (until I break it again!!) IO Wait is
hovering between 0-40%



[GENERAL] DB wide Vacuum(Goes thru readonly tables) vs Autovacuum

2008-01-24 Thread Ow Mun Heng
I'm currently seeing more and more problems with vacuum as the DB size
gets bigger and bigger. (~220GB+)

Bear in mind that I'm working on a fairly big DB with unfairly sized
hardware (Celeron 1.7G, 2x500G Raid1 dbspace1, 1x500Gb dbspace2, 1x80G
system, 768MB Ram, 2G Swap on dspace2)

IO is the main bottleneck when doing the vacuum, and I've had vacuum stuck
on a particular 5GB table for over 5 hours without moving, so that I had to
kill the entire DB and restart.

Right now, I've already implemented partitioning of some of the huge
tables (weekly) and moved the older ones to read-only tables which do
not get updates/deletes etc.

I'm doing both autovacuum and a nightly vacuum. The nightly vacuum is DB
wide (so that I can see the FSM usage) and it is unnecessarily(?) hitting
the read-only tables. Is there a way to tell it _not_ to vacuum those tables
and yet still give me the FSM figures? (If I vacuum per-table via vacuumdb,
I won't get the FSM information.)

autovacuum = on # enable autovacuum subprocess?
autovacuum_vacuum_threshold = 200   # min # of tuple updates before
autovacuum_vacuum_scale_factor = 0.03   # fraction of rel size before
autovacuum_analyze_scale_factor = 0.02  # fraction of rel size before


I've bumped maintenance_work_mem from 32MB to 64MB and now to 128MB, and
I've stopped all activity on the DB while I'm vacuuming; setting it to 128MB
is keeping things zippy.

Any comments would be appreciated on how best to tune this. (with the
aforementioned hardware limitation)


ps : Shared buffers = 200mb
effective cache size = 350mb




[GENERAL] [OT] Slony Triggers pulling down performance?

2008-01-17 Thread Ow Mun Heng
Just wondering whether my 'perceived' feeling is right: since implementing
Slony for master/slave replication of selected tables, my master database
performance seems to be getting slower.

I'm constantly seeing a very high amount of IO wait. ~40-80 according to
vmstat 1

and according to atop. (hdb/hdc = raid1 mirror)
DSK | hdb | busy 83% | read1052 | write 50 | avio7 ms |
DSK | hdc | busy 81% | read1248 | write 49 | avio6 ms |




Re: [GENERAL] Experiences with extensibility

2008-01-08 Thread Ow Mun Heng

On Wed, 2008-01-09 at 00:24 -0700, Guido Neitzer wrote:
> On 09.01.2008, at 00:14, Ow Mun Heng wrote:
> 
> >> Like, I have a situation where I need multi-master just for
> >> availability. Two small servers are good enough for that. But
> >> unfortunately with PostgreSQL the whole setup is a major pain in  
> >> the ...
> >>
> >
> > Isn't that the reason they hire DB admins and not the run of the mill
> > guy.
> 
> Isn't that more the situation where it is preferred to have a working  
> fail-over with as less money and work as possible?

Yep.. that's where FOSS comes in. But as mentioned, there's a learning
curve in everything and, granted, in FOSS the documentation is sometimes
sparse etc.

I guess the other side of the coin is this -> if you want it cheap, you
have to do it yourself, and I'd be rich if I got a dollar for each time the
plumber/electrician/etc. comes around to fix something. Each time, the
itch is for me to learn how to do it myself.

> 
> There is just no way I (personally) can afford hiring someone to set  
> that up as I'm talking about something that hasn't brought a dollar  
> yet and will probably not for the next time ... and it is my own  
> project, but there is still some need for a reliable service to come  
> to a point where I can maybe hire someone.

point taken. 



Re: [GENERAL] Experiences with extensibility

2008-01-08 Thread Ow Mun Heng

On Wed, 2008-01-09 at 00:21 -0700, Guido Neitzer wrote:
> On 09.01.2008, at 00:08, Joshua D. Drake wrote:

> > Great! I was just trying to show you that there was a JDBC layer  
> > available for multi-mastering with PostgreSQL.
> 
> When I find some time, I might dig a bit deeper in the Sequoia stuff  
> again. We will see. Thanks for that hint.
> 


IIRC, there was a presentation on available replication solutions at one
of the pgcons. Perhaps you can search there.



Re: [GENERAL] Experiences with extensibility

2008-01-08 Thread Ow Mun Heng

On Tue, 2008-01-08 at 23:05 -0800, Joshua D. Drake wrote:
> Sim Zacks wrote:

> 
> > 
> > The reason companies go with the closed source, expensive solutions is 
> > because they are better products.
> 
> Sometimes, sometimes not. It depends on your needs.

This is total FUD. Everything has its place. And besides, from what I've read,
nobody ever gets fired for recommending an expensive solution that comes with
expensive support contracts and whatnot.
(Wish I could google and insert the link to where I read that.)

> 
> > 
> > When evaluating a database for your company, it is better to look at 
> > what the
> > closed source products offer that cause companies to shell out tons of 
> > money and
> > decide if it is worth locking yourself into an expensive and/or 
> > exclusive agreement.
> 
> The only thing this post could possibly be is a Troll. Please go back 
> under the bridge.


No, it's better to evaluate if the features which are being provided
will fit your needs. This is akin to buying a lamborghini only to drive
it down to the local 7-11, down the (same) road to buy some bread.

Take a walk instead, save my ears, save some petrol, save some money.

Otherwise, you end up paying X amount more for features you don't need.
(Me remembers vividly an episode of Simpsons where Homer was given free
rein to design the ultimate American Dream Car.)




Re: [GENERAL] Connect to SQL Server via ODBC from Postgresql

2008-01-08 Thread Ow Mun Heng

On Tue, 2008-01-08 at 23:16 -0800, Joshua D. Drake wrote:
> Ow Mun Heng wrote:
> > On Wed, 2008-01-09 at 08:41 +0200, Sim Zacks wrote:
> >> Another way of doing this, without dblink, is using an unsecured language 
> >> (plpython, for example) is to connect to the sql server using odbc and 
> >> then 
> >> putting the data into your postgresql.
> > 
> > I use perl DBI to connect to both PG and MSSQL.
> > 
> 
> Perl or Python both would service this just fine or PHP (egads).


hehe.. I read a presentation from someone (I forgot who) who detailed
how to use PHP as a procedural language for PG.

That was fun (to read only)



Re: [GENERAL] Experiences with extensibility

2008-01-08 Thread Ow Mun Heng

On Tue, 2008-01-08 at 23:37 -0700, Guido Neitzer wrote:
> On 08.01.2008, at 23:20, Joshua D. Drake wrote:

> Like, I have a situation where I need multi-master just for  
> availability. Two small servers are good enough for that. But  
> unfortunately with PostgreSQL the whole setup is a major pain in the ...
> 

Isn't that the reason they hire DB admins and not the run of the mill
guy.

I've not played with multimaster (sync/async) and I doubt I will since
there's no requirement for it., (yet)

In any case, based on my research there are lots of FOSS and (not-so-)FOSS
based solutions, and of course each comes with its own learning curve and
depends on the complexity of the requirements. (Mind you, even with MSSQL
and all its polished point-and-click interface, you still have times when
you pull your hair out.)

I've done a simple master/slave configuration which is faring well, so
that's fine (for me)



Re: [GENERAL] Connect to SQL Server via ODBC from Postgresql

2008-01-08 Thread Ow Mun Heng

On Wed, 2008-01-09 at 08:41 +0200, Sim Zacks wrote:
> Another way of doing this, without dblink, is using an unsecured language 
> (plpython, for example) is to connect to the sql server using odbc and then 
> putting the data into your postgresql.

I use perl DBI to connect to both PG and MSSQL.




Re: [GENERAL] Announcing PostgreSQL RPM Buildfarm

2008-01-07 Thread Ow Mun Heng

On Mon, 2008-01-07 at 21:54 -0800, Devrim GÜNDÜZ wrote:
> Hi,
> 
> On Tue, 2008-01-08 at 13:34 +0800, Ow Mun Heng wrote:
> > finally a ptop in RPM form.
> 
> I packaged it last month, but did not announce it to public.:)

Hmm.. can I find it in the same location as the pgpool rpms?

> 
> > Devrim, can you also package up pgpool-II-2.0.1? The .spec file you
> > have in the tarball is destined for beta1 of pgpool and your website's
> > rpm is till version 1.3
> > 
> > I've hacked your .spec file a bit and gotten it installed, in anycase.
> 
> Well, I pushed 2.0.1 to Fedora-9 (rawhide). I want to test it for a few
> days, and will build it Fedora 8 early next week. For other distros:
> Yes, it will be available in the yum repository soon.

I was building it for CentOS 4. (You have no idea how screwed up it was for
me, tearing my hair out because I couldn't get it to work as advertised, and
then I found out that the version I was using, 2.0.1 (pulled from rpmforge),
was actually pgpool-I (and not pgpool-II); it just so happens to have the
same version number.)

When I _did_ find that out, it was a simple matter to get it to work.

I've already rolled the RPM into my production master for master/slave
load balancing. (feature complete for my purpose)



Re: [GENERAL] Announcing PostgreSQL RPM Buildfarm

2008-01-07 Thread Ow Mun Heng

On Mon, 2008-01-07 at 13:13 -0600, Scott Marlowe wrote:
> On Jan 7, 2008 12:57 PM, Devrim GÜNDÜZ <[EMAIL PROTECTED]> wrote:
> >
> > I want to announce PostgreSQL RPM Buildfarm today.
> 
> This is very very very cool!  Thanks you guys.

Ultra Even. No more mucking about for RPMS and finally a ptop in RPM
form.

Devrim, can you also package up pgpool-II-2.0.1? The .spec file you have
in the tarball is destined for beta1 of pgpool, and your website's rpm is
still at version 1.3.

I've hacked your .spec file a bit and gotten it installed, in any case.




Re: [GENERAL] PostgresSQL vs Ingress

2008-01-06 Thread Ow Mun Heng

On Fri, 2007-11-30 at 09:33 -0500, Andrew Sullivan wrote:
> On Fri, Nov 30, 2007 at 01:22:31PM -, Greg Sabino Mullane wrote:
> > or a scapegoat. Please don't perpetuate this urban myth. No companies are 
> > suing Oracle and Microsoft because of their products, and companies have 
> > no expectation of doing so. It might be nice if they did, and some theorize 
> 
> Indeed, by using the product, companies have explicitly given up the right
> to sue over it.  This is the main point of the EULA of most products, and is
> one of the strangest things about the computer industry.  No other industry
> can get away with producing shoddy products that endanger others, and induce
> its users to give up the right to sue in case that shoddiness causes
> problems.  

> But if you think you could sue Oracle Corp -- or even complain on a public
> list about how their software ate your data -- and win that fight, I think
> you need to have a long talk with your corporate counsel :)

Well, that was what I was led to believe in talking to some people and
to be truly honest, I've never read the EULA (not in its entirety nor
understanding its implications entirely either)

So, in that respect, I would say I was uninformed, and I take that back.

In any case, if all they want is someone to call, then by all means, it's
not a problem. But that's not what I hear/see, or rather, that's not the
item that is steering much of the decision making.

Anyway..



[GENERAL] [OT] Slony (initial) Replication - Slow

2008-01-04 Thread Ow Mun Heng
I'm just getting my feet wet with Slony, and during the setup of the slave
I did a dump and restore of the master DB to the slave DB.

However during the startup of slony, I noticed that it issues a truncate
command to the (to be) replicated table. Hence, this means that there's
no such need for me to do a dump/restore in the 1st place.

Can someone confirm this? It _is_ taking a long time (for Slony) to do the
\copy (~60GB in multiple tables being replicated, including on-the-fly
index creation).



Re: [GENERAL] [OT] Slony (initial) Replication - Slow

2008-01-03 Thread Ow Mun Heng

On Thu, 2008-01-03 at 19:17 -0500, Geoffrey wrote:
> Ow Mun Heng wrote:
> > However during the startup of slony, I noticed that it issues a truncate
> > command to the (to be) replicated table. Hence, this means that there's
> > no such need for me to do a dump/restore in the 1st place.
> > 
> This is correct.  You want an empty replication database.  When you 
> start replication slony will bring the master and slave into sync.
> 

Thanks to you and others who has responded for confirmation.
I would additionally like to know whether there is any way for me to use
the dump/restore method and have Slony pick up from where it left off.

BTW, it's working right now after some unsuccessful attempts.




[GENERAL] [OT] Slony (initial) Replication - Slow

2008-01-03 Thread Ow Mun Heng
{resend as don't see it on the list after 4 hours}

I'm just getting my feet wet with Slony, and during the setup of the slave
I did a dump and restore of the master DB to the slave DB.

However during the startup of slony, I noticed that it issues a truncate
command to the (to be) replicated table. Hence, this means that there's
no such need for me to do a dump/restore in the 1st place.

Can someone confirm this? It _is_ taking a long time (for Slony) to do the
\copy (~60GB in multiple tables being replicated, including on-the-fly
index creation).



Re: [GENERAL] Read-only availability of a standby server?

2008-01-02 Thread Ow Mun Heng

On Wed, 2007-11-21 at 15:33 -0500, Andrew Sullivan wrote:
> On Wed, Nov 21, 2007 at 12:20:51PM -0800, Garber, Mikhail wrote:
> 
> > In the high-availabilty situation with a warm standby, is it possible (or
> > planned) to be able to make standby readable?
> 
> Yes, but it won't happen for 8.3.  It's a feature you can have today with
> Slony, by the way.

Slony provides the _means_ for it to be a read-only standby server, but it
does _not_ provide a way to load-balance the queries (which is what I really
want anyway). The only way to do that is through another app like LVS or
pgpool, or even by coding it directly into the frontend app.





Re: [GENERAL] replication in Postgres

2008-01-02 Thread Ow Mun Heng

On Mon, 2007-11-26 at 12:39 -0500, Chris Browne wrote:
> [EMAIL PROTECTED] ("Jeff Larsen") writes:
> Unfortunately, the only way to make things deterministic (or to get
> from "near real time" to "*GUARANTEED* real time") is to jump to
> synchronous replication, which is not much different from 2PC (Two
> Phase Commit), and which is certain to be prohibitively expensive
> across a WAN.
> 

2PC is costly and will make things slow overall if there are ever issues
with the WAN. And to alleviate that, I believe one would have to get a
dedicated WAN line just for the syncing process. Expensive.. Can anyone
spell S-L-A??

> At this point, I tend to get visions of Tom Cruise telling Jack
> Nicholson, "I want real time replication!", and getting the response:
> "You can't HANDLE real time replication!"

Woo.. I like this movie. _best_ court scene _ever_!




[GENERAL] Howto backup all functions?

2007-12-20 Thread Ow Mun Heng
Hi,

Is there a simple way for me to back up all the functions which I've
written for a server?

Thanks
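
For reference, roughly what I'm after (a sketch only; it lists each
non-system function's signature, language and body straight from the
catalogs, rather than producing full CREATE FUNCTION statements the way
pg_dump --schema-only does):

SELECT n.nspname, p.proname,
       oidvectortypes(p.proargtypes) AS arguments,
       l.lanname AS language,
       p.prosrc  AS body
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
JOIN pg_language  l ON l.oid = p.prolang
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY n.nspname, p.proname;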

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] thank you

2007-12-18 Thread Ow Mun Heng

On Tue, 2007-12-18 at 20:12 -0700, Gregory Williamson wrote:
> Kevin H. wrote on Tue 12/18/2007 7:26 PM
> > This is aimed at everyone in this community who contributes to the
> > Postgres project, but especially at the core folks who continually
> make this community great through energy, time, money, responses, and
> > what-have-you.
> 
> <...snipped...>
> 
> > The point is that I hope you realize just how much you all mean to
> the community.
> >
> 
> +1
+1

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Need to find out which process is hitting hda

2007-12-17 Thread Ow Mun Heng

On Sun, 2007-12-16 at 16:11 -0800, Joshua D. Drake wrote:
> On Sun, 16 Dec 2007 17:55:55 -0600
> "Scott Marlowe" <[EMAIL PROTECTED]> wrote:
> 
> > On Dec 14, 2007 1:33 AM, Ow Mun Heng <[EMAIL PROTECTED]> wrote:
> > > I kept looking at the io columns and didn't even think of the swap
> > > partition. It's true that it's moving quite erratically but I won't
> > > say that it's really thrashing.
> > >
> > >              total       used       free     shared    buffers     cached
> > > Mem:           503        498          4          0          3        287
> > > -/+ buffers/cache:        207        295
> > > Swap:         2527        328       2199
> > >
> > > (YEP, I know I'm RAM starved on this machine)
> > 
> > Good lord, my laptop has more memory than that. :)
> 
> My phone has more memory than that :P

What can I say :-p
budgets are tight


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] HouseKeeping and vacuum Questions

2007-12-14 Thread Ow Mun Heng

On Fri, 2007-12-14 at 09:35 +0100, Harald Armin Massa wrote:
> Ow Mun Heng,
> 
> The current issue which prompted me to do such housekeeping is
> due to 
> long database wide vacuum time. (it went from 2 hours to 4
> hours to 7
> hours)
> 
> If vacuum takes to long, you are doing it not often enough. You should
> use autovacuum, you should be able to express the delta between two
> vacuums in seconds or minutes, not hours or days. 

Autovacuum is already turned on and the thresholds are at the default
settings.

Perhaps I need to tune it to be more aggressive? How can I do that?

A delta between two vacuums of seconds or minutes? That would be nice.
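
(A sketch of the kind of per-table tuning I have in mind, for reference: the
ALTER TABLE ... SET (autovacuum_*) form only exists in 8.4 and later, so on
an older server the realistic route is lowering
autovacuum_vacuum_scale_factor and autovacuum_naptime in postgresql.conf
instead; big_table is a placeholder name.)

-- vacuum once roughly 5% of the table is dead, instead of the default 20%
ALTER TABLE big_table SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_vacuum_threshold    = 1000
);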

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Need to find out which process is hitting hda

2007-12-13 Thread Ow Mun Heng

On Fri, 2007-12-14 at 01:54 -0500, Tom Lane wrote:
> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
> > there are a few things that I can think of that can can cause postgres
> > to cause i/o on a drive other than the data drive:
> > * logging (eliminate this by moving logs temporarily)
I'll have to try this

> > * swapping (swap is high and changing, other ways)
> > * dumps, copy statement (check cron)
Not doing any of these

> > * procedures, especially the external ones (perl, etc) that write to disk
Nope. The only Perl running is just pulling data from the master DB into
this little box.


> 
> > my seat-of-the-pants guess is that you are looking at swap.
> 
> vmstat would confirm or disprove that particular guess, since it tracks
> swap I/O separately.

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  6 300132   5684   4324 315888  420   32  1024   644 1309  485 35 11  0 54  0
 0  6 299820   6768   4328 313004  588   76  3048   576 1263  588 36 12  0 52  0
 0  6 299428   5424   4340 313700  480   36  2376   104 1291  438 24  9  0 67  0
 2  6 298836   5108   4268 313788  800    0  2312   216 1428  625 30 10  0 60  0
 2  6 298316   5692   4192 313044  876    0  1652  1608 1488  656 33 11  0 56  0
 2  6 298004   6256   4140 312184  560    4  1740  1572 1445  601 42 11  0 47  0

I kept looking at the I/O columns and didn't even think of the swap
partition. It's true that it's moving quite erratically, but I wouldn't say
that it's really thrashing.

             total       used       free     shared    buffers     cached
Mem:           503        498          4          0          3        287
-/+ buffers/cache:        207        295
Swap:         2527        328       2199

(YEP, I know I'm RAM starved on this machine)


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] HouseKeeping and vacuum Questions

2007-12-13 Thread Ow Mun Heng
I'm starting to perform some basic housekeeping to split some big tables
(~200 million rows, ~50GB plus indexes) into separate partitions (via
inheritance).

The current issue which prompted me to do such housekeeping is due to
long database wide vacuum time. (it went from 2 hours to 4 hours to 7
hours)

My current strategy is to keep only 1 month of data in the base table
and set up a job to automatically insert/delete 1 day's worth of data into
the child partition tables.

Currently, I've moved 3 such large tables into separate partitions and my
vacuum time is still long. (I suspect it's because I bulk inserted/deleted
the tuples from the main tables in one go, and the vacuum then had to clean
up the dead MVCC tuples from the past few months' worth of data.)

I'm already batching my housekeeping into 6-hour timeframes (e.g. insert
into foo_child select * from foo for hour1 to hour6).
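
Roughly what one such batch looks like (a sketch only; foo, foo_child_2007_12
and ts are placeholder names, not my actual schema):

-- one-time setup of the child partition
CREATE TABLE foo_child_2007_12 (
    CHECK (ts >= '2007-12-01' AND ts < '2008-01-01')
) INHERITS (foo);

-- one 6-hour batch: copy, then delete from the parent only
BEGIN;
INSERT INTO foo_child_2007_12
    SELECT * FROM ONLY foo
    WHERE ts >= '2007-12-13 00:00' AND ts < '2007-12-13 06:00';
DELETE FROM ONLY foo
    WHERE ts >= '2007-12-13 00:00' AND ts < '2007-12-13 06:00';
COMMIT;

-- the deletes leave dead tuples behind in the parent, hence the vacuum question below
VACUUM ANALYZE foo;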

So, my question now is:

1. Should I perform a vacuum after each 6-hour batch? or
2. Perform a vacuum after each 1-day batch (4 x 6-hour sessions)?

And what should I do with the few tables which I've not started to
partition? There are 4 months' worth of data still sitting in the
respective main tables.

Appreciate pointers etc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] Need to find out which process is hitting hda

2007-12-13 Thread Ow Mun Heng
I'm using CentOS 5 as the OS, so there's no fancy dtrace to look at which
processes are causing my disks to thrash.

I have 4 disks in the box (all IDE, 7200 RPM):

1 OS disk [hda]
2 raided (1) disks [hdb/hdc]
1 pg_xlog disk [hdd] (also used as an alternate tablespace for temp/in-transit
files: select into a tmp table, insert into footable from the tmp table, then
delete from the tmp table; see the sketch below)
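
Roughly how that staging pattern is set up (a sketch only; the directory path,
master_copy and tmp_batch are placeholder names, footable is as above):

-- the directory must already exist and be owned by the postgres user
CREATE TABLESPACE staging LOCATION '/mnt/hdd/pg_staging';

-- pull a batch into a staging table that lives on the pg_xlog disk
CREATE TABLE tmp_batch TABLESPACE staging AS
    SELECT * FROM master_copy WHERE batch_id = 42;

-- move it into the real table, then throw the staging table away
INSERT INTO footable SELECT * FROM tmp_batch;
DROP TABLE tmp_batch;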

The problem I now see from both atop and iostat (iostat -dx 10):

Device:    rrqm/s   wrqm/s     r/s    w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz    await   svctm   %util
hda         98.60    14.69  121.98  15.08  1775.02  2908.29     34.17     47.53   551.67    7.29   99.95
hdb          0.70     4.20   16.48   2.30   304.50    51.95     18.98      0.21    10.94    8.45   15.86
hdc          0.00     3.40   12.49   2.00   223.78    43.16     18.43      0.07     5.04    4.42    6.40
hdd          0.00    56.94    0.50   3.70    53.55   485.91    128.57      0.02     5.48    3.95    1.66
md0          0.00     0.00   29.57  11.89   526.67    95.10     15.00      0.00     0.00    0.00    0.00

The number of writes and reads on hda is much greater than expected, and
I'm not sure what is causing it.

Thanks for any clues.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

