Re: [GENERAL] serialization errors when inserting new records

2005-01-22 Thread Gary Doades
Ralph van Etten wrote:
Hoi,
I searched the archives but couldn't find an answer to this:
I have a table (simplified)
CREATE TABLE test (
  id   INT PRIMARY KEY,
  name VARCHAR(250)
);
I insert records with
INSERT INTO test (id, name)
SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test
Of course this gives problems when two clients are inserting a record at
the same time (duplicate primary keys). But I can't use a sequence in my
application (the pk consists of more than just a sequence)
It's not clear why you can't use a serial as the primary key or as part 
of the primary key. From your example it looks like you are trying to do 
exactly that.
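
For what it's worth, a sequence only needs to supply one column of a
composite key; the other parts stay under your control. A minimal sketch,
with invented table and column names (not from the original post):

    CREATE SEQUENCE test_id_seq;

    CREATE TABLE test (
      client_code CHAR(3),                              -- the non-sequence key part
      id          INT DEFAULT nextval('test_id_seq'),   -- the sequence-backed part
      name        VARCHAR(250),
      PRIMARY KEY (client_code, id)
    );

    -- Two clients can insert concurrently without duplicate keys:
    INSERT INTO test (client_code, name) VALUES ('ABC', 'name');

Two sessions doing this at the same time get distinct ids from the
sequence, so the duplicate-key problem goes away.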

What does your *real* primary key consist of?
Cheers,
Gary.


Re: [GENERAL] USENET vs Mailing Lists Poll ...

2004-11-29 Thread Gary Doades
Marc G. Fournier wrote:

 The WWW folks just put up a survey asking:

 If there was an official newsgroup for postgresql, would you switch to
 using Usenet from using the mailing lists?
Switch no, use in addition yes.
I subscribe to several of the mailing lists that I find most 
interesting, but I also dip into other lists such as committers and 
hackers occasionally to keep up with what's going on without being 
subject to all the email noise from those lists.

I find newsgroups very useful for quickly downloading the headers and 
picking out the subjects of interest rather than wading through emails and 
clearing out the ones I don't want to read.

I think it is important to maintain *both* forms of communication, but 
also give wider promotion of the existence and usage of the newsgroups.

It seems from what I have read so far that the main users of the mailing lists 
will not move over to newsgroups. Good for them. Hopefully, the upcoming 
release of 8.0 will generate wider interest, particularly from 
the Windows users, many of whom will look to usenet first for 
information.

Just my 2(replace with local currency) worth,
Gary.


Re: [GENERAL] Functions not dropped.

2004-11-01 Thread Gary Doades
On 1 Nov 2004 at 20:45, Glen Eustace wrote:

 When I tried to reload the production database dump, I got a series of
 errors relating to functions. I repeated the exercise but this time did
 a \df prior to trying to load the database. In the public schema for the
 newly created admin database were all the functions from the old one.
 It seems that the drop database doesn't drop all of the functions. Is it
 supposed to ? I am using 7.4.2 on FC2.
 
 If it isn't is there a way to clean them all out before the load ?
 

It's more likely that the functions are defined in the template1 database 
and so get copied to every new database. You should remove them 
from there if you don't want this behaviour.
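
For example, you can connect to template1 and clean it up before creating 
the new database. The function name and signature below are just 
placeholders:

    \c template1
    \df public.*
    -- drop whatever should not be inherited by newly created databases
    DROP FUNCTION public.some_old_function(integer, text);

Databases created after that will no longer pick up the function.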

Cheers,
Gary.




Re: [GENERAL] Win installed pgsql 8.0 beta2 dev3

2004-10-23 Thread Gary Doades
On 23 Oct 2004 at 10:42, A. Mous wrote:

 
 Note, if I install the ODBC driver separately, these characters do not show
 up.  Therefore, this seems to be an issue specific to the Win installer, no?
 

No, not as such. It is a problem with the version of the ODBC driver that 
ships with Beta2 Dev3. There were some unterminated (incorrectly 
terminated) buffer problems. I would strongly recommend that you 
update to the latest Windows beta installer, which includes a version of 
the ODBC driver that fixes this problem.

Cheers,
Gary.




Re: [GENERAL] how much ram do i give postgres?

2004-10-20 Thread Gary Doades
On 19 Oct 2004 at 17:35, Josh Close wrote:

 Well, I didn't find a whole lot in the list-archives, so I emailed
 that list whith a few more questions. My postgres server is just
 crawling right now :(
 

Unlike many other database engines, the shared buffers in Postgres are 
not a private cache of the database data. They are a working area shared 
between all the backend processes. This needs to be tuned for the number 
of connections and overall workload, *not* for the amount of your database 
that you want to keep in memory. There is still lots of debate about where 
the sweet spot is. Maybe there isn't one, but it's not normally 75% of 
RAM.

If anything, it is effective_cache_size that should be around 75% of (available) 
RAM, as this tells Postgres how much of your database the *OS* is 
likely to cache in memory.
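
As a very rough sketch only (the numbers below are illustrative assumptions 
for a box with about 1 GB of RAM, not recommendations from this thread), the 
relevant postgresql.conf entries look like:

    shared_buffers = 10000          # 8 kB pages (~80 MB): shared working area
    sort_mem = 8192                 # kB per sort operation
    effective_cache_size = 96000    # 8 kB pages (~750 MB): what the OS will cache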

Having said that, I think you will need to define "crawling". Is it 
updates/inserts that are slow? That may be triggers/rules/referential 
integrity checking etc. slowing things down. If it is selects that are slow, it 
may be missing or incorrect indexes, or sub-optimal queries. You need to show us 
what you are trying to do and what the results are.

Regards,
Gary.




Re: [GENERAL] how much ram do i give postgres?

2004-10-20 Thread Gary Doades
On 20 Oct 2004 at 11:37, Josh Close wrote:

 On Wed, 20 Oct 2004 09:52:25 -0600, Scott Marlowe [EMAIL PROTECTED] wrote:
  1: Is the bulk insert being done inside of a single transaction, or as
  individual inserts?
 
 The bulk insert is being done by COPY FROM STDIN. It copies in 100,000
 rows at a time, then disconnects, reconnects, and copies 100k more,
 and repeats 'till done. There are no indexes on the tables that the
 copy is being done into either, so it won't be slowed down by that at
 all.
 
  

What about triggers? Also constraints (check constraints, integrity 
constraints)? All of these will slow the inserts/updates down.

If you have integrity constraints, make sure you have indexes on the 
referenced columns in the referenced tables and make sure the data 
types are the same.
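
For example (table and column names invented for illustration), keep both 
sides of the foreign key the same type and index the referencing column too:

    -- child.parent_id references parent.id; both declared int4.
    -- parent.id already has an index via its primary key; indexing the
    -- referencing column speeds up the checks done when the parent changes.
    CREATE INDEX child_parent_id_idx ON child (parent_id);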

How long does 100,000 rows take to insert exactly?

How many updates are you performing each hour?

Regards,
Gary.





Re: [GENERAL] how much ram do i give postgres?

2004-10-20 Thread Gary Doades
On 20 Oct 2004 at 13:34, Josh Close wrote:

  How long does 100,000 rows take to insert exactly?
 
 I believe with the bulk inserts, 100k only takes a couple mins.
 

Hmm, that seems a bit slow. How big are the rows you are inserting? Have you checked 
the cpu and IO usage during the inserts? You will need to do some kind of cpu/IO 
monitoring to determine where the bottleneck is.

What hardware is this on? Sorry if you specified it earlier, I can't seem to find 
mention of it.

Cheers,
Gary.




Re: [GENERAL] how much ram do i give postgres?

2004-10-20 Thread Gary Doades
On 20 Oct 2004 at 14:09, Josh Close wrote:

 On Wed, 20 Oct 2004 19:59:38 +0100, Gary Doades [EMAIL PROTECTED] wrote:
  Hmm, that seems a bit slow. How big are the rows you are inserting? Have you checked
  the cpu and IO usage during the inserts? You will need to do some kind of cpu/IO
  monitoring to determine where the bottleneck is.
 
 The bulk inserts don't take full cpu. Between 40% and 80%. On the
 other hand, a select will take 99% cpu.

Is this the select(1) query? Please post an explain analyze for this and any other 
slow queries.

I would expect the selects to take 99% cpu if all the data you were trying to select 
was already in memory. Is this the case in general? I can do a select count(1) on a 
500,000 row table in about 1 second on an Athlon 2800+ if all the data is cached. It 
takes about 25 seconds if it has to fetch it from disk.
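
You can see the effect yourself by running the same count twice and comparing 
the actual times reported (the table name here is just a placeholder):

    EXPLAIN ANALYZE SELECT count(1) FROM my_big_table;  -- first run: mostly from disk
    EXPLAIN ANALYZE SELECT count(1) FROM my_big_table;  -- second run: mostly from cache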

I have just done a test by inserting (via COPY) 149,000 rows into a table with 23 
columns, mostly numeric, some int4, 4 timestamps. This took 28 seconds on my 
Windows XP desktop, Athlon 2800+, 7200 rpm SATA disk, Postgres 8.0 beta 2. It used 
around 20% to 40% cpu during the copy. The only index was the int4 primary key, 
nothing else.

How does this compare?

  What hardware is this on? Sorry if you specified it earlier, I can't seem to find 
  mention of it.
 
 It's on a P4 HT with 1,128 megs ram.

Disk system??

Regards,
Gary.




Re: [GENERAL] how much ram do i give postgres?

2004-10-20 Thread Gary Doades
On 20 Oct 2004 at 15:36, Josh Close wrote:

 On Wed, 20 Oct 2004 20:49:54 +0100, Gary Doades [EMAIL PROTECTED] wrote:
  Is this the select(1) query? Please post an explain analyze for this and any other 
  slow
  queries.
 
 I think it took so long 'cause it wasn't cached. The second time I ran
 it, it took less than a second. How can you tell if something is
 cached? Is there a way to see what's in cache?

No. The OS caches the data as read from the disk. If you need the data to be in memory 
for performance then you need to make sure you have enough available RAM to hold 
your typical result sets if possible.

 What about the postgresql.conf config settings. This is what I have and why.
 
 sort_mem = 32768
 
 This is default.

This is not the default. The default is 1000. You are telling Postgres to use 32 megs 
for *each* sort that is taking place. If you have several queries each performing 
large sorts you can quickly eat up available RAM this way. If you will only have a 
small number of concurrent queries performing sorts then this may be OK. Don't forget, 
a single query can perform more than one sort operation. If you have 10 large sorts 
happening at the same time, you can eat up to 320 megs this way!
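
One possible compromise (a sketch, not something from this thread) is to keep 
the global setting modest and raise sort_mem only in the session that really 
needs it:

    SET sort_mem = 32768;   -- 32 MB, in kB, for this session only
    -- run the big sorting query or index build here
    RESET sort_mem;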

You will need to tell us the number of updates/deletes you are having. This will 
determine the vacuum needs. If the bulk of the data is inserted you may only need to 
analyze frequently, not vacuum.
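
For instance (the table name is invented), the maintenance commands in 
question are:

    ANALYZE my_big_table;         -- cheap; just refreshes planner statistics
    VACUUM ANALYZE my_big_table;  -- also reclaims space left by updates/deletes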

In order to get more help you will need to supply the update/delete frequency and the 
explain analyze output from your queries.

Regards,
Gary.




Re: [GENERAL] removing idle connections

2004-10-19 Thread Gary Doades
On 19 Oct 2004 at 13:00, Josh Close wrote:

 Is there a way to remove idle connections? My postgres server is
 getting serveral hundred idle connections. It's due to a postgres .NET
 provider not closing the connections properly. I don't want to kill
 them all, or restart postgres everytime the connections go crazy.

I would have thought it would be better to fix the client application. If the 
app is not closing connections then you may be leaking handles and 
memory.

What .NET provider is this? Are you sure it is not just normal 
connection pooling?
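
You can see what the backends are actually doing from the stats views; a 
sketch (current_query is only populated if stats_command_string is enabled):

    SELECT procpid, usename, current_query
    FROM pg_stat_activity
    WHERE current_query = '<IDLE>';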

Cheers,
Gary.




Re: [GENERAL] removing idle connections

2004-10-19 Thread Gary Doades
On 19 Oct 2004 at 13:32, Josh Close wrote:

 The provider is corelabs. The programmer that wrote the code says he's
 closing the connections, but they aren't actually closing.
 
 Any ideas? Or better yet, do you know of a good postgres .net provider?
 

Hmm, I've had lots of problems with the Corelabs provider. The open 
source provider (Npgsql) available on GBorg is a lot better for most 
things.

I reverted to using the ODBC driver for Postgres with the .NET Odbc 
provider. Works fine.

With connection pooling enabled you will see several idle connections 
hanging around waiting to be re-used by a provider connection supplying the 
same credentials and options as a disconnected pooled-but-idle 
connection. If you connect with a different username/password 
every time then this may be a problem; otherwise it is a big performance 
gain in the disconnected recordset world of .NET.

Cheers,
Gary.



Re: [GENERAL] [HACKERS] OT moving from MS SQL to PostgreSQL

2004-10-03 Thread Gary Doades
On 3 Oct 2004 at 11:24, Scott Marlowe wrote:

 On Sun, 2004-10-03 at 06:33, stig erikson wrote:
 There are a few tools I've seen that will try to convert ASP to PHP, but
 for the most part, they can't handle very complex code, so you're
 probably better off just rewriting it and learning PHP on the way.
 
 By the way, I have moved this over to -general, as this is quite off
 topic for -hackers.  Next person to reply please remove the
 pgsql-hackers address from the CC list please.
 

Also you might want to try converting it to ASP.NET. If you use the 
mono packages you can run ASP.NET on Windows and Linux/Unix with 
very little change (if any).

I have done this just fine with some C# ASP.NET stuff using Apache, 
PostgreSQL and mod-mono. It just needed recompilation because mono 
doesn't understand the debug information that MS produces. Otherwise no 
changes.

Cheers,
Gary.




Re: [GENERAL] Slow queries in PL/PGSQL function

2004-06-08 Thread Gary Doades
It's OK, I'm an idiot.

I have read so many times that you need to get the data types in the query the same as 
the column types or indexes won't work. So I go and get it wrong!

I had defined the parameter to the function as timestamp (equivalent to SQLServer 
datetime), but the column data type is date. SQLServer does not have a date type, so 
the datetime parameters work there, but the straight conversion causes PG not to use 
the indexes.
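
The fix is just to make the parameter type match the column, or cast it 
explicitly. A sketch with an invented function name, using the booking_date 
column that appears later in this thread:

    CREATE OR REPLACE FUNCTION bookings_on(date) RETURNS bigint AS '
        SELECT count(*) FROM booking_plan WHERE booking_date = $1;
    ' LANGUAGE sql;

If the parameter has to stay timestamp, an explicit cast such as 
booking_date = $1::date keeps the index usable.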

I guess that many might be put off PG because of this without bothering to find out 
why. I think I read somewhere that there are plans to improve PG's behaviour in this 
area; is this so?


Cheers,
Gary.


On Sat, 21 Feb 2004 11:15:50 -0500, [EMAIL PROTECTED] (Tom Lane) wrote:

Gary Doades [EMAIL PROTECTED] writes:
 Is there some way to force the use of an index? Or at least get the
 backend to substitute the parameters in a function before doing the
 first query plan so it has more typical values to work with?

Could we see the EXPLAIN ANALYZE output for your problem query?
Table schemas (column data types and available indexes) are necessary
background for this type of question as well.

You might want to take the question to pgsql-performance, too ...
it's a tad off topic for -general.

   regards, tom lane



Re: [GENERAL] Slow queries in PL/PGSQL function

2004-02-28 Thread Gary Doades
 location_pkey on location  (cost=0.00..5.46 rows=1 width=18) (actual time=0.010..0.010 rows=1 loops=98)
   Index Cond: (location.location_id = outer.location_id)
   Filter: ((area_id = 1) OR (subplan))
   SubPlan
     ->  Seq Scan on wruserarea  (cost=1.46..3.44 rows=2 width=4) (never executed)
           Filter: ((uid = $6) AND ((area_id = 1) OR (area_id = $7)))
           InitPlan
             ->  Seq Scan on wruser  (cost=0.00..1.46 rows=1 width=4) (never executed)
                   Filter: ((username)::name = current_user())
 SubPlan
   ->  GroupAggregate  (cost=3.10..652.99 rows=1 width=4) (actual time=1.847..1.847 rows=1 loops=98)
         Filter: (count(contract_id) = $9)
         InitPlan
           ->  Aggregate  (cost=1.55..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                 ->  Seq Scan on search_order_reqt  (cost=0.00..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                       Filter: (search_id = $4)
         ->  Nested Loop IN Join  (cost=1.55..651.43 rows=1 width=4) (actual time=1.449..1.837 rows=1 loops=98)
               Join Filter: (outer.product_id = inner.product_id)
               ->  Nested Loop  (cost=0.00..645.69 rows=186 width=8) (actual time=0.429..1.306 rows=245 loops=98)
                     ->  Index Scan using staff_contract_pkey on staff_contract c  (cost=0.00..17.45 rows=1 width=4) (actual time=0.276..0.398 rows=1 loops=98)
                           Filter: ((staff_id = $8) AND ((avail_date_from)::timestamp without time zone = $1) AND ((avail_date_to)::timestamp without time zone = $1))
                     ->  Index Scan using staff_product_contract_id_key on staff_product p  (cost=0.00..625.85 rows=191 width=8) (actual time=0.133..0.582 rows=245 loops=98)
                           Index Cond: (p.contract_id = outer.contract_id)
               ->  Materialize  (cost=1.55..1.56 rows=1 width=4) (actual time=0.000..0.001 rows=1 loops=23972)
                     ->  Seq Scan on search_order_reqt  (cost=0.00..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                           Filter: (search_id = $4)
   ->  Aggregate  (cost=8760.23..8760.24 rows=1 width=2) (actual time=236.245..236.245 rows=1 loops=98)
         ->  Nested Loop  (cost=2009.53..8760.21 rows=9 width=2) (actual time=226.378..236.194 rows=14 loops=98)
               ->  Hash Join  (cost=2009.53..8732.97 rows=9 width=4) (actual time=226.286..235.755 rows=14 loops=98)
                     Hash Cond: (outer.booking_id = inner.booking_id)
                     ->  Seq Scan on booking_plan bp  (cost=0.00..6716.98 rows=1274 width=4) (actual time=209.684..229.684 rows=4704 loops=98)
                           Filter: (((booking_date)::timestamp without time zone >= $1) AND ((booking_date)::timestamp without time zone <= $2))
                     ->  Hash  (cost=2008.02..2008.02 rows=606 width=8) (actual time=3.357..3.357 rows=0 loops=98)
                           ->  Index Scan using staff_book_idx5 on staff_booking b  (cost=0.00..2008.02 rows=606 width=8) (actual time=0.163..3.061 rows=358 loops=98)
                                 Index Cond: (contract_id = $0)
               ->  Index Scan using order_reqt_pkey on order_reqt r  (cost=0.00..3.01 rows=1 width=6) (actual time=0.024..0.026 rows=1 loops=1326)
                     Index Cond: (outer.reqt_id = r.reqt_id)
   ->  Aggregate  (cost=3443.91..3443.91 rows=1 width=0) (actual time=4.745..4.745 rows=1 loops=98)
         ->  Nested Loop  (cost=0.00..3443.90 rows=2 width=0) (actual time=4.531..4.724 rows=3 loops=98)
               ->  Index Scan using main_order_idx on main_order  (cost=0.00..5.87 rows=2 width=4) (actual time=0.143..0.153 rows=1 loops=98)
                     Index Cond: (client_id = $3)
               ->  Index Scan using timesheet_detail_idx on timesheet_detail  (cost=0.00..1718.97 rows=3 width=4) (actual time=4.378..4.571 rows=3 loops=98)
                     Index Cond: (outer.order_id = timesheet_detail.order_id)
                     Filter: (contract_id = $0)
 Total runtime: 23853.000 ms

The only real difference I can see is the booking_plan table using a sequential scan.

Any help appreciated

Thanks,
Gary.


On Sat, 21 Feb 2004 11:15:50 -0500, [EMAIL PROTECTED] (Tom Lane) wrote:

Gary Doades [EMAIL PROTECTED] writes:
 Is there some way to force the use of an index? Or at least get the
 backend to substitute the parameters in a function before doing the
 first query plan so it has more typical values to work with?

Could we see the EXPLAIN ANALYZE output for your problem query?
Table schemas (column data types and available indexes) are necessary
background for this type of question as well.

You might want to take the question to pgsql-performance, too ...
it's a tad off topic for -general

Re: [GENERAL] Slow queries in PL/PGSQL function

2004-02-20 Thread Gary Doades
I have a very similar problem. I put the following SQL into a
function:

SELECT VS.*,VL.TEL1,SC.CONTRACT_ID,SC.CONTRACT_REF, SC.MAX_HOURS,
SC.MIN_HOURS, 
 (SELECT COUNT(*) FROM TIMESHEET_DETAIL
JOIN MAIN_ORDER ON (MAIN_ORDER.ORDER_ID = TIMESHEET_DETAIL.ORDER_ID
AND MAIN_ORDER.CLIENT_ID = $3)
 WHERE TIMESHEET_DETAIL.CONTRACT_ID = SC.CONTRACT_ID) AS VISITS,
 (SELECT (SUM(R.DURATION+1))/60.0 FROM ORDER_REQT R
 JOIN STAFF_BOOKING B ON (B.REQT_ID = R.REQT_ID)
 JOIN BOOKING_PLAN BP ON (BP.BOOKING_ID = B.BOOKING_ID)
 WHERE B.CONTRACT_ID = SC.CONTRACT_ID 
 AND BP.BOOKING_DATE BETWEEN $1 AND $2) AS RHOURS
 FROM VSTAFF VS
  JOIN STAFF_CONTRACT SC ON (SC.STAFF_ID = VS.STAFF_ID) 
 JOIN VLOCATION VL ON (VL.LOCATION_ID = VS.LOCATION_ID)
 JOIN SEARCH_REQT_RESULT SR ON (SR.STAFF_ID = VS.STAFF_ID)
WHERE SR.SEARCH_ID = $4
AND SC.CONTRACT_ID IN
(SELECT C.CONTRACT_ID FROM STAFF_PRODUCT P,STAFF_CONTRACT C
 WHERE P.CONTRACT_ID=C.CONTRACT_ID AND C.STAFF_ID = VS.STAFF_ID AND
P.PRODUCT_ID IN (SELECT PRODUCT_ID FROM SEARCH_ORDER_REQT WHERE
SEARCH_ID = $4)
 AND C.AVAIL_DATE_FROM = $1 AND C.AVAIL_DATE_TO = $2  GROUP BY
C.CONTRACT_ID
 HAVING (COUNT(C.CONTRACT_ID) = (SELECT COUNT(DISTINCT PRODUCT_ID)
FROM SEARCH_ORDER_REQT WHERE SEARCH_ID = $4)));

When executed from the client and substituting constants for the
variables ($1,$2,$3) it takes 1 second. When executed from within a
function using exactly the same values it takes 30 seconds! Using your
method to get the execution plan with variables I can see it has
switched to a sequential scan on the BOOKING_PLAN table (250K records)
instead of the index it uses otherwise. I have spent a while
optimising this and tested with quite a range of date values. It
always uses the indexes when executed from the client. The date ranges
given are usually quite small.

I am trying to migrate from MS SQLSever to PG and this has now stopped
me for a while. I could transfer all the stored procedures into client
code, but that seems very inefficient.

Is there some way to force the use of an index? Or at least get the
backend to substitute the parameters in a function before doing the
first query plan so it has more typical values to work with?

Regards,
Gary.

On Fri, 20 Feb 2004 09:20:24 -0500, [EMAIL PROTECTED] (Tom Lane)
wrote:

Richard Huxton [EMAIL PROTECTED] writes:
 On Thursday 19 February 2004 23:00, Jim Crate wrote:
 explain analyze SELECT DISTINCT i_ip
 FROM x_rbl_ips
 LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
 WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
 AND filter_ips.i_filter_ip IS NOT NULL
 AND (i_filter_type_flags & X'0200'::integer) <> X'0200'::integer

 I'm guessing that the values in your query are variables/parameters in the 
 plpgsql function? The problem is that the plan is compiled when the function 
 is first run, so it doesn't know what values you will use. You might tend to 
 use values that make sense to index, but it can't tell.

Specifically, the only part of that that looks indexable is the
dts_last_modified constraint.  If it's always dts_last_modified >
some-time-in-the-recent-past then the range of values scanned is going
to be small enough to make an indexscan worthwhile.  But if what the
planner sees is dts_last_modified > variable then it isn't gonna risk
an indexscan, because in the general case that could mean scanning a
large part of the table, and the indexscan would be tremendously slow.

What you can do to work around this (I'm assuming dts_last_modified
never contains values in the future) is add a dummy constraint:

   WHERE x_rbl_ips.dts_last_modified > variable
 AND x_rbl_ips.dts_last_modified <= now()
 AND other-stuff

Now what the planner sees is dts_last_modified being constrained to a
range, rather than an open-ended interval.  It still has to guess about
how much of the index will be scanned, but its guess in this scenario
is much smaller and it should usually pick the indexscan instead.

BTW, in recent releases you can investigate planner choices involving
queries with variables by using PREPARE and EXPLAIN EXECUTE.  For
example

   PREPARE myq(timestamptz) AS
   ...
   WHERE x_rbl_ips.dts_last_modified > $1
   ...

   EXPLAIN EXECUTE myq('2004-02-18 22:24:15.901689+00');

This allows you to exactly reproduce the conditions that the planner has
to work under when planning a query from a plpgsql function.

   regards, tom lane
