Re: [PERFORM] Multicolumn order by

2006-04-19 Thread Theo Kramer
On Wed, 2006-04-19 at 01:08, Tom Lane wrote:
 Theo Kramer [EMAIL PROTECTED] writes:
  select * from mytable where
(c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or
(c1 = 'c1v' and c2 > 'c2v') or
(c1 > 'c1v')
order by c1, c2, c3;
 
 Yeah ... what you really want is the SQL-spec row comparison operator
 
 select ... where (c1,c2,c3) >= ('c1v','c2v','c3v') order by c1,c2,c3;
 
 This does not work properly in any current PG release :-( but it does
 work and is optimized well in CVS HEAD.  See eg this thread
 http://archives.postgresql.org/pgsql-hackers/2006-02/msg00209.php

That is awesome - been fighting with porting my isam-based stuff onto
sql for a long time, and the row comparison operator is exactly what I
have been looking for.

I tried this on my test system running 8.1.3 and it appears to work fine.
I would appreciate it if you could let me know in what cases it does not
work properly.
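
For anyone reading along, here are the two spellings side by side, using the
(hypothetical) table and values from this thread; an index on (c1, c2, c3)
is assumed:

```sql
-- Expanded form: works on 8.1, but the planner handles it poorly.
SELECT * FROM mytable
WHERE (c1 = 'c1v' AND c2 = 'c2v' AND c3 >= 'c3v')
   OR (c1 = 'c1v' AND c2 > 'c2v')
   OR (c1 > 'c1v')
ORDER BY c1, c2, c3;

-- SQL-spec row comparison: same result set; CVS HEAD (8.2) turns this
-- into a single index scan starting at ('c1v', 'c2v', 'c3v').
SELECT * FROM mytable
WHERE (c1, c2, c3) >= ('c1v', 'c2v', 'c3v')
ORDER BY c1, c2, c3;
```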

-- 
Regards
Theo


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] SELECT FOR UPDATE performance is bad

2006-04-19 Thread Mario Splivalo
On Tue, 2006-04-18 at 11:33 -0400, Tom Lane wrote:
 Mario Splivalo [EMAIL PROTECTED] writes:
  If there is concurrent locking,
  you're also running a big risk of deadlock because two processes might
  try to lock the same rows in different orders.
 
  I think there is no risk of a deadlock, since that particular function
  is called from the middleware (functions are used as interface to the
  database), and the lock order is always the same.
 
 No, you don't even know what the order is, let alone that it's always
 the same.

You got me confused here! :) If I have just one function that acts
as an interface to the middleware, and all the operations on the database
are done through that one function, and I carefully design that function
so that I first grab the lock, and then do the stuff, ain't I pretty sure
that I won't be having any deadlocks?

 
  Now, I just need to have serialization, I need to have clients 'line up'
  in order to perform something in the database. Actually, users are
  sending codes from the newspaper, beer-cans, Cola-cans, and stuff, and
  the database needs to check whether the code has already been played.
  Since the system is designed so that it could run multiple code-games
  (and similar codes could then exist for the coke-game and the beer-game),
  I'm using the messages table to see what code-game (i.e. service) a
  particular code belongs to.
 
 I'd suggest using a table that has exactly one row per code-game, and
 doing a SELECT FOR UPDATE on that row to establish the lock you need.
 This need not have anything to do with the tables/rows you are actually
 intending to update --- although obviously such a convention is pretty
 fragile if you have updates coming from a variety of code.  I think it's
 reasonably safe when you're funneling all the operations through a bit
 of middleware.

I tend to design my applications so I don't have flying SQL in my
java/python/c#/php/wherever code; all the database stuff is done through
functions which are designed as interfaces. Those functions are also
designed so they don't stop each other. So, since I need the
serialization, I'll do as you suggested, using a lock-table with
exactly one row per code-game.
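
A minimal sketch of Tom's suggestion (table and game id are made up for
illustration; the row lock is held until the surrounding transaction commits
or rolls back):

```sql
-- One row per code-game, created up front.
CREATE TABLE code_game_lock (game_id integer PRIMARY KEY);
INSERT INTO code_game_lock VALUES (1);
INSERT INTO code_game_lock VALUES (2);

-- Inside the middleware function: take the per-game lock first,
-- then do the real work.
BEGIN;
SELECT 1 FROM code_game_lock WHERE game_id = 1 FOR UPDATE;
-- ...check whether the code was already played, record it if not...
COMMIT;  -- releases the row lock, letting the next client in
```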

Just one more question here, it has to do with postgres internals, but
still I'd like to know why postgres is doing such huge i/o (in my log
file I see a lot of messages that say LOG:  archived transaction log
file) when performing that big FOR UPDATE.

Mario
-- 
Mario Splivalo
Mob-Art
[EMAIL PROTECTED]

I can do it quick, I can do it cheap, I can do it well. Pick any two.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Multicolumn order by

2006-04-19 Thread Theo Kramer
On Wed, 2006-04-19 at 08:00, Theo Kramer wrote:

 I tried this on my test system running 8.1.3 and it appears to work fine.
 I would appreciate it if you could let me know in what cases it does not
 work properly.

Please ignore - 'Explain is your friend' - got to look at the tips :)
-- 
Regards
Theo


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Inserts optimization?

2006-04-19 Thread Markus Schaber
Hi, Magnus,

Magnus Hagander wrote:

 Bacula already serializes access to the database (they have to support
 mysql/myisam), so this shouldn't help.

Ouch, that hurts.

To support mysql, they break performance for _every other_ database system?

<cynism>
Now, I understand how the mysql people manage to spread the legend of
mysql being fast. They convince software developers to thwart all others.
</cynism>

Seriously: How can we convince developers to either fix MySQL or abandon
it and replace it with a database, instead of crippling client software?

 Actually, it might well hurt by introducing extra delays.

Well, if you read the documentation, you will see that it will only wait
if there are at least commit_siblings other transactions active. So when
Bacula serializes access, there will be no delays, as there is only a
single transaction alive.
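
For reference, the two settings in question (the values here are
illustrative, not recommendations):

```
# postgresql.conf
commit_delay = 10000     # usec to sleep before the WAL flush at commit
commit_siblings = 5      # sleep only if at least this many other
                         # transactions are active at commit time
```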


HTH
Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Inserts optimization?

2006-04-19 Thread Magnus Hagander
  Bacula already serializes access to the database (they have 
 to support 
  mysql/myisam), so this shouldn't help.
 
 Ouch, that hurts.
 
 To support mysql, they break performance for _every other_ 
 database system?

Actually, it probably helps on SQLite as well. And considering they only
support postgresql, mysql and sqlite, there is some merit to it from
their perspective.

You can find a thread about it in the bacula archives from a month or
two back.

 <cynism>
 Now, I understand how the mysql people manage to spread the 
 legend of mysql being fast. They convince software developers 
 to thwart all others.
 </cynism>

Yes, same as the fact that most (at least FOSS) web projects-du-jour are
dumbed down to the mysql featureset. (And not just mysql, but
mysql-lowest-common-factors, which means myisam etc)


  Actually, it might well hurt by introducing extra delays.
 
 Well, if you read the documentation, you will see that it 
 will only wait if there are at least commit_siblings other 
 transactions active. So when Bacula serializes access, there 
 will be no delays, as there is only a single transaction alive.

Hm. Right. Well, it still won't help :-)

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Inserts optimization?

2006-04-19 Thread Markus Schaber
Hi, Magnus,

Magnus Hagander wrote:

To support mysql, they break performance for _every other_ 
database system?
 Actually, it probably helps on SQLite as well.

AFAICS from the FAQ http://www.sqlite.org/faq.html#q7 and #q8, SQLite
serializes itself.

 And considering they only
 support postgresql, mysql and sqlite, there is some merit to it from
 their perspective.

Okay, I understand, but I hesitate to endorse it.

IMHO, they should write their application in a normal way, and then
have the serialization etc. encapsulated in the database driver
interface (possibly a wrapper class or so).

<cynism>
Now, I understand how the mysql people manage to spread the 
legend of mysql being fast. They convince software developers 
to thwart all others.
</cynism>
 Yes, same as the fact that most (at least FOSS) web project-du-jour are
 dumbed down to the mysql featureset. (And not just mysql, but
 mysql-lowest-common-factors, which means myisam etc)

Well, most of those projects don't need a database, they need a bunch of
tables and a lock.

Heck, they even use client-side SELECT-loops in PHP instead of a JOIN
because I always confuse left and right.


Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Inserts optimization?

2006-04-19 Thread Scott Marlowe
On Wed, 2006-04-19 at 07:08, Markus Schaber wrote:
 Hi, Magnus,
 
 Magnus Hagander wrote:
 
  Bacula already serializes access to the database (they have to support
  mysql/myisam), so this shouldn't help.
 
 Ouch, that hurts.
 
 To support mysql, they break performance for _every other_ database system?

Note that should be "to support MySQL with MyISAM tables".

If they had written it for MySQL with innodb tables they would likely be
able to use the same basic methods for performance tuning MySQL as for
Oracle or PostgreSQL.

It's the refusal of people to stop using MyISAM table types that's the
real issue.

Of course, given the shaky ground MySQL is now on with Oracle owning
innodb...

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Inserts optimization?

2006-04-19 Thread PFC



Isn't MyISAM still the default over there?


Yes, it's the default.
	Personally I compile MySQL without InnoDB... and for any new development  
I use postgres.


It's hardly likely that the average MySQL user would use anything but  
the default table type ...


	Double yes ; also many hosts provide MySQL 4.0 or even 3.x, both of which  
have no subquery support and are really brain-dead ; and most OSS PHP apps  
have to be compatible... argh.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Planner doesn't chose Index - (slow select)

2006-04-19 Thread patrick keshishian
Tom,

You are absolutely correct about not having run ANALYZE
on the particular table.

In my attempt to create a simple test case I created that
table (pk_c2) from the original and had not run ANALYZE
on it, even though, ANALYZE had been run prior to building
that table.

The problem on the test table and the simple select count(*)
is no longer there (after ANALYZE).

The original issue, however, is still there. I'm stumped as to how to
formulate my question without having to write a lengthy essay.

As to upgrading from 7.4, I hear you, but I'm trying to support
a deployed product.

Thanks again for your input,
--patrick




On 4/18/06, Tom Lane [EMAIL PROTECTED] wrote:
 patrick keshishian [EMAIL PROTECTED] writes:
  I've been struggling with some performance issues with certain
  SQL queries.  I was prepping a long-ish overview of my problem
  to submit, but I think I'll start out with a simple case of the
  problem first, hopefully answers I receive will help me solve
  my initial issue.

 Have you ANALYZEd this table lately?

  db=# select count(*) from pk_c2 b0 where b0.offer_id=7141;
   count
  ---
   1
  (1 row)

 The planner is evidently estimating that there are 12109 such rows,
 not 1, which is the reason for its reluctance to use an indexscan.
 Generally the only reason for it to be off that far on such a simple
 statistical issue is if you haven't updated the stats in a long time.
 (If you've got a really skewed data distribution for offer_id, you
 might need to raise the statistics target for it.)

  The table has indexes for both 'offer_id' and '(pending=true)':

  Indexes:
  pk_boidx btree (offer_id)
  pk_bpidx btree (((pending = true)))

 The expression index on (pending = true) won't do you any good,
 unless you spell your query in a weird way like
 ... WHERE (pending = true) = true
 I'd suggest a plain index on pending instead.

  db=# select version();
   PostgreSQL 7.4.12 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.6

 You might want to think about an update, too.  7.4 is pretty long in the
 tooth.

 regards, tom lane
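
Tom's two suggestions, spelled out against the names in this thread (a
sketch only, not tested against the original schema):

```sql
-- Refresh the planner's statistics so the rowcount estimate is sane:
ANALYZE pk_c2;

-- Replace the expression index on ((pending = true)) with a plain
-- index on the column itself:
DROP INDEX pk_bpidx;
CREATE INDEX pk_bpidx ON pk_c2 (pending);
```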

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Takes too long to fetch the data from database

2006-04-19 Thread soni de


Please provide me some help regarding how I could use a cursor in the following cases:


I want to fetch 50 records at a time starting from the largest stime. 


Total no. of records in the wan table: 82019

pdb=# \d wan
              Table "wan"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 stime  | bigint                | not null
 kname  | character varying(64) |
 eid    | smallint              |
 rtpe   | smallint              |
 taddr  | character varying(16) |
 ntime  | bigint                |
Primary key: wan_pkey


stime is the primary key.

pdb=#

SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;


pdb=# explain analyze SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;

NOTICE: QUERY PLAN:

Limit  (cost=17995.15..17995.15 rows=50 width=95) (actual time=9842.92..9843.20 rows=50 loops=1)
  ->  Sort  (cost=17995.15..17995.15 rows=82016 width=95) (actual time=9364.56..9793.00 rows=81951 loops=1)
        ->  Seq Scan on wan  (cost=0.00..3281.16 rows=82016 width=95) (actual time=0.11..3906.29 rows=82019 loops=1)
Total runtime: 10010.76 msec

EXPLAIN
pdb=#


SELECT * FROM wan where kname='pluto' ORDER BY stime LIMIT 50 OFFSET 81900;


pdb=# explain analyze SELECT * from wan where kname='pluto' order by stime limit 50 offset 81900;

NOTICE: QUERY PLAN:


Limit  (cost=3494.13..3494.13 rows=1 width=95) (actual time=9512.85..9512.85 rows=0 loops=1)
  ->  Sort  (cost=3494.13..3494.13 rows=206 width=95) (actual time=9330.74..9494.90 rows=27485 loops=1)
        ->  Seq Scan on wan  (cost=0.00..3486.20 rows=206 width=95) (actual time=0.28..4951.76 rows=27485 loops=1)
Total runtime: 9636.96 msec

EXPLAIN

SELECT * FROM wan where kname='pluto' and rtpe=20 ORDER BY stime LIMIT 50 OFFSET 81900;


pdb=# explain analyze SELECT * from wan where kname='pluto' and rtpe = 20 order by stime limit 50 offset 81900;

NOTICE: QUERY PLAN:


Limit  (cost=3691.25..3691.25 rows=1 width=95) (actual time=7361.50..7361.50 rows=0 loops=1)
  ->  Sort  (cost=3691.25..3691.25 rows=1 width=95) (actual time=7361.50..7361.50 rows=0 loops=1)
        ->  Seq Scan on wan  (cost=0.00..3691.24 rows=1 width=95) (actual time=7361.30..7361.30 rows=0 loops=1)
Total runtime: 7361.71 msec

EXPLAIN
pdb=#

All the above queries take around 7~10 sec. to fetch the last 50 records. I want to reduce this time, because the table is growing; once it contains more than 1 GB of data, the above queries will take far too long.


I am not getting how to use a cursor to fetch records starting from the last record; in the above case the offset can be any number (less than the total no. of records). 


I have used the following cursor, but it takes the same time as the plain query.


BEGIN;
DECLARE crs cursor FOR SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;

FETCH ALL in crs;
CLOSE crs;
COMMIT;
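
Since stime is the primary key, the usual trick is to skip OFFSET entirely
and page by key instead (a sketch; :last_stime stands for the smallest
stime value returned by the previous page):

```sql
-- First page: the 50 largest stime values, straight off the index,
-- with no sort of all 82019 rows.
SELECT * FROM wan ORDER BY stime DESC LIMIT 50;

-- Subsequent pages: seek past the last value already shown.
SELECT * FROM wan
WHERE stime < :last_stime
ORDER BY stime DESC
LIMIT 50;
```

Each page then costs an index scan of about 50 rows instead of sorting the
whole table.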


On 4/11/06, Merlin Moncure [EMAIL PROTECTED] wrote:
 pdb=# explain analyze SELECT sdate, stime, rbts from lan WHERE (( bname = 'pluto' ) AND ( cno = 17 ) AND ( pno = 1 ) AND ( ( sdate >= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) )) ORDER BY sdate, stime;

 this query would benefit from an index on bname, cno, pno, sdate:

 create index lan_idx on lan(bname, cno, pno, sdate);

 pdb=# explain analyze SELECT ALL sdate, stime, rbts from lan WHERE (( bname = 'neptune' ) AND ( cno = 17 ) AND ( pno = 1 ) AND ( ( sdate >= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) )) ORDER BY sdate, stime;

 ditto above.  Generally, the closer the fields in the where clause are matched by the index, the more it will speed up your query.

 Merlin