Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-28 Thread Mark Lewis
Tom Lane Wrote:
> Hmm ... I wonder how hard it would be to teach EXPLAIN ANALYZE to show
> the runtime expended in each trigger when the statement is of a kind
> that has triggers.  We couldn't break down the time *within* the
> triggers, but even this info would help a lot in terms of finger
> pointing ...
> 
>   Seq Scan on ... (nn.nnn ms)
>   Trigger foo: nn.mmm ms
>   Trigger bar: nn.mmm ms
>   Total time: nn.mmm ms


So I got the latest from CVS on Friday night to see how hard it would be
to implement this, but it turns out that Tom has already committed the
improvement, so I'm in Tom's fan club today.  I imported my test dataset
and was almost immediately able to track down the cause of my
performance problem.

Thanks!
Mark Lewis


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] [NOVICE] Many connections lingering

2005-04-13 Thread Mark Lewis
If there are potentially hundreds of clients at a time, then you may be
running into the maximum connection limit.

In postgresql.conf, there is a max_connections setting which IIRC
defaults to 100.  If you try to open more concurrent connections to the
backend than that, you will get a connection refused.
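
A quick way to double-check what the server is actually configured for is
something like this (from psql):

SHOW max_connections;
-- to raise the limit, edit max_connections in postgresql.conf and restart
-- the postmaster; a larger value may also require a higher SHMMAX setting.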

If your DB server is fairly gnarly and your performance needs are minimal,
it should be safe to increase max_connections.  An alternative approach
would be to add some kind of database broker program.  Instead of each
agent connecting directly to the database, the agents could pass their
data to a broker, which could then implement connection pooling.

-- Mark Lewis

On Tue, 2005-04-12 at 22:09, Slavisa Garic wrote:
> This is a serious problem for me as there are multiple users using our
> software on our server and I would want to avoid having connections
> open for a long time. In the scenario mentioned below I haven't
> explained the magnitude of the communications happening between Agents
> and DBServer. There could possibly be 100 or more Agents per
> experiment, per user running on remote machines at the same time,
> hence we need short transactions/pgsql connections. Agents need a
> reliable connection because failure to connect could mean a loss of
> computation results that were gathered over long periods of time.



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL

2005-05-10 Thread Mark Lewis
If the original paper was published in 1984, then it's been more than 20
years.  Any potential patents would already have expired, no?

-- Mark Lewis

On Tue, 2005-05-10 at 14:35, Mischa Sandberg wrote:
> Quoting "Jim C. Nasby" <[EMAIL PROTECTED]>:
> 
> > Well, in a hash-join right now you normally end up feeding at least
> > one
> > side of the join with a seqscan. Wouldn't it speed things up
> > considerably if you could look up hashes in the hash index instead?
> 
> You might want to google on "grace hash" and "hybrid hash".
> 
> The PG hash join is the simplest possible: build a hash table in memory,
> and match an input stream against it.
> 
> *Hybrid hash* is where you spill the hash to disk in a well-designed
> way. Instead of thinking of it as building a hash table in memory, think
> of it as partitioning one input; if some or all of it fits in memory,
> all the better. The boundary condition is the same. 
> 
> The real wizard of hybrid hash has to be Goetz Graefe, who sadly has now
> joined the MS Borg. He demonstrated that for entire-table joins, hybrid
> hash completely dominates sort-merge. MSSQL now uses what he developed
> as an academic, but I don't know what the patent state is.
> 
> "Grace hash" is the original implementation of hybrid hash:
>   Kitsuregawa, M., Tanaka, H., and Moto-oka, T. (1984).
>   Architecture and Performance of Relational Algebra Machine Grace. 
> 
> 
> 
> ---(end of broadcast)---
> TIP 3: if posting/reading through Usenet, please send an appropriate
>   subscribe-nomail command to [EMAIL PROTECTED] so that your
>   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Needed: Simplified guide to optimal memory

2005-06-16 Thread Mark Lewis
We run the RPM's for RH 7.3 on our 7.2 install base with no problems.
RPM's as recent as for PostgreSQL 7.4.2 are available here:
ftp://ftp10.us.postgresql.org/pub/postgresql/binary/v7.4.2/redhat/redhat-7.3/

Or you can always compile from source.  There isn't any such thing as a
'supported' package for RH7.2 anyway.

-- Mark Lewis


On Thu, 2005-06-16 at 07:46 -0700, Todd Landfried wrote:
> Yes, it is 7.2. Why? because an older version of our software runs on  
> RH7.3 and that was the latest supported release of Postgresql for  
> RH7.3 (that we can find). We're currently ported to 8, but we still  
> have a large installed base with the other version.
> 
> 
> On Jun 15, 2005, at 7:18 AM, Tom Lane wrote:
> 
> > Dennis Bjorklund <[EMAIL PROTECTED]> writes:
> >
> >> On Wed, 15 Jun 2005, Todd Landfried wrote:
> >>
> >>> NOTICE:  shared_buffers is 256
> >>>
> >
> >
> >> This looks like it's way too low. Try something like 2048.
> >>
> >
> > It also is evidently PG 7.2 or before; SHOW's output hasn't looked
> > like that in years.  Try a more recent release --- there's usually
> > nontrivial performance improvements in each major release.
> >
> > regards, tom lane
> >
> > ---(end of  
> > broadcast)---
> > TIP 2: you can get off all lists at once with the unregister command
> > (send "unregister YourEmailAddressHere" to  
> > [EMAIL PROTECTED])
> >
> 
> 
> ---(end of broadcast)---
> TIP 9: the planner will ignore your desire to choose an index scan if your
>   joining column's datatypes do not match


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-07-08 Thread Mark Lewis
On Fri, 2005-07-08 at 16:43 +0200, Enrico Weigelt wrote:
> * PFC <[EMAIL PROTECTED]> wrote:
> 
> 
> > For Python it's the reverse : the MySQL driver is slow and dumb, 
> > and the  postgres driver (psycopg 2) is super fast, handles all 
> > quoting, 
> > and knows  about type conversions, it will automatically convert a 
> > Python List into a  postgres Array and do the right thing with quoting, 
> > and it works both ways  (ie you select a TEXT[] you get a list of 
> > strings all parsed for you). It  knows about all the postgres types (yes 
> > even numeric <=> python Decimal)  and you can even add your own types. 
> > That's really cool, plus the  developer is a friendly guy.
> 
> Is there anything similar for java ?
> 

The postgres JDBC driver is very good-- refer to pgsql-jdbc mailing list
or look at jdbc.postgresql.org.  I've had only limited experience with
the mysql jdbc driver, but it seemed serviceable enough, if you can live
with their licensing and feature set.




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] faster INSERT with possible pre-existing row?

2005-07-26 Thread Mark Lewis
Easier and faster than doing the custom trigger is to simply define a
unique index and let the DB enforce the constraint with an index lookup,
something like:

create unique index happy_index ON happy_table(col1, col2, col3);

That should run faster than the custom trigger, but not as fast as the
temp table solution suggested elsewhere because it will need to do an
index lookup for each row.  With this solution, it is important that
your shared_buffers are set high enough that the happy_index can be kept
in memory, otherwise performance will drop precipitously.  Also, if you
are increasing the size of the table by a large percentage, you will
want to ANALYZE periodically, as an optimal plan for a small table may
be a disaster for a large table, and PostgreSQL won't switch plans
unless you run ANALYZE.
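
Roughly, the insert loop then just tries each row and ignores duplicate-key
errors.  An untested sketch, assuming an 8.0+ server (so savepoints are
available) and the example table above:

BEGIN;
-- for each candidate row:
SAVEPOINT before_row;
INSERT INTO happy_table (col1, col2, col3) VALUES (1, 2, 3);
-- if the driver reports a unique-violation error, just do:
ROLLBACK TO SAVEPOINT before_row;
-- ...next row...
COMMIT;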

-- Mark

On Tue, 2005-07-26 at 14:51 -0500, John A Meinel wrote:
> Matthew Nuzum wrote:
> > On 7/26/05, Dan Harris <[EMAIL PROTECTED]> wrote:
> > 
> >>I am working on a process that will be inserting tens of million rows
> >>and need this to be as quick as possible.
> >>
> >>The catch is that for each row I could potentially insert, I need to
> >>look and see if the relationship is already there  to prevent
> >>multiple entries.  Currently I am doing a SELECT before doing the
> >>INSERT, but I recognize the speed penalty in doing to operations.  I
> >>wonder if there is some way I can say "insert this record, only if it
> >>doesn't exist already".  To see if it exists, I would need to compare
> >>3 fields instead of just enforcing a primary key.
> > 
> > 
> > I struggled with this for a while. At first I tried stored procedures
> > and triggers, but it took very long (over 24 hours for my dataset).
> > After several iterations of rewritting it, first into C# then into
> > Python I got the whole process down to under 30 min.
> > 
> > My scenario is this:
> > I want to normalize log data. For example, for the IP address in a log
> > entry, I need to look up the unique id of the IP address, or if the IP
> > address is new, insert it and then return the newly created entry.
> > Multiple processes use the data, but only one process, run daily,
> > actually changes it. Because this one process knows that the data is
> > static, it selects the tables into in-memory hash tables (C#) or
> > Dictionaries (Python) and then does the lookups there. It is *super*
> > fast, but it uses a *lot* of ram. ;-)
> > 
> > To limit the ram, I wrote a version of the python code that uses gdbm
> > files instead of Dictionaries. This requires a newer version of Python
> > (to allow a gdbm db to work just like a dictionary) but makes life
> > easier in case someone is using my software on a lower end machine.
> > This doubled the time of the lookups from about 15 minutes to 30,
> > bringing the whole process to about 45 minutes.
> > 
> 
> Did you ever try the temp table approach? You could:
> 
> COPY all records into temp_table, with an empty row for ip_id
> -- Get any entries which already exist
> UPDATE temp_table SET ip_id =
>   (SELECT ip_id FROM ipaddress WHERE ipaddress.add = temp_table.add)
>   WHERE EXISTS (SELECT ip_id FROM ipaddress WHERE ipaddress.add = temp_table.add);
> -- Create new entries
> INSERT INTO ipaddress(add) SELECT add FROM temp_table
>WHERE ip_id IS NULL;
> -- Update the rest
> UPDATE temp_table SET ip_id =
>   (SELECT ip_id FROM ipaddress WHERE ipaddress.add = temp_table.add)
>   WHERE ip_id IS NULL AND
>   EXISTS (SELECT ip_id FROM ipaddress WHERE ipaddress.add = temp_table.add);
> 
> This would let the database do all of the updating work in bulk on it's 
> side, rather than you pulling all the data out and doing it locally.
> 
> An alternative would be something like:
> 
> CREATE TEMP TABLE new_ids (address text, ip_id int);
> COPY all potentially new addresses into that table.
> -- Delete all entries which already exist
> DELETE FROM new_ids WHERE EXISTS
>   (SELECT ip_id FROM ipaddresses
> WHERE add=new_ids.address);
> -- Now create the new entries
> INSERT INTO ipaddresses(add) SELECT address FROM new_ids;
> 
> -- At this point you are guaranteed to have all addresses existing in
> -- the database
> 
> If you then insert your full data into the final table, only leave the 
> ip_id column as null. Then if you have a partial index where ip_id is 
> NULL, you could use the command:
> 
> UPDATE final_table SET ip_id =
>   (SELECT ip_id FROM ipaddresses WHERE add=final_table.add)
> WHERE ip_id IS NULL;
> 
> You could also do this in a temporary table, before bulk inserting into 
> the final table.
> 
> I don't know what you have tried, but I know that for Dan, he easily has 
>  > 36M rows. So I don't think he wants to pull that locally and create a 
> in-memory hash just to insert 100 rows or so.
> 
> Also, for your situation, if you do keep a local cache, you could 
> certainly save the cache between runs, and use a temp table to determine 
> what new ids you need to add to it. Then you wouldn't have to pull the 
> complete set each time. You just pull new values.

Re: [PERFORM] PG8 Tuning

2005-08-11 Thread Mark Lewis
(Musing, trying to think of a general-purpose performance-tuning rule
that applies here):

Actually, it seems to me that with the addition of the WAL in PostgreSQL
and the subsequent decreased need to fsync the data files themselves
(only during checkpoints?), the only time a battery-backed write cache
would make a really large performance difference would be on the
drive(s) hosting the WAL.

So although it is in general good to have a dedicated spindle for the
WAL, for many workloads it is in fact significantly better to have the
WAL written to a battery-backed write cache.  The exception would be for
applications with fewer, larger transactions, in which case you could
actually use the dedicated spindle.

Hmmm, on second thought, now I think I understand the rationale behind
having a non-zero commit delay setting-- the problem with putting
pg_xlog on a single disk without a write cache is that frequent fsync()
calls might cause it to spend most of its time seeking instead of
writing (as seems to be happening to Paul here).  Then again, the OS IO
scheduler should take care of this for you, making this a non-issue.
Perhaps Solaris 10 just has really poor IO scheduling performance with
this particular hardware and workload?

Ah well.  Thought myself in circles and have no real conclusions to show
for it.  Posting anyway, maybe this will give somebody some ideas to
work with.

-- Mark Lewis

On Fri, 2005-08-12 at 08:47 +, Steve Poe wrote:
> Paul,
> 
> Before I say anything else, one online document which may be of
> assistance to you is:
> http://www.powerpostgresql.com/PerfList/
> 
> Some thoughts I have:
> 
> 3) Your shared RAM setting seems overkill to me. Part of the challenge
> is you're going from 1000 to 262K with no assessment in between. Each
> situation can be different, but try in the range of 10 - 50K.
> 
> 4) pg_xlog: If your pg_xlog is on a spindle that is *only* for pg_xlog,
> you're better off. If it is sharing with any other OS/DB resource, the
> performance will be impacted.
> 
> From what I have learned from others on this list, RAID5 is not the best
> choice for the database. RAID10 would be a better solution (using 8 of
> your disks) then take the remaining disk and do mirror with your pg_xlog
> if possible.
> 
> Best of luck,
> 
> Steve Poe
> 
> On Thu, 2005-08-11 at 13:23 +0100, Paul Johnson wrote:
> > Hi all, we're running PG8 on a Sun V250 with 8GB RAM and 2*1.3GHz SPARC
> > CPUs running Solaris 10. The DB cluster is on an external fibre-attached
> > Sun T3 array that has 9*36GB drives configured as a single RAID5 LUN.
> > 
> > The system is for the sole use of a couple of data warehouse developers,
> > hence we are keen to use 'aggressive' tuning options to maximise
> > performance.
> > 
> > So far we have made the following changes and measured the impact on our
> > test suite:
> > 
> > 1) Increase checkpoint_segments from 3 to 64. This made a 10x improvement
> > in some cases.
> > 
> > 2) Increase work_mem from 1,024 to 524,288.
> > 
> > 3) Increase shared_buffers from 1,000 to 262,143 (2 GB). This required
> > setting SHMMAX=4294967295 (4 GB) in /etc/system and re-booting the box.
> > 
> > Question - can Postgres only use 2GB RAM, given that shared_buffers can
> > only be set as high as 262,143 (8K pages)?
> > 
> > So far so good...
> > 
> > 4) Move /pg_xlog to an internal disk within the V250. This has had a
> > severe *negative* impact on performance. Copy job has gone from 2 mins to
> > 12 mins, simple SQL job gone from 1 min to 7 mins. Not even run long SQL
> > jobs.
> > 
> > I'm guessing that this is because pg_xlog has gone from a 9 spindle LUN to
> > a single spindle disk?
> > 
> > In cases such as this, where an external storage array with a hardware
> > RAID controller is used, the normal advice to separate the data from the
> > pg_xlog  seems to come unstuck, or are we missing something?
> > 
> > Cheers,
> > 
> > Paul Johnson.
> > 
> > ---(end of broadcast)---
> > TIP 4: Have you searched our list archives?
> > 
> >http://archives.postgresql.org
> 
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Poor performance of delete by primary key

2005-09-06 Thread Mark Lewis
I had a similar problem, so I downloaded 8.1 from CVS, ran it on a
relatively gnarly dev workstation, imported a dump of my 8.0 database,
and ran my troublesome queries with the new EXPLAIN ANALYZE.

This process took about an hour and worked great, provided that you've
actually named your foreign key constraints.  Otherwise, you'll find out
that there's a trigger for a constraint called $3 that's taking up all
of your time, but you won't know what table that constraint is on.
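
For the record, naming the constraints up front is cheap insurance.
Something like this (table and column names are made up):

ALTER TABLE child_table
  ADD CONSTRAINT child_table_parent_id_fkey
  FOREIGN KEY (parent_id) REFERENCES parent_table (id);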

-- Mark



On Tue, 2005-09-06 at 11:32 -0400, Tom Lane wrote:
> "Brian Choate" <[EMAIL PROTECTED]> writes:
> > We are seeing a very strange behavior from postgres. For one of our very
> > common tasks we have to delete records from a table of around 500,000
> > rows. The delete is by id which is the primary key. It seems to be
> > consistently taking around 10 minutes to perform. This is totally out of
> > line with the rest of the performance of the database.
> 
> I'll bet this table has foreign-key references from elsewhere, and the
> referencing columns are either not indexed, or not of the same datatype
> as the master column.
> 
> Unfortunately there's no very simple way to determine which FK is the
> problem.  (In 8.1 it'll be possible to do that with EXPLAIN ANALYZE,
> but in existing releases EXPLAIN doesn't break out the time spent in
> each trigger ...)  You have to just eyeball the schema :-(.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Mark Lewis
operations != passes.  If you were clever, you could probably write a
modified bubble-sort algorithm that only made 2 passes.  A pass is a
disk scan; operations are then performed (hopefully in memory) on what
you read from the disk.  So there's no theoretical log N lower bound on
the number of disk passes.

Not that I have anything else useful to add to this discussion, just a
tidbit I remembered from my CS classes back in college :)

-- Mark

On Fri, 2005-09-23 at 13:17 -0400, Tom Lane wrote:
> Ron Peacetree <[EMAIL PROTECTED]> writes:
> > 2= No optimal external sorting algorithm should use more than 2 passes.
> > 3= Optimal external sorting algorithms should use 1 pass if at all possible.
> 
> A comparison-based sort must use at least N log N operations, so it
> would appear to me that if you haven't got approximately log N passes
> then your algorithm doesn't work.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Is There Any Way ....

2005-10-04 Thread Mark Lewis
Which version of PG are you using?  One of the new features for 8.0 was
an improved caching algorithm that was smart enough to avoid letting a
single big query sweep everything else out of cache.

-- Mark Lewis


On Tue, 2005-10-04 at 10:45 -0400, Lane Van Ingen wrote:
> Yes, Stefan, the kind of usage you are mentioning is exactly why I was
> asking.
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Stefan Weiss
> Sent: Tuesday, October 04, 2005 6:32 AM
> To: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Is There Any Way 
> 
> 
> On 2005-09-30 01:21, Lane Van Ingen wrote:
> >   (3) Assure that a disk-based table is always in memory (outside of
> >   keeping it in memory buffers as a result of frequent activity which
> >   would prevent LRU operations from taking it out)?
> 
> I was wondering about this too. IMO it would be useful to have a way to tell
> PG that some tables were needed frequently, and should be cached if
> possible. This would allow application developers to consider joins with
> these tables as "cheap", even when querying on columns that are not indexed.
> I'm thinking about smallish tables like users, groups, *types, etc which
> would be needed every 2-3 queries, but might be swept out of RAM by one
> large query in between. Keeping a table like "users" on a RAM fs would not
> be an option, because the information is not volatile.
> 
> 
> cheers,
> stefan
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings
> 
> 
> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] browsing table with 2 million records

2005-10-26 Thread Mark Lewis
Do you have an index on the date column?  Can you post an EXPLAIN
ANALYZE for the slow query?
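
If the index is missing, something along these lines would be the first
thing to try (table and column names are just placeholders based on your
description):

CREATE INDEX mytable_date_idx ON mytable (date);
ANALYZE mytable;
EXPLAIN ANALYZE
  SELECT * FROM mytable ORDER BY date LIMIT 25 OFFSET 0;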

-- Mark Lewis

On Wed, 2005-10-26 at 13:41 -0700, aurora wrote:
> I am running Postgre 7.4 on FreeBSD. The main table have 2 million
> record (we would like to do at least 10 mil or more). It is mainly a
> FIFO structure with maybe 200,000 new records coming in each day that
> displace the older records.
> 
> We have a GUI that let user browser through the record page by page at
> about 25 records a time. (Don't ask me why but we have to have this
> GUI). This translates to something like
> 
>   select count(*) from table   <-- to give feedback about the DB size
>   select * from table order by date limit 25 offset 0
> 
> Tables seems properly indexed, with vacuum and analyze ran regularly.
> Still this very basic SQLs takes up to a minute run.
> 
> I read some recent messages that select count(*) would need a table
> scan for Postgre. That's disappointing. But I can accept an
> approximation if there are some way to do so. But how can I optimize
> select * from table order by date limit x offset y? One minute
> response time is not acceptable.
> 
> Any help would be appriciated.
> 
> Wy
> 
> 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] RES: Priority to a mission critical transaction

2006-11-29 Thread Mark Lewis
On Wed, 2006-11-29 at 08:25 -0500, Brian Hurt wrote:
...
> I have the same question.  I've done some embedded real-time 
> programming, so my innate reaction to priority inversions is that 
> they're evil.  But, especially given priority inheritance, is there any 
> situation where priority inversion provides *worse* performance than 
> running everything at the same priority?  I can easily come up with 
> situations where it devolves to that case- where all processes get 
> promoted to the same high priority.  But I can't think of one where 
> using priorities makes things worse, and I can think of plenty where it 
> makes things better.
...

It can make things worse when there are at least 3 priority levels
involved.  The canonical sequence looks as follows:

LOW:  Acquire a lock
MED:  Start a long-running batch job that hogs CPU
HIGH: Wait on lock held by LOW task

At this point, the HIGH task can't run until the LOW task releases its
lock, but the LOW task can't run to completion and release its lock
until the MED job completes.
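
To make that concrete at the SQL level, a contrived example (table name and
nice levels are hypothetical):

-- session LOW (running at a low priority, e.g. nice +19):
BEGIN;
UPDATE accounts SET balance = balance - 1 WHERE id = 42;  -- holds a row lock
-- session MED (normal priority): a long CPU-bound report query starts here
-- session HIGH (high priority):
UPDATE accounts SET balance = balance + 1 WHERE id = 42;  -- blocks on LOW's lock
-- HIGH now waits on LOW, and LOW barely gets any CPU until MED finishes.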

(random musing): I wonder if PG could efficiently be made to temporarily
raise the priority of any task holding a lock that a high priority task
waits on.  I guess that would just make it so that instead of HIGH tasks
being effectively reduced to LOW, LOW tasks would be promoted to
HIGH.

-- Mark Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] File Systems Compared

2006-12-06 Thread Mark Lewis
> Anyone run their RAIDs with disk caches enabled, or is this akin to
> having fsync off?

Disk write caches are basically always akin to having fsync off.  The
only time a write-cache is (more or less) safe to enable is when it is
backed by a battery or in some other way made non-volatile.

So a RAID controller with a battery-backed write cache can enable its
own write cache, but can't safely enable the write-caches on the disk
drives it manages.

-- Mark Lewis

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Slow update with simple query

2006-12-13 Thread Mark Lewis
But he's using 8.1.4-- in that version, an explain analyze would list
the time taken to go through triggers, and the fact that we don't see any
of those lines means that it can't be constraint checking.  So wouldn't
it have to be the index update overhead?

-- Mark

On Wed, 2006-12-13 at 11:46 -0500, Tom Lane wrote:
> Arnaud Lesauvage <[EMAIL PROTECTED]> writes:
> > Indeed, the new query does not perform that well :
> 
> > "Hash Join  (cost=112.75..307504.97 rows=2024869 width=355) (actual 
> > time=53.995..246443.811 rows=2020061 loops=1)"
> > ...
> > "Total runtime: 2777844.892 ms"
> 
> > I removed all unnecessary indexes on t1 before running the query (I left 
> > the index on uid and the multicolumn index containind the updated field).
> > I believe the multicolumn-functional-index computation is taking some time 
> > here, isn't it ?
> 
> Given that the plan itself only takes 246 sec, there's *something*
> associated with row insertion that's eating the other 2500+ seconds.
> Either index entry computation or constraint checking ...
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Slow update with simple query

2006-12-14 Thread Mark Lewis
Out of curiosity, how hard would it be to modify the output of EXPLAIN
ANALYZE when doing an insert/update to include the index update times
and/or non-FK constraint checking times and/or the table row update
times?  Or any other numbers that might be useful in circumstances like
this.  I'm wondering if it's possible to shed some light on the
remaining dark shadows of PG performance troubleshooting.

-- Mark Lewis

On Thu, 2006-12-14 at 11:19 -0500, Tom Lane wrote:
> Arnaud Lesauvage <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> It seems the time must be going into this trigger function.  What
> >> does it do?
> 
> > A lot of things! Indeed, if it runs it will very badly hurt performance
> > (table lookups, string manipulation, etc...)!
> > But it should only be triggered on INSERTs, and I am doing an UPDATE!
> 
> Doh, right, I obviously still need to ingest more caffeine this morning.
> 
> I think the conclusion must be that there was just too much I/O to be
> done to update all the rows.  Have you done any tuning of shared_buffers
> and so forth?  I recall having seen cases where update performance went
> bad as soon as the upper levels of a large index no longer fit into
> shared_buffers ...
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Mark Lewis
On Tue, 2007-02-06 at 12:01 -0500, Merlin Moncure wrote:
> On 2/6/07, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> > On Tue, 2007-02-06 at 10:40, Merlin Moncure wrote:
> > > On 2/6/07, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> > > > On Mon, 2007-02-05 at 18:35, Karen Hill wrote:
> > > > > I have a pl/pgsql function that is inserting 200,000 records for
> > > > > testing purposes.  What is the expected time frame for this operation
> > > > > on a pc with 1/2 a gig of ram and a 7200 RPM disk?   The processor is
> > > > > a 2ghz cpu.  So far I've been sitting here for about 2 million ms
> > > > > waiting for it to complete, and I'm not sure how many inserts postgres
> > > > > is doing per second.
> > > >
> > > > That really depends.  Doing 200,000 inserts as individual transactions
> > > > will be fairly slow.  Since PostgreSQL generally runs in autocommit
> > > > mode, this means that if you didn't expressly begin a transaction, you
> > > > are in fact inserting each row as a transaction.  i.e. this:
> > >
> > > I think OP is doing insertion inside a pl/pgsql loop...transaction is
> > > implied here.
> >
> > Yeah, I noticed that about 10 seconds after hitting send... :)
> 
> actually, I get the stupid award also because RI check to unindexed
> column is not possible :)  (this haunts deletes, not inserts).

Sure it's possible:

CREATE TABLE parent (col1 int4);
-- insert many millions of rows into parent
CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
-- insert many millions of rows into child, very very slowly.


- Mark Lewis



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Mark Lewis
On Tue, 2007-02-06 at 14:06 -0500, Merlin Moncure wrote:
> On 2/6/07, Mark Lewis <[EMAIL PROTECTED]> wrote:
> > > actually, I get the stupid award also because RI check to unindexed
> > > column is not possible :)  (this haunts deletes, not inserts).
> >
> > Sure it's possible:
> >
> > CREATE TABLE parent (col1 int4);
> > -- insert many millions of rows into parent
> > CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
> > -- insert many millions of rows into child, very very slowly.
> 
> the database will not allow you to create a RI link out unless the
> parent table has a primary key/unique constraint, which the database
> backs with an index, and you can't even trick it afterwards by
> dropping the constraint.
> 
> it's the other direction, when you cascade forwards when you can have
> a problem.  this is most common with a delete, but can also happen on
> an update of a table's primary key with child tables referencing it.
> 

Hmmm, should check my SQL before hitting send I guess.  Well, at least
you no longer have to wear the stupid award, Merlin :)

-- Mark Lewis

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] a question about Direct I/O and double buffering

2007-04-05 Thread Mark Lewis
Not to hijack this thread, but has anybody here tested the behavior of
PG on a file system with OS-level caching disabled via forcedirectio or
by using an inherently non-caching file system such as ocfs2?

I've been thinking about trying this setup to avoid double-caching now
that the 8.x series scales shared buffers better, but I figured I'd ask
first if anybody here had experience with similar configurations.

-- Mark

On Thu, 2007-04-05 at 13:09 -0500, Erik Jones wrote:
> On Apr 5, 2007, at 12:09 PM, Xiaoning Ding wrote:
> 
> > Hi,
> > 
> > 
> > A page may be double buffered in PG's buffer pool and in OS's buffer
> > cache.
> > Other DBMS like DB2 and Oracle has provided Direct I/O option to
> > eliminate
> > double buffering. I noticed there were discusses on the list. But
> > I can not find similar option in PG. Does PG support direct I/O now?
> > 
> > 
> > The tuning guide of PG usually recommends a small shared buffer pool
> > (compared
> > to the size of physical memory).  I think it is to avoid swapping.
> > If
> > there were
> > swapping, OS kernel may swap out some pages in PG's buffer pool even
> > PG
> > want to keep them in memory. i.e. PG would loose full control over
> > buffer pool.
> > A large buffer pool is not good because it may
> > 1. cause more pages double buffered, and thus decrease the
> > efficiency of
> > buffer
> > cache and buffer pool.
> > 2. may cause swapping.
> > Am I right?
> > 
> > 
> > If PG's buffer pool is small compared with physical memory, can I
> > say
> > that the
> > hit ratio of PG's buffer pool is not so meaningful because most
> > misses
> > can be
> > satisfied by OS Kernel's buffer cache?
> > 
> > 
> > Thanks!
> 
> 
> To the best of my knowledge, Postgres itself does not have a direct IO
> option (although it would be a good addition).  So, in order to use
> direct IO with postgres you'll need to consult your filesystem docs
> for how to set the forcedirectio mount option.  I believe it can be
> set dynamically, but if you want it to be permanent you'll to add it
> to your fstab/vfstab file.
> 
> 
> erik jones <[EMAIL PROTECTED]>
> software developer
> 615-296-0838
> emma(r)
> 
> 
> 
> 
> 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] a question about Direct I/O and double buffering

2007-04-05 Thread Mark Lewis
...
[snipped for brevity]
...
> 
> > Not to hijack this thread, but has anybody here tested the behavior
> > of
> > PG on a file system with OS-level caching disabled via forcedirectio
> > or
> > by using an inherently non-caching file system such as ocfs2?
> > 
> > 
> > I've been thinking about trying this setup to avoid double-caching
> > now
> > that the 8.x series scales shared buffers better, but I figured I'd
> > ask
> > first if anybody here had experience with similar configurations.
> > 
> > 
> > -- Mark
> 
> 
> Rather than repeat everything that was said just last week, I'll point
> out that we just had a pretty decent discussion on this last week that
> I started, so check the archives.  In summary though, if you have a
> high io transaction load with a db where the average size of your
> "working set" of data doesn't fit in memory with room to spare, then
> direct io can be a huge plus, otherwise you probably won't see much of
> a difference.  I have yet to hear of anybody actually seeing any
> degradation in the db performance from it.  In addition, while it
> doesn't bother me, I'd watch the top posting as some people get pretty
> religious about it (I moved your comments down).

I saw the thread, but my understanding from reading through it was that
you never fully tracked down the cause of the factor of 10 write volume
mismatch, so I pretty much wrote it off as a data point for
forcedirectio because of the unknowns.  Did you ever figure out the
cause of that?

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-04-27 Thread Mark Lewis
Maybe he's looking for a switch for initdb that would make it
interactive and quiz you about your expected usage-- sort of a magic
auto-configurator wizard doohickey?  I could see that sort of thing being
nice for the casual user or newbie who otherwise would have a horribly
mis-tuned database.  They could instead have only a marginally mis-tuned
database :)

On Fri, 2007-04-27 at 10:30 -0400, Michael Stone wrote:
> On Fri, Apr 27, 2007 at 09:27:49AM -0400, Carlos Moreno wrote:
> >Notice that the second part of my suggestion covers this --- have
> >additional switches to initdb so that the user can tell it about
> >estimates on how the DB will be used:  estimated size of the DB,
> >estimated percentage of activity that will involve writing, estimated
> >percentage of activity that will be transactions, percentage that will
> >use indexes, percentage of queries that will be complex, etc. etc.
> 
> If the person knows all that, why wouldn't they know to just change the 
> config parameters?
> 
> Mike Stone
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Throttling PostgreSQL's CPU usage

2007-05-08 Thread Mark Lewis
1. If you go the route of using nice, you might want to run the 3D
front-end at a higher priority instead of running PG at a lower
priority.  That way apache, php and the other parts all run at the same
priority as PG and just the one task that you want to run smoothly is
elevated.

2. You may not even need separate priorities if you're running on Linux
with a recent kernel and you enable the sleep() calls that you would
need anyway for solution #1 to work.  This is because Linux kernels are
getting pretty good nowadays about rewarding tasks with a lot of sleeps,
although there are some further kernel changes still under development
that look even more promising.

-- Mark

On Tue, 2007-05-08 at 16:27 -0400, Daniel Griscom wrote:
> I'm building a kiosk with a 3D front end accessing PostGIS/PostgreSQL 
> via Apache/PHP. The 3D display is supposed to show smooth motion from 
> location to location, with PostGIS giving dynamically updated 
> information on the locations. Everything runs on the same machine, 
> and it all works, but when I start a query the 3D display stutters 
> horribly. It looks like PostgreSQL grabs hold of the CPU and doesn't 
> let go until it's completed the query.
> 
> I don't need the PostgreSQL query to return quickly, but I must 
> retain smooth animation while the query is being processed. In other 
> words, I need PostgreSQL to spread out its CPU usage so that it 
> doesn't monopolize the CPU for any significant time (more than 50ms 
> or so).
> 
> Possible solutions:
> 
> 1: Set the PostgreSQL task priority lower than the 3D renderer task, 
> and to make sure that the 3D renderer sleep()s enough to let 
> PostgreSQL get its work done. The obvious objection to this obvious 
> solution is "Priority inversion!", but I see that as an additional 
> challenge to be surmounted rather than an absolute prohibition. So, 
> any thoughts on setting the PostgreSQL task priority (including by 
> the much-maligned tool shown at 
> )?
> 
> 2: Some variation of the Cost-Based Vacuum Delay. Hypothetically, 
> this would have the PostgreSQL task sleep() periodically while 
> processing the query, allowing the 3D renderer to continue working at 
> a reduced frame rate. My understanding, however, is that this only 
> works during VACUUM and ANALYZE commands, so it won't help during my 
> SELECT commands. So, any thoughts on using Cost-Based Vacuum Delay as 
> a Cost-Based Select Delay?
> 
> 3: ... some other solution I haven't thought of.
> 
> 
> Any thoughts, suggestions, ideas?
> 
> 
> Thanks,
> Dan
> 

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Ever Increasing IOWAIT

2007-05-18 Thread Mark Lewis
You're not swapping are you?  One explanation could be that PG is
configured to think it has access to a little more memory than the box
can really provide, which forces it to swap once it's been running for
long enough to fill up its shared buffers or after a certain number of
concurrent connections are opened.
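
A rough back-of-the-envelope check from psql (these are the usual suspects;
the arithmetic is only approximate):

SHOW shared_buffers;
SHOW max_connections;
SHOW work_mem;
-- very roughly, shared_buffers plus (max_connections * work_mem) plus OS
-- overhead should still fit in physical RAM with room to spare.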

-- Mark Lewis

On Fri, 2007-05-18 at 10:45 +1200, Ralph Mason wrote:
> We have a database running on a 4 processor machine.  As time goes by
> the IO gets worse and worse, peaking at about 200% as the machine loads
> up.
> 
>  
> 
> The weird thing is that if we restart postgres it’s fine for hours but
> over time it goes bad again.
> 
>  
> 
> (CPU usage graph here
> http://www.flickr.com/photos/[EMAIL PROTECTED]/502596262/ )  You can clearly
> see where the restart happens in the IO area
> 
>  
> 
> This is Postgres  8.1.4 64bit.
> 
>  
> 
> Anyone have any ideas?
> 
>  
> 
> Thanks
> 
> Ralph
> 

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] LIKE search and performance

2007-05-24 Thread Mark Lewis
On Thu, 2007-05-24 at 21:54 +0100, James Mansion wrote:
> > If Sybase is still like SQL Server (or the other way around), it *may*
> > end up scanning the index *IFF* the index is a clustered index. If it's
> > a normal index, it will do a sequential scan on the table.
> >
> >   
> Are you sure its not covered?  Have to check at work - but I'm off next 
> week so it'll have to wait.
> 
> > It's not a win on PostgreSQL, because of our MVCC implementation. We
> > need to scan *both* index *and* data pages if we go down that route, in
> > which case it's a lot faster to just scan the data pages alone.
> >
> >   
> Why do you need to go to all the data pages - doesn't the index 
> structure contain all the keys so
> you prefilter and then check to see if the *matched* items are still in 
> view?  I'll be first to admit I
> know zip about Postgres, but it seems odd - doesn't the index contain 
> copies of the key values?.
> 
> I suspect that I mis-spoke with 'leaf'.  I really just mean 'all index 
> pages with data', since the scan
> does not even need to be in index order, just a good way to get at the 
> data in a compact way.

PG could scan the index looking for matches first and only load the
actual rows if it found a match, but that could only be a possible win
if there were very few matches, because the difference in cost between a
full index scan and a sequential scan would need to be greater than the
cost of randomly fetching all of the matching data rows from the table
to look up the visibility information.  

So yes it would be possible, but the odds of it being faster than a
sequential scan are small enough to make it not very useful.

And since it's basically impossible to know the selectivity of this kind
of where condition, I doubt the planner would ever realistically want to
choose that plan anyway because of its poor worst-case behavior.

-- Mark

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Thousands of tables versus on table?

2007-06-04 Thread Mark Lewis
On Mon, 2007-06-04 at 13:40 -0400, Thomas Andrews wrote:
> I have several thousand clients.  Our clients do surveys, and each survey
> has two tables for the client data,
> 
>responders
>responses
> 
> Frequent inserts into both table.
> 
> Right now, we are seeing significant time during inserts to these two
> tables.

Can you provide some concrete numbers here?  Perhaps an EXPLAIN ANALYZE
for the insert, sizes of tables, stuff like that?
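
For example (column and table names here are just guesses based on the ones
mentioned in your message, and note that EXPLAIN ANALYZE really executes the
insert, so wrap it in a transaction you roll back if that matters):

BEGIN;
EXPLAIN ANALYZE
  INSERT INTO responses (id_client, id_responder, id_survey)
  VALUES (1, 2, 3);
ROLLBACK;

SELECT relname, relpages, reltuples
  FROM pg_class
 WHERE relname IN ('responders', 'responses');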

> Some of the indices in tableA and tableB do not index on the client ID
> first.

What reason do you have to think that this matters?

> So, we are considering two possible solutions.
> 
>  (1) Create separate responders and responses tables for each client.
> 
>  (2) Make sure all indices on responders and responses start with the
>   client id (excepting, possibly, the primary keys for these fields) and
>   have all normal operation queries always include an id_client.
> 
> Right now, for example, given a responder and a survey question, we do a
> query in responses by the id_responder and id_survey.  This gives us a
> unique record, but I'm wondering if maintaining the index on
> (id_responder,id_survey) is more costly on inserts than maintaining the
> index (id_client,id_responder,id_survey) given that we also have other
> indices on (id_client,...).
> 
> Option (1) makes me very nervous.  I don't like the idea of the same sorts
> of data being stored in lots of different tables, in part for long-term
> maintenance reasons.  We don't really need cross-client reporting, however.

What version of PG is this?  What is your vacuuming strategy?  Have you
tried a REINDEX to see if that helps?

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Performance query about large tables, lots of concurrent access

2007-06-19 Thread Mark Lewis
On Tue, 2007-06-19 at 09:37 -0400, Karl Wright wrote:
> Alvaro Herrera wrote:
> > Karl Wright wrote:
> > 
> >> This particular run lasted four days before a VACUUM became essential. 
> >> The symptom that indicates that VACUUM is needed seems to be that the 
> >> CPU usage of any given postgresql query skyrockets.  Is this essentially 
> >> correct?
> > 
> > Are you saying you weren't used to run VACUUM all the time?  If so,
> > that's where the problem lies.
> > 
> 
> Postgresql 7.4 VACUUM runs for so long that starting it with a cron job 
> even every 24 hours caused multiple instances of VACUUM to eventually be 
> running in my case.  So I tried to find a VACUUM schedule that permitted 
> each individual vacuum to finish before the next one started.  A vacuum 
> seemed to require 4-5 days with this particular database - or at least 
> it did for 7.4.  So I had the VACUUM schedule set to run every six days.
> 
> I will be experimenting with 8.1 to see how long it takes to complete a 
> vacuum under load conditions tonight.

The longer you wait between vacuuming, the longer each vacuum is going
to take.  

There is of course a point of diminishing returns for vacuum where this
no longer holds true; if you vacuum too frequently the overhead of
running the vacuum will dominate the running time.  But 6 days for a
busy database is probably way, way, way past that threshold.

Generally, the busier the database the more frequently you need to
vacuum, not less.  If your update/delete transaction rate is high enough
then you may need to vacuum multiple times per hour, at least on some
tables.  Playing with autovacuum might help you out here, because it can
look at how badly a vacuum is needed and adjust the vacuuming rate on
the fly on a per-table basis.  Be sure to look up some reasonable
autovacuum settings first; the 8.1 defaults aren't.
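
As a starting point, something along these lines in postgresql.conf is much
more aggressive than the stock 8.1 settings (the numbers are only
illustrative, not recommendations for your workload):

autovacuum = on                        # off by default in 8.1
autovacuum_naptime = 60                # seconds between checks
autovacuum_vacuum_threshold = 200      # default is 1000
autovacuum_vacuum_scale_factor = 0.1   # default is 0.4
autovacuum_vacuum_cost_delay = 10      # throttle so vacuum doesn't swamp I/O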

-- Mark

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Volunteer to build a configuration tool

2007-06-20 Thread Mark Lewis
On Wed, 2007-06-20 at 11:21 -0400, Greg Smith wrote:
...
> One of the things that was surprising to me when I started looking at the 
> organization of the PostgreSQL buffer cache is how little gross 
> information about its contents is available.  I kept expecting to find a 
> summary section where you could answer questions like "how much of the 
> cache currently has information about index/table X?" used as an input to 
> the optimizer.  I understand that the design model expects much of this is 
> unknowable due to the interaction with the OS cache, and in earlier 
> versions you couldn't make shared_buffers big enough for its contents to 
> be all that interesting, so until recently this wasn't worth collecting.
> 
> But in the current era, where it's feasible to have multi-GB caches 
> efficiently managed by PG and one can expect processor time is relatively 
> cheap, it seems to me one way to give a major boost to the optimizer is to 
> add some overhead to buffer cache management so it collects such 
> information.  When I was trying to do a complete overhaul on the 
> background writer, the #1 problem was that I had to assemble my own 
> statistics on what was inside the buffer cache as it was scanned, because 
> a direct inspection of every buffer is the only way to know things like 
> what percentage of the cache is currently dirty.
...

One problem with feeding the current state of the buffer cache to the
planner is that the planner may be trying to prepare a plan which will
execute 10,000 times.  For many interesting queries, the state of the
cache will be very different after the first execution, as indexes and
active portions of tables are brought in.

For that matter, an early stage of query execution could significantly
change the contents of the buffer cache as seen by a later stage of the
execution, even inside a single query.

I'm not saying that inspecting the buffer cache more is a bad idea, but
gathering useful information with the current planner is a bit tricky.

For purposes of idle speculation, one could envision some non-trivial
changes to PG which would make really slick use this data:

(1) Allow PG to defer deciding whether to perform an index scan or
sequential scan until the moment it is needed, and then ask the buffer
cache what % of the pages from the relevant indexes/tables are currently
cached.

(2) Automatically re-plan prepared queries with some kind of frequency
(exponential in # of executions?  fixed-time?), to allow the plans to
adjust to changes in the buffer-cache.

Besides being hard to build, the problem with these approaches (or any
other approach which takes current temporary state into account) is that
as much as some of us might want to make use of every piece of data
available to make the planner into a super-brain, there are lots of
other folks who just want plan stability.  The more dynamic the system
is, the less predictable it can be, and especially in mission-critical
stuff, predictability matters more than raw performance.  Tom said it really well in a
recent post, 

"To me, the worst catch-22 we face in this area is that we'd like the
optimizer's choices of plan to be stable and understandable, but the
real-world costs of queries depend enormously on short-term conditions
such as how much of the table has been sucked into RAM recently by
other queries.  I have no good answer to that one."

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Delete Cascade FK speed issue

2007-07-03 Thread Mark Lewis
On Tue, 2007-07-03 at 08:05 +0200, Patric de Waha wrote:
> Hi,
> I have a dbase with about 80 relations.
> On deleting a user, this cascades through all the tables.
> This is very slow, for 20 users it takes 4 hours, with exclusive 
> access to the dbase.
> No other users connected to the dbase.
> 
> Ok I know there will be somewhere a relation with a FK without 
> index, which
> is being scanned sequentially. But how can I find out what postgres 
> is doing
> while it is handling the transaction?
> 
> Is there a way I can find out what postgres does, and where it hangs 
> around, so I know
> where the FK might not be indexed. (The dbase is too big to analyze
> it by hand).
> 
> The  way I do it now is to check the pg_locks relation, but this is 
> not very representative.
> 
> Is there a profiling method for triggers/constraints, or a method
> which gives me a hint
> why it is taking so long?

In 8.1 and later, an EXPLAIN ANALYZE of the delete will show you the
amount of time spent in each trigger.  Remember that it will still
perform the delete, so if you want to be able to re-run the DELETE over
and over as you add missing indexes, run it in a transaction and
rollback each time.  That will tell you which foreign key constraint
checks are taking up time.  The output will not be nearly as useful if
you don't name your foreign key constraints, but is still better than
nothing.
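
In other words, something like this (a sketch; the table and id are
placeholders):

BEGIN;
EXPLAIN ANALYZE DELETE FROM users WHERE user_id = 1234;
-- per-trigger times show up at the bottom of the output
ROLLBACK;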

Alternatively, you can just dump the schema to a text file and spend 30
minutes and some text searching to reconstruct your foreign key
dependency graph rooted at the table in question and check each column
for proper indexes.  We recently did this for a 150 relation database,
it's not as painful as you seem to think it is.  An 80 relation database
is by no means "too big to analyze" :)

-- Mark Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] insert vs select into performance

2007-07-17 Thread Mark Lewis
If you're performing via JDBC, are you using addBatch/executeBatch, or
are you directly executing each insert?  If you directly execute each
insert, then your code will wait for a server round-trip between each
insert.

That still won't get you to the speed of select into, but it should
help.  You could also look at the pgsql-jdbc archives for the JDBC
driver patches which allow you to use COPY-style bulk loading, which
should get you to the performance level of COPY, which should be
reasonably close to the performance of select into.
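
For reference, the server-side form of that bulk path is plain COPY; a rough
sketch using the column list from the insert you posted below:

COPY ciu_data_type (id, loc_id, value3, value5, value8, value9, value10, value11)
  FROM STDIN;
-- ...one tab-separated line per row, terminated by a line containing only \.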

-- Mark Lewis

On Tue, 2007-07-17 at 22:50 +0200, Thomas Finneid wrote:
> 
> Michael Glaesemann wrote:
> > 
> > On Jul 17, 2007, at 14:38 , Thomas Finneid wrote:
> > 
> >> I was doing some testing on "insert" compared to "select into". I 
> >> inserted 100 000 rows (with 8 column values) into a table, which took 
> >> 14 seconds, compared to a select into, which took 0.8 seconds.
> >> (fyi, the inserts were batched, autocommit was turned off and it all
> >> happened on the local machine)
> >>
> >> Now I am wondering why the select into is that much faster?
> > 
> > It would be helpful if you included the actual queries you're using, as 
> > there are a number of variables:
> 
> create table ciu_data_type
> (
>   id  integer,
>   loc_id  integer,
>   value1  integer,
>   value2  real,
>   value3  integer,
>   value4  real,
>   value5  real,
>   value6  char(2),
>   value7  char(3),
>   value8  bigint,
>   value9  bigint,
>   value10 real,
>   value11 bigint,
>   value12 smallint,
>   value13 double precision,
>   value14 real,
>   value15 real,
>   value16 char(1),
>   value17 varchar(18),
>   value18 bigint,
>   value19 char(4)
> );
> 
> performed with JDBC
> 
> insert into ciu_data_type (id, loc_id, value3, value5, value8, value9, 
> value10, value11 ) values (?,?,?,?,?,?,?,?)
> 
> select * into ciu_data_type_copy from ciu_data_type
> 
> > 1) If there are any constraints on the original table, the INSERT will 
> > be checking those constraints. AIUI, SELECT INTO does not generate any 
> > table constraints.
> 
> No constraints in this test.
> 
> > 2a) Are you using INSERT INTO foo (foo1, foo2, foo2) SELECT foo1, foo2, 
> > foo3 FROM pre_foo or individual inserts for each row? The former would 
> > be faster than the latter.
> > 
> > 2b) If you are doing individual inserts, are you wrapping them in a 
> > transaction? The latter would be faster.
> 
> disabling autocommit, but nothing more than that
> 
> 
> I haven't done this test in a stored function yet, nor have I tried it
> with a C client so far, so there is the chance that it is java/jdbc that
> makes the insert so slow. I'll get to that test soon if there is any
> chance my theory makes sense.
> 
> regards
> 
> thomas
> 
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Optmal tags design?

2007-07-18 Thread Mark Lewis


On Wed, 2007-07-18 at 14:26 -0700, [EMAIL PROTECTED] wrote:
> I am planning to add a tags (as in the "web 2.0" thing) feature to my web
> based application. I would like some feedback from the experts here on
> what the best database design for that would be.
> 
> The possibilities I have come up with are:
> * A tags table containing the tag and id number of what it links to.
> select pid from tags where tag='bla'
> select tag from tags where pid=xxx.

Properly indexed, this schema can handle common lookups such as 'show me
all pictures with tag X'.

The problem here is that any operation involving all tags (for example,
'show me a list of all tags in the database') may be slow and/or
awkward.

> * a tags table where each tag exists only once, and a table with the tag
> ID and picture ID to link them together.

This sounds the most reasonable, and is the "right way" to do it in the
relational model.  Can handle common queries such as 'show me all
pictures with tag X'.  Can also easily perform queries such as 'show me
a list of all tags in the database'.

This also gives you a logical place to store additional information for
each tag, such as the user and timestamp of the first usage of the tag,
or a cache of the approximate number of pictures with that tag (for a
future performance optimization, maybe), or whatever else you can think
up that might be useful to store on a per-tag level.
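
A rough sketch of that layout (the pid/tag_id names follow your queries
below; the types and the extra first_used column are just illustrations):

CREATE TABLE tags (
  tag_id     serial PRIMARY KEY,
  tag        text NOT NULL UNIQUE,
  first_used timestamp NOT NULL DEFAULT now()
);

CREATE TABLE picture_tags (
  pid    integer NOT NULL,   -- the picture id used elsewhere in your schema
  tag_id integer NOT NULL REFERENCES tags (tag_id),
  PRIMARY KEY (pid, tag_id)
);

-- supports the "all pictures with tag X" join below
CREATE INDEX picture_tags_tag_id_idx ON picture_tags (tag_id);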

> select pid from tags inner join picture_tags using(tag_id) where tag='bla'
> select tag from tags inner join picture_tags using(tag_id) where pid='xxx'
> 
> * A full text index in the picture table containing the tags
> 
> select pid from pictures where tags @@ to_tsquery('bla')
> (or the non-fti version)
> select pid from pictures where tags ~* '.*bla.*'
> 
> select tags from pictures where pid=xxx;

I'm not experienced with full text indexing so perhaps I'm wrong about
this, but it seems like it would give you approximately the same
flexibility as #1 in terms of your data model.  The only reason I can
think of why you might want this over #1 would be for a performance
improvement, but if there's a reasonably small number of distinct tags
and/or distinct tags per picture I can't imagine it being much faster
than #1.

-- Mark Lewis

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] disk filling up

2007-07-26 Thread Mark Lewis
On Thu, 2007-07-26 at 09:18 -0700, Brandon Shalton wrote:
> Hello all,
> 
> My hard disk is filling up in the /base  directory  to where it has consumed 
> all 200gig of that drive.
> 
> All the posts that i see keep saying move to a bigger drive, but at some 
> point a bigger drive would just get consumed.
> 
> How can i keep the disk from filling up other than get like a half TB setup 
> just to hold the ./base/*  folder


Ummm, don't put more than 200G worth of data in there?  :)

You didn't give us any information about what you're using the database
for, why you think that using 200G is excessive, what version of the
database you're running, stuff like that.  So really there's nothing
that we can help you out with, except for the normal "read the manual
about vacuuming and make sure you're doing it" newbie answer.
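
If you just want to see where the space is actually going, and you're on
8.1 or later, something along these lines should list the largest
relations (only a sketch, not specific to your setup):

SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 20;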

-- Mark

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Performance problems with large telemetric datasets on 7.4.2

2007-08-03 Thread Mark Lewis
On Fri, 2007-08-03 at 06:52 -0700, Sven Clement wrote:
> Hello everybody,
> 
> as I'm new to this list I hope that it is the right place to post this
> and also the right format, so if I'm committing an error, I apologize
> in advance.
> 
> First the background of my request: 
> 
> I'm currently employed by an enterprise which has approx. 250 systems
> distributed worldwide which are sending telemetric data to the main
> PostgreSQL.
> The remote systems are generating about 10 events per second per
> system which accumulates to about 2500/tps. 
> The data is stored for about a month before it is exported and finally
> deleted from the database.
> On the PostgreSQL server two databases are running: one with little
> traffic (about 750K per day) and the telemetric database with heavy
> write operations all day long (over 20 million per day). 
> We already found that the VACUUM process takes excessively long and as
> a consequence the database is vacuumed permanently.
> 
> The hardware is a IBM X306m Server, 3.2 GHz HT (Pentium IV), 1 GB RAM
> and 2x 250 GB HDD (SATA-II) with ext3 fs, one of the HDD is dedicated
> to database. 
> OS is Debian 3.1 Sarge with PostgreSQL 7.4.7 (7.4.7-6sarge1) with the
> libpq frontend library.
> 
> Now the problem:
> 
> The problem we are experiencing is that our queries are slowing down
> continuously even if we are performing queries on the index which is
> the timestamp of the event, a simple SELECT query with only a simple
> WHERE clause (< or >) takes very long to complete. So the database
> becomes unusable for production use as the data has to be retrieved
> very quickly if we want to act based on the telemetric data. 
> 
> So I'm asking myself if it is useful to update to the current 8.2 version
> and if we could experience a performance improvement just by updating.
> 
> Thank you for your answers,
> Sven Clement

Upgrading from 7.4.x to 8.2.x will probably give you a performance
benefit, yes.  There have been numerous changes since the days of 7.4.

But you didn't really give any information about why the query is
running slow.  Specifically, could you provide the query itself, some
information about the tables/indexes/foreign keys involved, and an
EXPLAIN ANALYZE for one of the problematic queries?

Also, what kind of vacuuming regimen are you using?  Just a daily cron
maybe?  Are you regularly analyzing the tables?
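
For example, something along these lines would be a good starting point
(table, column and cutoff are placeholders for one of your real queries):

VACUUM VERBOSE ANALYZE your_event_table;

EXPLAIN ANALYZE
SELECT * FROM your_event_table
WHERE event_timestamp > '2007-08-01 00:00:00';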

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Postgres optimizer

2007-08-03 Thread Mark Lewis
On Fri, 2007-08-03 at 13:58 -0400, Mouhamadou Dia wrote:
> Hello,
> 
> I have a Postgres instance (version 8.1) running on a Solaris 10
> machine. When I run the following query
> 
>  
> 
> SELECT *  FROM PROR_ORG,  ( ( ( ( (PRPT_PRT LEFT OUTER JOIN
> PRPT_PRTADR ON 
> 
> PRPT_PRT.PRT_NRI = PRPT_PRTADR.PRT_NRI AND
> PRPT_PRTADR.ADR_F_DEF=true)  
> 
> LEFT OUTER JOIN PLGE_CTY ON PRPT_PRTADR.CTY_NRI = PLGE_CTY.CTY_NRI)
> LEFT 
> 
> OUTER JOIN PLGE_CTY1 PLGE_CTY_PLGE_CTY1 ON PLGE_CTY.CTY_NRI = 
> 
> PLGE_CTY_PLGE_CTY1.CTY_NRI AND PLGE_CTY_PLGE_CTY1.LNG_CD = 'fr')
> LEFT 
> 
> OUTER JOIN PLGE_CTRSD ON PRPT_PRTADR.CTRSD_CD = PLGE_CTRSD.CTRSD_CD 
> 
> AND PRPT_PRTADR.CTR_ISO_CD = PLGE_CTRSD.CTR_ISO_CD)  LEFT OUTER JOIN 
> 
> PLGE_CTR ON PRPT_PRTADR.CTR_ISO_CD = PLGE_CTR.CTR_ISO_CD) , PROR_ORG1 
> 
> PROR_ORG_PROR_ORG1, PROR_ORGT, PROR_ORGT1 PROR_ORGT_PROR_ORGT1 
> 
> WHERE (  (PROR_ORG.ORGT_CD = PROR_ORGT.ORGT_CD) AND 
> 
> (PROR_ORGT.ORGT_CD = PROR_ORGT_PROR_ORGT1.ORGT_CD AND 
> 
> PROR_ORGT_PROR_ORGT1.LNG_CD = 'fr') AND (PROR_ORG.PRT_NRI = 
> 
> PROR_ORG_PROR_ORG1.PRT_NRI AND PROR_ORG_PROR_ORG1.LNG_CD = 'fr') AND 
> 
> (PROR_ORG.PRT_NRI = PRPT_PRT.PRT_NRI) )  AND ( ((PROR_ORG.ORGT_CD
> ='CHAIN')) )
> 
>  
> 
> it takes 45 seconds to run. In this case the optimizer does a
> sequential scan of the PRPT_PRT table (which is the largest one)
> despite the existence of an index on PRT_NRI column of PRPT_PRT table.
> 
>  I’ve activated the GEQO but it still takes nearly the same time to
> run (between 40 and 45s).
> 
> When I change the order of PRPT_PRT and PROR_ORG tables, it takes only
> 30 milliseconds to run. In this case the optimizer uses the index on
> PRT_NRI column of PRPT_PRT table, what is normal and what I was
> expecting.
> 
> Is there a known problem with the Postgres optimizer?
> 
> For your information, the same query takes 20 milliseconds to run on
> Informix and 60 milliseconds to run on Oracle independently of the
> order of the tables in the query.
> 
>  
> 
> PRPT_PRT has 1.3 millions rows
> 
> PRPT_PRTADR has 300.000 rows
> 
> PROR_ORG has 1500 rows
> 
> These are the largest tables, all the others are small tables. All
> statistics are up to date.

If I recall correctly, PG 8.2 was the first version where the planner
supported reordering outer joins.  Prior releases would get poor
performance unless the joins were listed in the right order.

So it is quite possible that upgrading to 8.2 would solve your problem.
Do you have the ability to try that?

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Indexscan is only used if we use "limit n"

2007-08-15 Thread Mark Lewis
On Wed, 2007-08-15 at 16:36 -0300, Sebastián Baioni wrote:
> Hello,
> Whe are running PostgreSQL 8.2.0 on amd64-portbld-freebsd6.2, compiled
> by GCC cc (GCC) 3.4.6 [FreeBSD] 20060305.
> The query only uses the index if we have a "limit n":
> 
> Without "Limit n"
> explain
> select esapcuit, esapcuil
> from esact00 t1
> order by esapcuit, esapcuil
> 
> Sort  (cost=843833.82..853396.76 rows=3825177 width=30)
>   Sort Key: esapcuit, esapcuil
>   ->  Seq Scan on esact00 t1  (cost=0.00..111813.77 rows=3825177
> width=30)
> 
> With "Limit n"
> explain
> select esapcuit, esapcuil
> from esact00 t1
> order by esapcuit, esapcuil
> limit 1

This isn't really unexpected-- it's faster to do a full sequential scan
of a table than it is to do a full index traversal over the table.  And
usually it's still cheaper even after sorting the results of the full
table scan.

So as near as we can tell, PG is just doing what it's supposed to do and
picking the best plan it can.
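
If you're curious what the index-scan plan would cost by comparison, you
can discourage the seqscan for a single session and compare the two
plans, e.g.:

SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT esapcuit, esapcuil FROM esact00 t1 ORDER BY esapcuit, esapcuil;
SET enable_seqscan = on;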

You didn't really ask a question-- is this causing problems somehow, or
were you just confused by the behavior?

-- Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Mark Lewis
It looks like your view is using a left join to look for rows in one
table without matching rows in the other, i.e. a SQL construct similar
in form to the query below:

SELECT ...
FROM A LEFT JOIN B ON (...)
WHERE B.primary_key IS NULL

Unfortunately there has been a planner regression in 8.2 in some cases
with these forms of queries.  This was discussed a few weeks (months?)
ago on this forum.  I haven't looked closely enough to confirm that this
is the problem in your case, but it seems likely.  Is it possible to
refactor the query to avoid using this construct to see if that helps?
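
One possible refactoring, sketched on the same generic A/B shape (the
join column names are placeholders for whatever your view really uses):

SELECT ...
FROM A
WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.a_id = A.id);

Whether that actually comes out faster depends on the plan it gets, but
it at least sidesteps the construct that seems to trigger the regression.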

We've been holding back from upgrading to 8.2 because this one is a
show-stopper for us.

-- Mark Lewis

On Tue, 2007-08-28 at 11:24 -0500, Evan Carroll wrote:
> -- Forwarded message --
> From: Evan Carroll <[EMAIL PROTECTED]>
> Date: Aug 28, 2007 11:23 AM
> Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
> To: Scott Marlowe <[EMAIL PROTECTED]>
> 
> 
> On 8/28/07, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> > I looked through your query plan, and this is what stood out in the 8.2 
> > plan:
> >
> >  ->  Nested Loop Left Join  (cost=8830.30..10871.27 rows=1
> > width=102) (actual time=2148.444..236018.971 rows=62 loops=1)
> >Join Filter: ((public.contact.pkid =
> > public.contact.pkid) AND (public.event.ts_in > public.event.ts_in))
> >Filter: (public.event.pkid IS NULL)
> >
> > Notice the misestimation is by a factor of 62, and the actual time
> > goes from 2149 to 236018 ms.
> >
> > Again, have you analyzed your tables  / databases?
> >
> contacts=# \o scott_marlowe_test
> contacts=# VACUUM FULL ANALYZE;
> contacts=# SELECT * FROM test_view WHERE U_ID = 8;
> Cancel request sent
> ERROR:  canceling statement due to user request
> contacts=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE U_ID = 8;
> 
> output found at http://rafb.net/p/EQouMI82.html
> 
> --
> Evan Carroll
> System Lord of the Internets
> [EMAIL PROTECTED]
> 832-445-8877
> 
> 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] LIKE query verses =

2007-08-29 Thread Mark Lewis
On Wed, 2007-08-29 at 18:01 +0530, Karthikeyan Mahadevan wrote:
> 
> *
>  
> 1)   
> 
> EXPLAIN ANALYSE SELECT
> job_category.job_id,job.name,job.state,job.build_id,cat.name as
> reporting_group 
> FROM category,job_category,job,category as cat 
> WHERE job.job_id=job_category.job_id 
> AND job_category.category_id=category.category_id 
> AND cat.build_id=category.build_id 
> AND category.name = 'build_id.pap3260-20070828_01' 
> AND cat.name like ('reporting_group.Tier2%'); 
> 
> QUERY PLAN 
> --
>  
>  Nested Loop  (cost=0.00..291.53 rows=8 width=103) (actual
> time=98.999..385.590 rows=100 loops=1) 
>->  Nested Loop  (cost=0.00..250.12 rows=9 width=34) (actual
> time=98.854..381.106 rows=100 loops=1) 
>  ->  Nested Loop  (cost=0.00..123.22 rows=1 width=34) (actual
> time=98.717..380.185 rows=1 loops=1) 
>->  Index Scan using idx_cat_by_name on category cat
>  (cost=0.00..5.97 rows=1 width=34) (actual time=95.834..245.276
> rows=977 loops=1) 
>  Index Cond: (((name)::text >=
> 'reporting'::character varying) AND ((name)::text <
> 'reportinh'::character varying)) 
>  Filter: ((name)::text ~~
> 'reporting_group.Tier2%'::text) 
>->  Index Scan using idx_cat_by_bld_id on category
>  (cost=0.00..117.24 rows=1 width=8) (actual time=0.134..0.134 rows=0
> loops=977) 
>  Index Cond: ("outer".build_id =
> category.build_id) 
>  Filter: ((name)::text =
> 'build_id.pap3260-20070828_01'::text) 
>  ->  Index Scan using idx_jcat_by_cat_id on job_category
>  (cost=0.00..126.00 rows=71 width=8) (actual time=0.126..0.569
> rows=100 loops=1) 
>Index Cond: (job_category.category_id =
> "outer".category_id) 
>->  Index Scan using job_pkey on job  (cost=0.00..4.59 rows=1
> width=73) (actual time=0.033..0.036 rows=1 loops=100) 
>  Index Cond: (job.job_id = "outer".job_id) 
> 
>  Total runtime: 385.882 ms 
> --
>  

Remember that using LIKE causes PG to interpret an underscore as 'any
character', which means that it can only scan the index for all records
that start with 'reporting', and then it needs to apply a filter to each
match.  This is going to be slower than just going directly to the
matching index entry.

What you probably want to do is tell PG that you're looking for a
literal underscore and not for any matching character by escaping the
underscore, that will allow it to do a much quicker index scan.
Something like:

cat.name like 'reporting|_group.Tier2%' ESCAPE '|'

-- Mark Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] [Solved] Postgres performance problem

2007-08-30 Thread Mark Lewis
Perhaps you had a long-running transaction open (probably a buggy or
hung application) that was preventing dead rows from being cleaned up.
Restarting PG closed the offending connection and rolled back the
transaction, which allowed vacuum to clean up all the dead rows.
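
If it happens again, a query along these lines (assuming
stats_command_string is on) should show any connections sitting idle in
an open transaction:

SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY query_start;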

If you're not running regular VACUUMs at all but are instead exclusively
running VACUUM FULL, then I don't think you would see warnings about
running out of fsm entries, which would explain why you did not notice
the bloat.  I haven't confirmed that though, so I might be wrong.

-- Mark Lewis

On Thu, 2007-08-30 at 11:50 +0200, Ruben Rubio wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> 
> 
> Hi ...
> 
> Seems its solved. But the problem is not found.
> 
> As you may know, I do a vacuum full and a reindex database each day. I
> have logs that confirm that its done and I can check that everything was
>  fine.
> 
> So, this morning, I stopped the website, I stopped database, started it
> again. (I was around 200  days without restarting), then I vacuum
> database and reindex it (Same command as everyday) . Restart again, and
> run again the website.
> 
> Now seems its working fine. But I really does not know where is the
> problem. Seems vacuum its not working fine?  Maybe database should need
> a restart? I really don't know.
> 
> Does someone had a similar problem?
> 
> Thanks in advance,
> Ruben Rubio
> 
> 
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> 
> iD8DBQFG1pLMIo1XmbAXRboRAqgQAKCkWcZYE8RDppEVI485wDLnIW2SfQCfV+Hj
> e8PurQb2TOSYDPW545AJ83c=
> =dQgM
> -END PGP SIGNATURE-
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] 8.2 Autovacuum BUG ?

2007-08-31 Thread Mark Lewis
On Fri, 2007-08-31 at 12:25 -0400, Pallav Kalva wrote:
> Can you please correct me if I am wrong, I want to understand how this 
> works.
> Based on what you said, it will run autovacuum again when it passes 200M 
> transactions, as SELECTS are transactions too and are going on these 
> tables.
> But the next time when it runs autovacuum, it shouldnt freeze the tuples 
> again as they are already frozen and wont generate lot of archive logs ?
> Or is this because of it ran autovacuum for the first time on this db ? 
> just the first time it does this process ?

That is correct.  The tuples are now frozen, which means that they will
not need to be frozen ever again unless you insert/update any records.
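
Since this is 8.2, you can also keep an eye on how close each table is
getting to the freeze threshold with something like:

SELECT relname, age(relfrozenxid)
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;

and compare the ages against your autovacuum_freeze_max_age setting.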


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-31 Thread Mark Lewis
On Fri, 2007-08-31 at 19:39 -0400, Tom Lane wrote:
> I wrote:
> > Mark Lewis <[EMAIL PROTECTED]> writes:
> >> We've been holding back from upgrading to 8.2 because this one is a
> >> show-stopper for us.
> 
> > Well, you could always make your own version with this patch reverted:
> > http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
> > I might end up doing that in the 8.2 branch if a better solution
> > seems too large to back-patch.
> 
> I thought of a suitably small hack that should cover at least the main
> problem without going so far as to revert that patch entirely.  What we
> can do is have the IS NULL estimator recognize when the clause is being
> applied at an outer join, and not believe the table statistics in that
> case.  I've applied the attached patch for this --- are you interested
> in trying it out on your queries before 8.2.5 comes out?

Wish I could, but I'm afraid that I'm not going to be in a position to
try out the patch on the application that exhibits the problem for at
least the next few weeks.

-- Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] join tables vs. denormalization by trigger

2007-09-04 Thread Mark Lewis
On Tue, 2007-09-04 at 20:53 +0200, Walter Mauritz wrote:
> Hi,
> 
> I wonder about differences in performance between two scenarios:
> 
> Background:
> Table A, ~50,000 records
> Table B, ~3,000,000 records (~20 cols)
> Table C, ~30,000,000 records (~10 cols)
> 
> a query every 3sec. with limit 10
> 
> Table C depends on Table B wich depends on Table A, int8 foreign key, btree 
> index
> 
> * consider it a read only scenario (load data only in night, with time for 
> vacuum analyze daily)
> * im required to show records from Table C, but also with some (~5cols) info 
> from Table B 
> * where clause always contains the foreign key to Table A
> * where clause may contain further 1-10 search parameter
> 
> 
> Scenario A)
> simply inner join Table B + C
> 
> Scenario B)
> with use of trigger on insert/update I could push the required information 
> from table B down to table C.
> -> so i would only require to select from table C.
> 
> 
> My question:
> 1) From your experience ... how much faster (approximately) in percent do you 
> regard Scenario B faster than A ?

You're assuming that B is always going to be faster than A, which
certainly isn't a foregone conclusion.  Let's say that you average 10
bytes per column.  In scenario A, the total data size is then roughly
3,000,000 * 20 * 10 + 30,000,000 * 10 * 10 = 3.6 GiB.  In scenario B due
to your denormalization, the total data size is more like 30,000,000 *
30 * 10 = 9 GiB, or 2.5 times more raw data.

That's a lot of extra disk IO, unless your database will always fit in
memory in both scenarios.

Although you didn't provide enough data to answer with certainty, I
would go on the assumption that A is going to be faster than B.  But
even if it weren't, remember that premature optimization is the root of
all evil.  If you try A and it doesn't perform fast enough, then you can
always try B later to see if it works any better.

-- Mark Lewis

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] SAN vs Internal Disks

2007-09-06 Thread Mark Lewis
On Thu, 2007-09-06 at 18:05 +0530, Harsh Azad wrote:
> Hi,
> 
> We are currently running our DB on a DualCore, Dual Proc 3.Ghz Xeon,
> 8GB RAM, 4x SAS 146 GB 15K RPM on RAID 5.
> 
> The current data size is about 50GB, but we want to purchase the
> hardware to scale to about 1TB as we think our business will need to
> support that much soon. 
> - Currently we have 80% read and 20% write percentages. 
> - Currently with this configuration the Database is showing signs of
> over-loading.
> - Auto-vacuum, etc. run on this database; vacuum full runs nightly.
> - Currently CPU loads are about 20%, memory utilization is full (but
> this is also due to linux caching disk blocks) and IO waits are
> frequent.
> - We have a load of about 400 queries per second
> 
> Now we are considering to purchase our own servers and in the process
> are facing the usual dilemmas. First I'll list out what machine we
> have decided to use: 
> 2x Quad Xeon 2.4 Ghz (4-way only 2 populated right now)
> 32 GB RAM
> OS Only storage - 2x SCSI 146 GB 15k RPM on RAID-1
> (Data Storage mentioned below)
> 
> We have already decided to split our database into 3 machines on the
> basis on disjoint sets of data. So we will be purchasing three of
> these boxes. 
> 
> HELP 1: Does something look wrong with above configuration, I know
> there will be small differences b/w opetron/xeon. But do you think
> there is something against going for 2.4Ghz Quad Xeons (clovertown i
> think)?
> 
> HELP 2: The main confusion is with regards to Data Storage. We have
> the option of going for:
> 
> A: IBM N-3700 SAN Box, having 12x FC 300GB disks,  Partitioned into 3
> disks into RAID-4 for WAL/backup, and 9 disks on RAID-DP for data, 2
> hot spare. We are also considering similar solution from EMC -
> CX310C. 
> 
> B: Go for internal or DAS based storage. Here for each server we
> should be able to have: 2x disks on RAID-1 for logs, 6x disks on
> RAID-10 for tablespace1 and 6x disks on RAID-10 for tablespace2. Or
> maybe 12x disks on RAID-10 single table-space. 
> 
> What do I think? Well..
> SAN wins on manageability, replication (say to a DR site), backup,
> etc...
> DAS wins on cost
> 
> But for a moment keeping these aside, i wanted to discuss, purely on
> performance side which one is a winner? It feels like internal-disks
> will perform better, but need to understand a rough magnitude of
> difference in performance to see if it's worth losing the
> manageability features. 
> 
> Also if we choose to go with DAS, what would be the best tool to do
> async replication to DR site and maybe even as a extra plus a second
> read-only DB server to distribute select loads.

Sounds like a good candidate for Slony replication for backups /
read-only slaves.

I haven't seen a SAN yet whose DR / replication facilities are on par
with a good database replication solution.  My impression is that those
facilities are mostly for file servers, mail servers, etc.  It would be
difficult for a SAN to properly replicate a database given the strict
ordering, size and consistency requirements for the data files.  Not
impossible, but in my limited experience I haven't found one that I
trust to do it reliably either, vendor boastings to the contrary
notwithstanding.  (Hint: make sure you know exactly what your vendor's
definition of the term 'snapshot' really means).

So before you invest in a SAN, make sure that you're actually going to
be able to (and want to) use all the nice management features you're
paying for.  We have some SAN's that are basically acting just as
expensive external RAID arrays because we do the database
replication/backup in software anyway.

-- Mark Lewis



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] SAN vs Internal Disks

2007-09-06 Thread Mark Lewis
On Thu, 2007-09-06 at 22:28 +0530, Harsh Azad wrote:
> Thanks Mark.
> 
> If I replicate a snapshot of Data and log files (basically the entire
> PG data directory) and I maintain same version of postgres on both
> servers, it should work right?
> 
> I am also thinking that having SAN storage will provide me with
> facility of keeping a warm standby DB. By just shutting one server
> down and starting the other, mounting the same file system, I should be
> able to bring my DB up when the primary incurs a physical failure. 
> 
> I'm only considering SAN storage for this feature - has anyone ever
> used SAN for replication and warm standby on Postgres?
> 
> Regards,
> Harsh


We used to use a SAN for warm standby of a database, but with Oracle and
not PG.  It worked kinda sorta, except for occasional crashes due to
buggy drivers.

But after going through the exercise, we realized that we hadn't gained
anything over just doing master/slave replication between two servers,
except that it was more expensive, had a tendency to expose buggy
drivers, had a single point of failure in the SAN array, failover took
longer and we couldn't use the warm standby server to perform read-only
queries.  So we reverted back and just used the SAN as expensive DAS and
set up a separate box for DB replication.

So if that's the only reason you're considering a SAN, then I'd advise
you to spend the extra money on more DAS disks.

Maybe I'm jaded by past experiences, but the only real use case I can
see to justify a SAN for a database would be something like Oracle RAC,
but I'm not aware of any PG equivalent to that.

-- Mark Lewis

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Index files

2007-09-14 Thread Mark Lewis
On Sat, 2007-09-15 at 01:51 +0530, Harsh Azad wrote:
> Great, creating new tablespace for indexes worked! Now the question is
> whether existing tables/index can be moved to the new tablespace using
> an alter command or the only way possible is to drop and recreate
> them?

You can alter an existing index: 

http://www.postgresql.org/docs/8.2/static/sql-alterindex.html
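
For example (object and tablespace names here are just illustrative):

ALTER INDEX some_index SET TABLESPACE index_space;
ALTER TABLE some_table SET TABLESPACE data_space;

Note that these take an exclusive lock and physically copy the relation
to the new tablespace, so expect some downtime on large objects.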



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Low CPU Usage

2007-09-19 Thread Mark Lewis
On Wed, 2007-09-19 at 10:38 -0700, [EMAIL PROTECTED] wrote:
> Hi all.
> Recently I have installed a brand new server with a Pentium IV 3.2
> GHz, SATA Disk, 2GB of Ram in Debian 4.0r1 with PostgreSQL 8.2.4
> (previously a 8.1.9).
> I have other similar server with an IDE disk, Red Hat EL 4 and
> PostgreSQL 8.2.3
> 
> I have almost the same postgresql.conf in both servers, but in the new
> one (I have more work_mem than the other one) things go really slow.
> I began to monitor i/o disk and it's really ok, I have test disk with
> hdparm and it's 5 times faster than the IDE one.
> Running the same queries in both servers in the new one it envolves
> almost 4 minutes instead of 18 seconds in the old one.
> Both databases are the same, I have vacuum them and I don't know how
> to manage this issue.
> The only weird thing is than in the older server running the query it
> uses 30% of CPU instead of 3 o 5 % of the new one!!!
> What's is happening with this server? I upgrade from 8.1.9 to 8.2.4
> trying to solve this issue but I can't find a solution.
> 
> Any ideas?

It could be a planning issue.  Have you analyzed the new database to
gather up-to-date statistics?  A comparison of EXPLAIN ANALYZE results
for an example query in both systems should answer that one.

Another possibility because you're dealing with lower-end drives is that
you have a case of one drive lying about fsync where the other is not.
If possible, try running your test with fsync=off on both servers.  If
there's a marked improvement on the new server but no significant change
on the old server then you've found your culprit.
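
Concretely, something like this on both boxes would tell you a lot (the
SELECT is a placeholder for whichever of your queries is slow):

ANALYZE;                      -- refresh planner statistics
EXPLAIN ANALYZE SELECT ...;   -- compare plans and timings between servers

For the fsync test, set fsync = off in postgresql.conf on both machines
(test only -- don't leave it that way on data you care about), reload,
and repeat the timing.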

-- Mark Lewis


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] quickly getting the top N rows

2007-10-04 Thread Mark Lewis
On Thu, 2007-10-04 at 11:00 -0700, Ben wrote:
> If I have this:
> 
> create table foo (bar int primary key);
> 
> ...then in my ideal world, Postgres would be able to use that index on bar 
> to help me with this:
> 
> select bar from foo order by bar desc limit 20;
> 
> But in my experience, PG8.2 is doing a full table scan on foo, then 
> sorting it, then doing the limit. I have a more complex primary key, but I 
> was hoping the same concept would still apply. Am I doing something wrong, 
> or just expecting something that doesn't exist?

It has to do with the way that NULL values are stored in the index.
This page has details and instructions for how to get it to work:

http://developer.postgresql.org/pgdocs/postgres/indexes-ordering.html
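
(If I'm reading it right, that page is from the in-development docs, so
the per-column ordering options it describes aren't available in 8.2
itself.)  The idea is to build an index whose order matches your ORDER
BY, for example (columns are hypothetical):

CREATE INDEX foo_complex_idx ON foo (a DESC, b ASC);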

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Autovacuum running out of memory

2007-10-16 Thread Mark Lewis
On Tue, 2007-10-16 at 10:14 -0400, Jason Lustig wrote:
> I ran "ulimit -a" for the postgres user, and here's what I got:
...
> max memory size (kbytes, -m) 20
> open files  (-n) 100
> max user processes  (-u) 100
> virtual memory  (kbytes, -v) 20
...

These settings are all quite low for a dedicated database server, they
would be more appropriate for a small development instance of PG sharing
a machine with several other processes.

Others have commented on the memory settings, but depending on the
maximum number of connections you expect to have open at any time you
may want to consider increasing the max user processes and open files
settings as well.
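
If this is Linux with PAM, those limits usually come from
/etc/security/limits.conf; raising them for the postgres user would look
something like this (the numbers are only examples):

postgres  soft  nofile  8192
postgres  hard  nofile  8192
postgres  soft  nproc   512
postgres  hard  nproc   512

and you'd likewise want to raise or remove the memory/virtual memory
limits that others have already pointed out.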

-- Mark Lewis

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] help tuning queries on large database

2006-01-10 Thread Mark Lewis
Ron,

A few days back you mentioned:

> Upgrade your kernel to at least  2.6.12
> There's a known issue with earlier versions of the 2.6.x kernel and 
> 64b CPUs like the Opteron.  See kernel.org for details.
> 

I did some searching and couldn't find any obvious mention of this issue
(I gave up after searching through the first few hundred instances of
"64" in the 2.6.12 changelog).

Would you mind being a little more specific about which issue you're
talking about?  We're about to deploy some new 16GB RAM Opteron DB
servers and I'd like to check and make sure RH backported whatever the
fix was to their current RHEL4 kernel.

Thanks,
Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Extremely irregular query performance

2006-01-11 Thread Mark Lewis
If this is a query that will be executed more than once, you can also
avoid incurring the planning overhead multiple times by using PREPARE.
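
Roughly like this -- the statement name, parameter types and query are
placeholders for your real ones:

PREPARE my_big_query (integer, date) AS
    SELECT * FROM some_table WHERE some_col = $1 AND some_date > $2;

EXECUTE my_big_query(42, '2006-01-01');

The plan is built once per connection at PREPARE time, so the planning
variability only hits you once; the trade-off is that the plan can't
take the actual parameter values into account.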

-- Mark Lewis

On Wed, 2006-01-11 at 18:50 -0500, Jean-Philippe Côté wrote:
> Thanks a lot for this info, I was indeed exceeding the genetic
> optimizer's threshold.  Now that it is turned off, I get
> a very stable response time of 435ms (more or less 5ms) for
> the same query. It is about three times slower than the best
> I got with the genetic optimizer on, but the overall average
> is much lower.
> 
> I'll also try to play with the geqo parameters and see if things
> improve.
> 
> Thanks again,
> 
> J-P


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Suspending SELECTs

2006-01-16 Thread Mark Lewis
On Mon, 2006-01-16 at 11:13 +0100, Alessandro Baretta wrote:
> I am aware that what I am dreaming of is already available through cursors, 
> but 
> in a web application, cursors are bad boys, and should be avoided. What I 
> would 
> like to be able to do is to plan a query and run the plan to retreive a 
> limited 
> number of rows as well as the executor's state. This way, the burden of 
> maintaining the cursor "on hold", between activations of the web resource 
> which 
> uses it, is transferred from the DBMS to the web application server, and, 
> most 
> importantly, the responsibility for garbage-collecting stale cursors is 
> implicitely delegated to the garbage-collector of active user sessions. 
> Without 
> this mechanism, we are left with two equally unpleasant solutions: first, any 
> time a user instantiates a new session, a new cursor would have to be 
> declared 
> for all relevant queries, and an ad-hoc garbage collection daemon would have 
> to 
> be written to periodically scan the database for stale cursors to be closed; 
> otherwise, instead of using cursors, the web application could resort to 
> OFFSET-LIMIT queries--no garbage collection issues but pathetic performance 
> and 
> server-load.
> 
> Do we have any way out?
> 
> Alex

I know that Tom has pretty much ruled out any persistent cursor
implementation in the database, but here's an idea for a workaround in
the app:

Have a pool of connections used for these queries.  When a user runs a
query the first time, create a cursor and remember that this user
session is associated with that particular connection.  When the user
tries to view the next page of results, request that particular
connection from the pool and continue to use the cursor.  Between
requests, this connection could of course be used to service other
users.
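
On the SQL side, the per-user state is just an open cursor inside a
transaction on that pooled connection; a rough sketch (names and query
invented):

BEGIN;
DECLARE user_1234_results CURSOR FOR
    SELECT ... ORDER BY ...;
FETCH FORWARD 20 FROM user_1234_results;   -- first page

-- later, when the same user asks for the next page, on the same connection:
FETCH FORWARD 20 FROM user_1234_results;

-- when the session ends or times out:
CLOSE user_1234_results;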

This avoids the awfulness of tying up a connection for the entire course
of a user session, but still allows you to use cursors for
performance.  

When a user session is invalidated or times out, you remove the mapping
for this connection and close the cursor.  Whenever there are no more
mappings for a particular connection, you can use the opportunity to
close the current transaction (to prevent eternal transactions).

If the site is at all busy, you will need to implement a pooling policy
such as 'do not open new cursors on the connection with the oldest
transaction', which will ensure that all transactions can be closed in a
finite amount of time, the upper bound on the duration of a transaction
is (longest_session_duration * connections in pool).

Limitations:

1. You shouldn't do anything that acquires write locks on the database
using these connections, because the transactions will be long-running.
To mitigate this, use a separate connection pool.

2. Doesn't work well if some queries take a long time to run, because
other users may need to wait for the connection, and another connection
won't do.

3. If this is a busy web site, you might end up with potentially many
thousands of open cursors.  I don't know if this introduces an
unacceptable performance penalty or other bottleneck in the server?

-- Mark Lewis

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Suspending SELECTs

2006-01-17 Thread Mark Lewis
> I am developing my applications in Objective Caml, and I have written the 
> middleware layer myself. I could easily implement a cursor-pooling strategy, 
> but 
> there is no perfect solution to the problem of guaranteeing that cursors be 
> closed. Remember that web applications require the user to "open a session" 
> by 
> connecting the appropriate HTTP resource, but users are never required to log 
> out. Hence, in order to eventually reclaim all cursors, I must use a magical 
> "log-out detection" algorithm, which is usually implemented with a simple 
> timeout. This guarantees the required property of safety (the population of 
> cursors does not diverge) but does not guarantee the required property of 
> liveness (a user connecting to the application, who has opened a session but 
> has 
> not logged out, and thus possesses a session token, should have access the 
> execution context identified by his token).

I fail to see the problem here.  Why should "liveness" be a required
property?  If it is simply that you can't promptly detect when a user is
finished with their web session so you can free resources, then remember
that there is no requirement that you dedicate a connection to their
session in the first place.  Even if you're using your own custom
middleware, it isn't a very complicated or conceptually difficult thing
to implement (see my previous post).  Certainly it's simpler than
allowing clients to pass around runtime state.

As far as implementing this sort of thing in the back-end, it would be
really hard with the PostgreSQL versioning model.  Oracle can more
easily (and kind of does) support cursors like you've described because
they implement MVCC differently than PostgreSQL, and in their
implementation you're guaranteed that you always have access to the most
recent x megabytes of historical rows, so even without an open
transaction to keep the required rows around you can still be relatively
sure they'll be around for "long enough".  In PostgreSQL, historical
rows are kept in the tables themselves and periodically vacuumed, so
there is no such guarantee, which means that you would need to either
implement a lot of complex locking for little material gain, or just
hold the cursors in moderately long-running transactions, which leads
back to the solution suggested earlier.

-- Mark Lewis



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Query planner issue

2006-01-30 Thread Mark Lewis
You have lots of dead rows.  Do a vacuum full to get it under control,
then run VACUUM more frequently and/or increase your FSM settings to
keep dead rows in check.  In 7.2 vacuum is pretty intrusive; it will be
much better behaved once you can upgrade to a more recent version.
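
Something along these lines (the FSM values are only examples -- size
them to your data):

VACUUM FULL VERBOSE ANALYZE mandats;

and in postgresql.conf:

max_fsm_relations = 1000
max_fsm_pages = 100000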

You really, really want to upgrade as soon as possible, and refer to the
on-line docs about what to do with your FSM settings.

-- Mark Lewis


On Mon, 2006-01-30 at 23:57 +0100, Emmanuel Lacour wrote:
> Hi everybody,
> 
> I have the following problem, on a test server, if I do a fresh import
> of production data then run 
> 'explain analyze select count(*) from mandats;'
> 
> I get this result:
> 
> Aggregate  (cost=6487.32..6487.32 rows=1 width=0) (actual time=607.61..607.61 
> rows=1 loops=1)
>   ->  Seq Scan on mandats  (cost=0.00..6373.26 rows=45626 width=0) (actual 
> time=0.14..496.20 rows=45626 loops=1)
>   Total runtime: 607.95 msec
> 
> 
> On the production server, if I do the same (without other use of the server), 
> I get:
> 
> Aggregate  (cost=227554.33..227554.33 rows=1 width=0) (actual 
> time=230705.79..230705.79 rows=1 loops=1)
>   ->  Seq Scan on mandats  (cost=0.00..227440.26 rows=45626 width=0) (actual 
> time=0.03..230616.64 rows=45760 loops=1)
>   Total runtime: 230706.08 msec
> 
> 
> 
> Does anyone have an idea on how to solve this poor performance? I
> think it is caused by the many deletes/inserts on this table every day, but
> how to solve it? I need to run this query each hour :(. I run
> vacuum each night; postgresql is unfortunately 7.2.1 :( (no upgrade
> before 2 or 3 months).
> 

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Mark Lewis
Machine 1: $2000
Machine 2: $2000
Machine 3: $2000

Knowing how to rig them together and maintain them in a fully fault-
tolerant way: priceless.


(Sorry for the off-topic post, I couldn't resist).

-- Mark Lewis

On Wed, 2006-02-15 at 09:19 -0800, Craig A. James wrote:
> Jeremy Haile wrote:
> > We are a small company looking to put together the most cost effective
> > solution for our production database environment.  Currently in
> > production Postgres 8.1 is running on this machine:
> > 
> > Dell 2850
> > 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
> > 4 GB DDR2 400 Mhz
> > 2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
> > 4 x 146 GB 10K SCSI RAID 10 (for postgres data)
> > Perc4ei controller
> > 
> > ... I sent our scenario to our sales team at Dell and they came back with
> > all manner of SAN, DAS, and configuration costing as much as $50k.
> 
> Given what you've told us, a $50K machine is not appropriate.
> 
> Instead, think about a simple system with several clones of the database and 
> a load-balancing web server, even if one machine could handle your load.  If 
> a machine goes down, the load balancer automatically switches to the other.
> 
> Look at the MTBF figures of two hypothetical machines:
> 
>  Machine 1: Costs $2,000, MTBF of 2 years, takes two days to fix on average.
>  Machine 2: Costs $50,000, MTBF of 100 years (!), takes one hour to fix on 
> average.
> 
> Now go out and buy three of the $2,000 machines.  Use a load-balancer front 
> end web server that can send requests round-robin fashion to a "server farm". 
>  Clone your database.  In fact, clone the load-balancer too so that all three 
> machines have all software and databases installed.  Call these A, B, and C 
> machines.
> 
> At any given time, your Machine A is your web front end, serving requests to 
> databases on A, B and C.  If B or C goes down, no problem - the system keeps 
> running.  If A goes down, you switch the IP address of B or C and make it 
> your web front end, and you're back in business in a few minutes.
> 
> Now compare the reliability -- in order for this system to be disabled, you'd 
> have to have ALL THREE computers fail at the same time.  With the MTBF and 
> repair time of two days, each machine has a 99.726% uptime.  The "MTBF", that 
> is, the expected time until all three machines will fail simultaneously, is 
> well over 100,000 years!  Of course, this is silly, machines don't last that 
> long, but it illustrates the point:  Redundancy beats reliability (which 
> is why RAID is so useful). 
> 
> All for $6,000.
> 
> Craig
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Mark Lewis
On Thu, 2006-02-16 at 12:15 -0500, Tom Lane wrote:
> Once or twice we've kicked around the idea of having some
> datatype-specific sorting code paths alongside the general-purpose one,
> but I can't honestly see this as being workable from a code maintenance
> standpoint.
> 
>   regards, tom lane


It seems that instead of maintaining a different sorting code path for
each data type, you could get away with one generic path and one
(hopefully faster) path if you allowed data types to optionally support
a 'sortKey' interface by providing a function f which maps inputs to 32-
bit int outputs, such that the following two properties hold:

if a>=b then f(a)>=f(b)
if a==b then f(a)==f(b)

So if a data type supports the sortKey interface you could perform the
sort on f(value) and only refer back to the actual element comparison
functions when two sortKeys have the same value.

Data types which could probably provide a useful function for f would be
int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII).

Depending on the overhead, you might not even need to maintain 2
independent search code paths, since you could always use f(x)=0 as the
default sortKey function which would degenerate to the exact same sort
behavior in use today.

-- Mark Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Mark Lewis
On Thu, 2006-02-16 at 17:51 -0500, Greg Stark wrote:
> > > Data types which could probably provide a useful function for f would be
> > > int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII).
> 
> How exactly do you imagine doing this for text?
> 
> I could see doing it for char(n)/varchar(n) where n<=4 in SQL_ASCII though.


In SQL_ASCII, just take the first 4 characters (or 8, if using a 64-bit
sortKey as elsewhere suggested).  The sorting key doesn't need to be a
one-to-one mapping.

-- Mark Lewis

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Mark Lewis
On Thu, 2006-02-16 at 21:33 -0800, David Lang wrote:
> > In SQL_ASCII, just take the first 4 characters (or 8, if using a 64-bit
> > sortKey as elsewhere suggested).  The sorting key doesn't need to be a
> > one-to-one mapping.
> 
> that would violate your second constraint ( f(a)==f(b) iff (a==b) )
> 
> if you could drop that constraint (the cost of which would be extra 'real' 
> compares within a bucket) then a helper function per datatype could work 
> as you are talking.

I think we're actually on the same page here; you're right that the
constraint above ( f(a)==f(b) iff a==b ) can't be extended to data types
with more than 32 bits of value space.  But the constraint I listed was
actually:

if a==b then f(a)==f(b)

Which doesn't imply 'if and only if'.  It's a similar constraint to
hashcodes; the same value will always have the same hash, but you're not
guaranteed that the hashcodes for two distinct values will be unique.

-- Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Postgres and Ingres R3 / SAN

2006-03-07 Thread Mark Lewis
On Tue, 2006-03-07 at 13:00 -0600, Jim C. Nasby wrote:

...

> PostgreSQL on a SAN won't buy you what I think you think it will. It's
> essentially impossible to safely run two PostgreSQL installs off the
> same data files without destroying your data. What a SAN can buy you is
> disk-level replication, but I've no experience with that.

It is possible to run two instances against the same SAN using tools
such as RedHat's Cluster Suite.  We use that in-house as a cheap
alternative for Oracle clustering, although we're not using it for our
PostgreSQL servers yet.  It's not for load balancing, just
active/passive fault tolerance.

-- Mark Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Database possible corruption , unsolvable mystery

2006-03-29 Thread Mark Lewis
Can you post an explain analyze for the delete query?  That will at
least tell you if it is the delete itself which is slow, or a trigger /
referential integrity constraint check.  Which version of PG is this?
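
Since EXPLAIN ANALYZE actually executes the statement, you can wrap it
in a transaction and roll it back so nothing is really deleted (table
and condition are placeholders):

BEGIN;
EXPLAIN ANALYZE DELETE FROM problem_control_table WHERE some_condition;
ROLLBACK;

On 8.1 or later the output also breaks out the time spent in each
trigger, which makes it easy to tell the two cases apart.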

-- Mark Lewis

On Wed, 2006-03-29 at 12:58 -0500, Eric Lauzon wrote:
> Greetings,
> 
>   We have an issue where we have a database with many tables.
>   The layout of the database is 3 sets of look-alike tables with different 
> names.
>   Each set of tables has some referential integrity that points back to 
> the main
>   control table.
> 
>   On two sets of tables we are able to efficiently delete referential and 
> main records
>   without a problem, but on the 3rd set we have an issue where the 
> control table is clogged
>   and deletes seem to take forever; for example, on the first two sets a 
> delete of 60K records takes about
>   4 to 10 seconds, but on the 3rd set it can take as long as 3 hours.
> 
>   This seems to be affecting only one database; the schema and way of 
> doing things is replicated elsewhere
>   and is efficient.
> 
>   The question is, even after dropping the 3rd set's integrity, dumping the 
> table data, deleting the table,
>   doing a manual checkpoint and recreating the table with the dump data, 
> with or without referential
>   integrity, the problem still arises.
> 
>   If we copy the data from the live table and do a create table aaa as 
> select * from problematic_table;
>   the table aaa operations are normal and efficient.
> 
>   This is why our investigation brought us to the folowing questions:
> 
>   1. Are postgresql data file names hashed references to table names (as in 
> oracle)? [~path to data EX:/var/log/pgsql/data/[arbitrary 
>  numbers]/[datafile]]?
> 
>   2. If the data files are corrupted and we re-create the table, is it 
> possible it uses the same files, thus creating the same issue?
> 
>   3. Since we know that all the tables have that problem, is there an 
> internal table with undisclosed references to the tables' data files?
>   
>   I hope the questions were clear.
> 
>   Have a good day, and thank you in advance.
> 
> 
> Eric Lauzon
> [Recherche & Développement]
> Above Sécurité / Above Security
> Tél  : (450) 430-8166
> Fax : (450) 430-1858
> 
> CONFIDENTIALITY NOTICE
> 
> This communication is intended for the exclusive use of the addressee 
> identified above. Its content is confidential and may contain privileged 
> information. If you have received this communication by error, please notify 
> the sender and delete the message without copying or disclosing it.
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Hardware: HP StorageWorks MSA 1500

2006-04-20 Thread Mark Lewis
Hmmm.  We use an MSA 1000 with Fibre Channel interconnects.  No real
complaints, although I was a little bit disappointed by the RAID
controller's battery-backed write cache performance; tiny random writes
are only about 3 times as fast with write caching enabled as with it
disabled, I had (perhaps naively) hoped for more.  Sequential scans from
our main DB (on a 5-pair RAID 10 set with 15k RPM drives) get roughly
80MB/sec.

Getting the redundant RAID controllers to fail over correctly on Linux
was a big headache and required working the tech support phone all day
until we finally got to the deep guru who knew the proper undocumented
incantations.

-- Mark Lewis

On Thu, 2006-04-20 at 20:00 +0200, Mikael Carneholm wrote:
> We're going to get one for evaluation next week (equipped with dual
> 2Gbit HBA:s and 2x14 disks, iirc). Anyone with experience from them,
> performance wise?
> 
> Regards,
> Mikael
> 
> ---(end of broadcast)---
> TIP 1: if posting/reading through Usenet, please send an appropriate
>subscribe-nomail command to [EMAIL PROTECTED] so that your
>message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] hardare config question

2006-04-28 Thread Mark Lewis
It's also possible that the single SATA drive you were testing (or the
controller it was attached to) is lying about fsync and performing write
caching behind your back, whereas your new controller and drives are
not.

You'll find a lot more info on the archives of this list about it, but
basically if your application is committing a whole lot of small
transactions, then it will run fast (but not safely) on a drive which
lies about fsync, but slower on a better disk subsystem which doesn't
lie about fsync.

Try running a test with fsync=off with your new equipment and if it
suddenly starts running faster, then you know that's the problem.
You'll either have a choice of losing all of your data the next time the
system shuts down uncleanly but being fast, or of running slow, or of
fixing the applications to use chunkier transactions.

-- Mark

On Fri, 2006-04-28 at 13:36 -0400, Vivek Khera wrote:
> On Apr 28, 2006, at 11:37 AM, Erik Myllymaki wrote:
> 
> > When I had this installed on a single SATA drive running from the  
> > PE1800's on-board SATA interface, this operation took anywhere from  
> > 65-80 seconds.
> >
> > With my new RAID card and drives, this operation took 272 seconds!?
> 
> switch it to RAID10 and re-try your experiment.  if that is fast,  
> then you know your raid controller does bad RAID5.
> 
> anyhow, I have in one server (our office mail server and part-time  
> development testing box) an adaptec SATA RAID from dell.  it is  
> configured for RAID5 and does well for normal office stuff, but when  
> we do postgres tests on it, it just is plain old awful.
> 
> but I have some LSI based cards on which RAID5 is plenty fast and  
> suitable for the DB, but those are SCSI.
> 
> For what it is worth, the Dell PE1850 internal PERC4/Si card is  
> wicked fast when hooked up with a pair of U320 SCSI drives.
> 
> 
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] hardare config question

2006-05-01 Thread Mark Lewis
A UPS will make it less likely that the system will reboot and destroy
your database due to a power failure, but there are other causes for a
system reboot.

With a BBU, the only component that can fail and cause catastrophic data
loss is the RAID itself.

With a UPS, you are additionally vulnerable to OS crashes, failures in
non-RAID hardware, UPS failures, or anything else that would necessitate
a hard reboot.  

So a UPS is a decent replacement for a BBU only if you trust your app
server/OS more than you value your data.

-- Mark Lewis


On Mon, 2006-05-01 at 10:58 -0700, Erik Myllymaki wrote:
> I have been in discussion with 3ware support and after adjusting some 
> settings, 
> the 3ware card in RAID 1 gets better performance than the single drive. I 
> guess 
> this had everything to do with the write (and maybe read?) cache.
> 
> Of course now i am in a dangerous situation - using volatile write cache 
> without a BBU.
> 
> If I were to use a UPS to ensure a soft shutdown in the event of power loss, 
> am 
> I somewhat as safe as if I were to purchase a BBU for this RAID card?
> 
> 
> 
> Thanks.
> 
> Mark Lewis wrote:
> > It's also possible that the single SATA drive you were testing (or the
> > controller it was attached to) is lying about fsync and performing write
> > caching behind your back, whereas your new controller and drives are
> > not.
> > 
> > You'll find a lot more info on the archives of this list about it, but
> > basically if your application is committing a whole lot of small
> > transactions, then it will run fast (but not safely) on a drive which
> > lies about fsync, but slower on a better disk subsystem which doesn't
> > lie about fsync.
> > 
> > Try running a test with fsync=off with your new equipment and if it
> > suddenly starts running faster, then you know that's the problem.
> > You'll either have a choice of losing all of your data the next time the
> > system shuts down uncleanly but being fast, or of running slow, or of
> > fixing the applications to use chunkier transactions.
> > 
> > -- Mark
> > 
> > On Fri, 2006-04-28 at 13:36 -0400, Vivek Khera wrote:
> >> On Apr 28, 2006, at 11:37 AM, Erik Myllymaki wrote:
> >>
> >>> When I had this installed on a single SATA drive running from the  
> >>> PE1800's on-board SATA interface, this operation took anywhere from  
> >>> 65-80 seconds.
> >>>
> >>> With my new RAID card and drives, this operation took 272 seconds!?
> >> switch it to RAID10 and re-try your experiment.  if that is fast,  
> >> then you know your raid controller does bad RAID5.
> >>
> >> anyhow, I have in one server (our office mail server and part-time  
> >> development testing box) an adaptec SATA RAID from dell.  it is  
> >> configured for RAID5 and does well for normal office stuff, but when  
> >> we do postgres tests on it, it just is plain old awful.
> >>
> >> but I have some LSI based cards on which RAID5 is plenty fast and  
> >> suitable for the DB, but those are SCSI.
> >>
> >> For what it is worth, the Dell PE1850 internal PERC4/Si card is  
> >> wicked fast when hooked up with a pair of U320 SCSI drives.
> >>
> >>
> >>
> >> ---(end of broadcast)---
> >> TIP 3: Have you checked our extensive FAQ?
> >>
> >>http://www.postgresql.org/docs/faq
> > 
> > ---(end of broadcast)---
> > TIP 9: In versions below 8.0, the planner will ignore your desire to
> >choose an index scan if your joining column's datatypes do not
> >match

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Slow restoration question

2006-05-02 Thread Mark Lewis
They are not equivalent.  As I understand it, RAID 0+1 performs about
the same as RAID 10 when everything is working, but degrades much less
nicely in the presence of a single failed drive, and is more likely to
suffer catastrophic data loss if multiple drives fail.

-- Mark

On Tue, 2006-05-02 at 12:40 -0600, Brendan Duddridge wrote:
> Everyone here always says that RAID 5 isn't good for Postgres. We  
> have an Apple Xserve RAID configured with RAID 5. We chose RAID 5  
> because Apple said their Xserve RAID was "optimized" for RAID 5. Not  
> sure if we made the right decision though. They give an option for  
> formatting as RAID 0+1. Is that the same as RAID 10 that everyone  
> talks about? Or is it the reverse?
> 
> Thanks,
> 
> 
> Brendan Duddridge | CTO | 403-277-5591 x24 |  [EMAIL PROTECTED]
> 
> ClickSpace Interactive Inc.
> Suite L100, 239 - 10th Ave. SE
> Calgary, AB  T2G 0V9
> 
> http://www.clickspace.com
> 
> On May 2, 2006, at 11:16 AM, Jim C. Nasby wrote:
> 
> > On Wed, Apr 26, 2006 at 05:14:41PM +0930, Eric Lam wrote:
> >> all dumpfiles total about 17Gb. It has been running for 50ish hrs  
> >> and up
> >> to about the fourth file (5-6 ish Gb) and this is on a raid 5 server.
> >
> > RAID5 generally doesn't bode too well for performance; that could be
> > part of the issue.
> > -- 
> > Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
> > Pervasive Software  http://pervasive.comwork: 512-231-6117
> > vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
> >
> > ---(end of  
> > broadcast)---
> > TIP 4: Have you searched our list archives?
> >
> >http://archives.postgresql.org
> >
> 
> 
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] performance question (something to do w/

2006-05-08 Thread Mark Lewis
Doing a SELECT with a large list of variables inside an IN runs slowly
on every database we've tested.  We've tested mostly in Oracle and
PostgreSQL, and both get very slow very quickly (actually Oracle refuses
to process the query at all after it gets too many bind parameters).

In our case, we have a (potentially quite large) set of external values
that we want to look up in the database.  We originally thought that
doing a single select with a large IN clause was the way to go, but then
we did some performance analysis on the optimal batch size (number of
items to include per IN clause), and discovered that for most databases,
the optimal batch size was 1.  For PostgreSQL I think it was 2.

The moral of the story is that you're probably better off running a
bunch of small selects than trying to optimize things with one
gargantuan select.
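
Just to sketch what I mean in plain SQL (this assumes f is an integer
column; the literal values are only placeholders):

  PREPARE lookup(int) AS
      SELECT * FROM x WHERE f = $1 ORDER BY c;

  EXECUTE lookup(42);
  EXECUTE lookup(43);
  -- ... one EXECUTE per external value, instead of one huge IN list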

-- Mark Lewis

On Mon, 2006-05-08 at 13:29 -0400, Jeffrey Tenny wrote:
> Why does this query take so long?  (PostgreSQL 8.0.3, FC4)
> Hopefully I have provided enough information below.
> 
> LOG:  statement: SELECT * FROM x WHERE f IN 
> ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,\
> $25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63\
> ,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,\
> $102,$103,$104,$105,$106,$107,$108,$109,$110,$111,$112,$113,$114,$115,$116,$117,$118,$119,$120,$121,$122,$123,$124,$125,$126,$127,$128,$129,$130,$131,$132,\
> $133,$134,$135,$136,$137,$138,$139,$140,$141,$142,$143,$144,$145,$146,$147,$148,$149,$150,$151,$152,$153,$154,$155,$156,$157,$158,$159,$160,$161,$162,$163,\
> $164,$165,$166,$167,$168,$169,$170,$171,$172,$173,$174,$175,$176,$177,$178,$179,$180,$181,$182,$183,$184,$185,$186,$187,$188,$189,$190,$191,$192,$193,$194,\
> $195,$196,$197,$198,$199,$200,$201,$202,$203,$204,$205,$206,$207,$208,$209,$210,$211,$212,$213,$214,$215,$216,$217,$218,$219,$220,$221,$222,$223,$224,$225,\
> $226,$227,$228,$229,$230,$231,$232,$233,$234,$235,$236,$237,$238,$239,$240,$241,$242,$243,$244,$245,$246,$247,$248,$249,$250,$251,$252,$253,$254,$255,$256,\
> $257,$258,$259,$260,$261,$262,$263,$264,$265,$266,$267,$268,$269,$270,$271,$272,$273,$274,$275,$276,$277,$278,$279,$280,$281,$282,$283,$284,$285,$286,$287,\
> $288,$289,$290,$291,$292,$293,$294,$295,$296,$297,$298,$299,$300,$301,$302,$303,$304,$305,$306,$307,$308,$309,$310,$311,$312,$313,$314,$315,$316,$317,$318,\
> $319,$320,$321,$322,$323,$324,$325,$326,$327,$328,$329,$330,$331,$332,$333,$334,$335,$336,$337,$338,$339,$340,$341,$342,$343,$344,$345,$346,$347,$348,$349,\
> $350,$351,$352,$353,$354,$355,$356,$357,$358,$359,$360,$361,$362,$363,$364,$365,$366,$367,$368,$369,$370,$371,$372,$373,$374,$375,$376,$377,$378,$379,$380,\
> $381,$382,$383,$384,$385,$386,$387,$388,$389,$390,$391,$392,$393,$394,$395,$396,$397,$398,$399,$400,$401,$402,$403,$404,$405,$406,$407,$408,$409,$410,$411,\
> $412,$413,$414,$415,$416,$417,$418,$419,$420,$421,$422,$423,$424,$425,$426,$427,$428,$429,$430,$431,$432,$433,$434,$435,$436,$437,$438,$439,$440,$441,$442,\
> $443,$444,$445,$446,$447,$448,$449,$450,$451,$452,$453,$454,$455,$456,$457,$458,$459,$460,$461,$462,$463,$464,$465,$466,$467,$468,$469,$470,$471,$472,$473,\
> $474,$475,$476,$477,$478,$479,$480,$481,$482,$483,$484,$485,$486,$487,$488,$489,$490,$491,$492,$493,$494,$495,$496,$497,$498,$499,$500,$501,$502,$503,$504,\
> $505,$506,$507,$508,$509,$510,$511,$512,$513,$514,$515,$516,$517,$518,$519,$520,$521,$522,$523,$524,$525,$526,$527,$528,$529,$530,$531,$532,$533,$534,$535,\
> $536,$537,$538,$539,$540,$541,$542,$543,$544,$545,$546,$547,$548,$549,$550,$551,$552,$553,$554,$555,$556,$557,$558,$559,$560,$561,$562,$563,$564,$565,$566,\
> $567,$568,$569,$570,$571,$572,$573,$574,$575,$576,$577,$578,$579,$580,$581,$582,$583,$584,$585,$586,$587,$588,$589,$590,$591,$592,$593,$594,$595,$596,$597,\
> $598,$599,$600,$601,$602,$603,$604,$605,$606,$607,$608,$609,$610,$611,$612,$613,$614,$615,$616,$617,$618,$619,$620,$621,$622,$623,$624,$625,$626,$627,$628,\
> $629,$630,$631,$632,$633,$634,$635,$636,$637,$638,$639,$640,$641,$642,$643,$644,$645,$646,$647,$648,$649,$650)
>  
> ORDER BY f,c
> 
> LOG:  EXECUTOR STATISTICS
> DETAIL:  ! system usage stats:
>  !   10.282945 elapsed 10.23 user 0.048992 system sec
>  !   [25.309152 user 0.500923 sys total]
>  !   0/0 [0/0] filesystem blocks in/out
>  !   0/0 [0/10397] page faults/reclaims, 0 [0] swaps
>  !   0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
>  !   0/15 [291/55] voluntary/involuntary context switches
>  ! buffer usage stats:
>  !   Shared blocks:   

Re: [PERFORM] Getting even more insert performance (250m+rows/day)

2006-05-24 Thread Mark Lewis
> The data gets inserted in batches every 5 minutes and I potentially  
> have people querying it constantly, so I can't remove and re-create  
> the index.

How live does your data need to be?  One possibility would be to use a
separate table for each batch instead of a separate table per day: create
the indexes after the import, and only once the indexes are in place make
the table available for user queries.

You'd be trading latency for throughput in that case.
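
Something along these lines, just as a sketch (all of the names here are
invented, and I'm assuming a COPY-style load):

  CREATE TABLE readings_batch_1234 (LIKE readings);
  COPY readings_batch_1234 FROM '/data/batch_1234.dat';
  CREATE INDEX readings_batch_1234_ts_idx ON readings_batch_1234 (ts);
  -- only now expose the batch to user queries, for example by recreating
  -- a UNION ALL view over the batch tables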

Also, you mentioned that you're CPU-bound, but that you have multiple
CPU's.  In that case, performing N concurrent imports (where N is the
number of processor cores available) might be a win over a single-
threaded import.

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] is it possible to make this faster?

2006-05-25 Thread Mark Lewis
On Thu, 2006-05-25 at 16:52 -0400, Tom Lane wrote:
> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
> > been doing a lot of pgsql/mysql performance testing lately, and there
> > is one query that mysql does much better than pgsql...and I see it a
> > lot in normal development:
> 
> > select a,b,max(c) from t group by a,b;
> 
> > t has an index on a,b,c.
> 
> The index won't help, as per this comment from planagg.c:
> 
>* We don't handle GROUP BY, because our current implementations of
>* grouping require looking at all the rows anyway, and so there's not
>* much point in optimizing MIN/MAX.
> 
> Given the numbers you mention (300k rows in 2000 groups) I'm not
> convinced that an index-based implementation would help much; we'd
> still need to fetch at least one record out of every 150, which is
> going to cost near as much as seqscanning all of them.

Well, if the MySQL server has enough RAM that the index is cached (or
index + relevant chunks of data file if using InnoDB?) then that would
explain how MySQL can use an index to get fast results.

-- Mark Lewis

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Initial database loading and IDE x SCSI

2006-06-02 Thread Mark Lewis
On Fri, 2006-06-02 at 15:25 -0300, [EMAIL PROTECTED] wrote:
> Hi,
> 
> I would like to know if my supposition is right.
> 
> Considering an environment with only one hard disk attached to a server, an
> initial loading of the database probably is much faster using an IDE/ATA
> interface with write-back on than using an SCSI interface. That´s because of
> the SCSI command interface overhead.

No, it's because the SCSI drive is honoring the database's request to
make sure the data is safe.

> Then main advantage of SCSI interfaces, the multiuser environment is lost in
> this scenery.
> 
> Am I right? Am I missing something here?
> 
> Even if I´m right, is something that could be done too improove SCSI loading
> performance in this scenery?

You can perform the initial load in large transactions.  The extra
overhead for ensuring that data is safely written to disk will only be
incurred once per transaction, so try to minimize the number of
transactions.
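
In plain SQL terms that just means batching the inserts, for example
(table name and values invented):

  BEGIN;
  INSERT INTO parts VALUES (1, 'widget');
  INSERT INTO parts VALUES (2, 'gadget');
  -- ... thousands more rows ...
  COMMIT;

or better yet use COPY, which loads everything in a single transaction:

  COPY parts FROM '/tmp/parts.dat';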

You could optionally set fsync=off in postgresql.conf, which means that
the SCSI drive will operate with no more safety than an IDE drive.  But
you should only do that if you're willing to deal with catastrophic data
corruption.  But if this is for a desktop application where you need to
support IDE drives, you'll need to deal with that anyway.

> Thanks in advance!
> 
> Reimer
> 
> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: RES: [PERFORM] Initial database loading and IDE x SCSI

2006-06-02 Thread Mark Lewis
On Fri, 2006-06-02 at 16:54 -0300, [EMAIL PROTECTED] wrote:
> > <[EMAIL PROTECTED]> writes:
> > > I would like to know if my supposition is right.
> >
> > > Considering an environment with only one hard disk attached to
> > a server, an
> > > initial loading of the database probably is much faster using an IDE/ATA
> > > interface with write-back on than using an SCSI interface.
> > That´s because of
> > > the SCSI command interface overhead.
> >
> > I *seriously* doubt that.
> >
> > If you see a difference in practice it's likely got more to do with the
> > SCSI drive not lying about write-complete ...
> >
> 
> Many thanks for the answers! There are some more thinks I could not
> understand about this issue?
> 
> I was considering it but if you have a lot of writes operations, will not
> the disk cache full quickly?
> 
> If it´s full will not the system wait until something could be write to the
> disk surface?
> 
> If you have almost all the time the cache full will it not useless?
> 
> Should not, in this scenary, with almost all the time the cache full, IDE
> and SCSI write operations have almost the same performance?
> 

This is the ideal case.  However, you only get to that case if you use
large transactions or run with fsync=off or run with a write cache (like
IDE drives, or nice RAID controllers which have a battery-backed cache).

Remember that one of the important qualities of a transaction is that
it's durable, so once you commit it the data is definitely stored on the
disk and one nanosecond later you could power the machine off and it
would still be there.

To achieve that durability guarantee, the system needs to make sure that
if you commit a transaction, the data is actually written to the
physical platters on the hard drive.

This means that if you take the naive approach to importing data (one
row at a time, each in its own transaction), then instead of blasting
data onto the hard drive at maximum speed, the application will wait for
the platter to rotate to the right position, write one row's worth of
data, then wait for the platter to rotate to the right position again
and insert another row, etc.  This approach is very slow.

The naive approach works on IDE drives because they don't (usually)
honor the request to write the data immediately, so the drive can fill
its write cache up with several megabytes of data and write it out to
the disk at its leisure.

> Thanks in advance,
> 
> Reimer
> 

-- Mark

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: RES: RES: [PERFORM] Initial database loading and IDE x SCSI

2006-06-02 Thread Mark Lewis
On Fri, 2006-06-02 at 17:37 -0300, [EMAIL PROTECTED] wrote:
> Many thanks Mark,
> 
> I will consider fsync=off only to do an initial load, not for a database 
> normal operation.
> 

This approach works well.  You just need to remember to shut down the
database and start it back up again with fsync enabled for it to be safe
after the initial load.

> I was just thinking about this hipotetical scenario: 
> a) a restore database operation
> b) fsync off
> c) write-back on (IDE)
> 
> As I could understand, in this sceneraio, it´s normal the IDE drive be faster 
> than the SCSI, ok?
> 

If fsync is off, then the IDE drive loses its big advantage, so IDE and
SCSI should be about the same speed.

> Of course, the database is exposed because of the fsync=off, but if you 
> consider only the system performance, then it is true. Isn´t it?



> Thanks,
> 
> Reimer
> 
> 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Mark Lewis
No.  You need fsync on in order to force the data to get TO the NetApp
at the right time.  With fsync off, the data gets cached in the
operating system.

-- Mark Lewis

On Wed, 2006-06-14 at 14:48 -0700, Dan Gorman wrote:
> All,
>So I thought I'd pose this question:
> 
> If I have a pg database attached to a powervault (PV) with just an  
> off-the-shelf SCSI card I generally want fsync on to prevent data  
> corruption in case the PV should loose power.
> However, if I have it attached to a NetApp that ensures data writes  
> to via the NVRAM can I safely turn fsync off to gain additional  
> performance?
> 
> Best Regards,
> Dan Gorman
> 
> 
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Postgres consuming way too much memory???

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 11:34 -0400, Tom Lane wrote:
> "jody brownell" <[EMAIL PROTECTED]> writes:
> > When postgresql starts to go into this bloating state, I can only make it 
> > happen from my java app.
> 
> That's interesting.  The JDBC driver uses protocol features that aren't
> used by psql, so it's possible that the leak is triggered by one of
> those features.  I wouldn't worry too much about duplicating the problem
> from psql anyway --- a Java test case will do fine.
> 
> > I am going to try closing the connection after each TX to see if this
> > resolves it for now. If not, I will write a java app, stored procedure
> > (table etc) reproduce it without our application.


Just to mention another possible culprit; this one doesn't seem all that
likely to me, but at least it's easy to investigate.

With DBCP and non-ancient versions of the JDBC driver that use v3
protocol and real prepared statements, it is possible to (mis)configure
the system to create an unbounded number of cached prepared statements
on any particular connection.  Older versions of DBCP were also known to
have bugs which aggravated this issue when prepared statement caching
was enabled, IIRC.

-- Mark Lewis

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Optimizer internals

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 14:05 -0400, John Vincent wrote:
> Now I've been told by our DBA that we should have been able to wholy
> satisfy that query via the indexes.

DB2 can satisfy the query using only indexes because DB2 doesn't do
MVCC.

Although MVCC is generally a win in terms of making the database easier
to use and applications less brittle, it also means that the database
must inspect the visibility information for each row before it can
answer a query.  For most types of queries this isn't a big deal, but
for count(*) type queries, it slows things down.

Since adding the visibility information to indexes would make them
significantly more expensive to use and maintain, it isn't done.
Therefore, each row has to be fetched from the main table anyway.

Since in this particular query you are counting all rows of the
database, PG must fetch each row from the main table regardless, so the
sequential scan is much faster because it avoids traversing the index
and performing random read operations.

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Optimizer internals

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 14:46 -0400, John Vincent wrote:

> One question that we came up with is how does this affect other
> aggregate functions like MAX,MIN,SUM and whatnot? Being that this is
> our data warehouse, we use these all the time. As I've said
> previously, I didn't know a human could generate some of the queries
> we've passed through this system. 

Previously, MIN and MAX would also run slowly, for the same reason as
COUNT(*).  But there really isn't a need for that, since you can still
get a big speedup by scanning the index in order, looking up each row
and stopping as soon as you find a visible one.

This has been fixed so newer versions of PG will run quickly and use the
index for MIN and MAX.  I believe that change went in for 8.1, but you can
dig the archives to confirm.

For older versions of PG before the fix, you can make MIN and MAX run
quickly by rewriting them in the following form:

SELECT column FROM table ORDER BY column LIMIT 1;       -- MIN
SELECT column FROM table ORDER BY column DESC LIMIT 1;  -- MAX

(If the column can contain NULLs, add a WHERE column IS NOT NULL to the
DESC form; otherwise it will return a NULL instead of the largest real
value.)

Unfortunately SUM is in the same boat as COUNT; in order for it to
return a meaningful result it must inspect visibility information for
all of the rows.

-- Mark

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] SAN performance mystery

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 18:24 -0400, Tom Lane wrote:
> I agree with Brian's suspicion that the SATA drive isn't properly
> fsync'ing to disk, resulting in bogusly high throughput.  However,
> ISTM a well-configured SAN ought to be able to match even the bogus
> throughput, because it should be able to rely on battery-backed
> cache to hold written blocks across a power failure, and hence should
> be able to report write-complete as soon as it's got the page in cache
> rather than having to wait till it's really down on magnetic platter.
> Which is what the SATA drive is doing ... only it can't keep the promise
> it's making for lack of any battery backup on its on-board cache.

It really depends on your SAN RAID controller.  We have an HP SAN; I
don't remember the model number exactly, but we ran some tests and with
the battery-backed write cache enabled, we got some improvement in write
performance but it wasn't NEARLY as fast as an SATA drive which lied
about write completion.

The write-and-fsync latency was only about 2-3 times better than with no
write cache at all.  So I wouldn't assume that just because you've got a
write cache on your SAN, that you're getting the same speed as
fsync=off, at least for some cheap controllers.

-- Mark Lewis

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Commit slower on faster PC

2006-07-12 Thread Mark Lewis
The IDE drive is almost certainly lying about flushing data to the disk.
Lower-end consumer drives often do.

What this means is that commits will be a whole lot faster, but the
database loses its ACID guarantees, because a power failure at the wrong
moment could corrupt the whole database.

If you don't care about your data and want the SCSI drive to perform
fast just like the IDE drive, you can set fsync = off in your
configuration file.

-- Mark

On Wed, 2006-07-12 at 10:16 -0600, Koth, Christian (DWBI) wrote:
> Hi,
> 
> please help me with the following problem:
> 
> I have noticed a strange performance behaviour using a commit statement on 
> two different machines. On one of the machines the commit is many times 
> faster than on the other machine which has faster hardware. Server and client 
> are running always on the same machine.
> 
> Server version (same on both machines): PostgreSQL 8.1.3. (same binaries as 
> well)
> 
> PC1:
> 
> Pentium 4 (2.8 GHz)
> 1GB RAM
> IDE-HDD (approx. 50 MB/s rw), fs: ext3
> Mandrake Linux: Kernel 2.4.22
> 
> 
> PC2:
> 
> Pentium 4 (3.0 GHz)
> 2GB RAM
> SCSI-HDD (approx. 65 MB/s rw), fs: ext3
> Mandrake Linux: Kernel 2.4.32
> 
> 
> Both installations of the database have the same configuration, different 
> from default are only the following settings on both machines:
> 
> shared_buffers = 2
> listen_addresses = '*'
> max_stack_depth = 4096
> 
> 
> pgbench gives me the following results:
> PC1:
> 
> transaction type: TPC-B (sort of)
> scaling factor: 1
> number of clients: 1
> number of transactions per client: 10
> number of transactions actually processed: 10/10
> tps = 269.905533 (including connections establishing)
> tps = 293.625393 (excluding connections establishing)
> 
> PC2:
> 
> transaction type: TPC-B (sort of)
> scaling factor: 1
> number of clients: 1
> number of transactions per client: 10
> number of transactions actually processed: 10/10
> tps = 46.061935 (including connections establishing)
> tps = 46.519634 (excluding connections establishing)
> 
> 
> My own performance test sql script which inserts and (auto)commits some data 
> into a simple table produces the following log output in the server log:
> 
> PC1:
> 
> LOG:  duration: 1.441 ms  statement: INSERT INTO performance_test VALUES 
> (500938362, 'Xawhefjmd');
> STATEMENT:  INSERT INTO performance_test VALUES (500938362, 'Xawhefjmd');
> 
> PC2:
> 
> LOG:  duration: 29.979 ms  statement: INSERT INTO performance_test VALUES 
> (500938362, 'Xawhefjmd');
> STATEMENT:  INSERT INTO performance_test VALUES (500938362, 'Xawhefjmd');
> 
> 
> I created a 'strace' one both machines which is interesting:
> 
> Opening the socket:
> ---
> PC1: socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 10 <0.21>
> PC2: socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 8 <0.15>
> 
> PC1: bind(10, {sa_family=AF_INET, sin_port=htons(0), 
> sin_addr=inet_addr("0.0.0.0")}, 16) = 0 <0.07>
> PC2: bind (8, {sin_family=AF_INET, sin_port=htons(0), 
> sin_addr=inet_addr("0.0.0.0")}}, 16) = 0 <0.07>
> 
> PC1: getsockname(10, {sa_family=AF_INET, sin_port=htons(32820), 
> sin_addr=inet_addr("0.0.0.0")}, [16]) = 0 <0.05>
> PC2: getsockname( 8, {sin_family=AF_INET, sin_port=htons(36219), 
> sin_addr=inet_addr("0.0.0.0")}}, [16]) = 0 <0.05>
> 
> PC1: connect(10, {sa_family=AF_INET, sin_port=htons(5432), 
> sin_addr=inet_addr("127.0.0.1")}, 16) = 0 <0.000440>
> PC2: connect( 8, {sin_family=AF_INET, sin_port=htons(5432), 
> sin_addr=inet_addr("127.0.0.1")}}, 16) = 0 <0.000394>
> 
> PC1: setsockopt(10, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.06>
> PC2: setsockopt (8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.04>
> 
> 
> Inserting and commiting the data: 
> -
> PC1:
> 
> send(10, "B\\0<\0INSERT INTO performance_test VAL"..., 175, 0) = 175 
> <0.15>
> recv(10, "2\0\17INSERT 0 1\0Z\0\0\0\5T", 8192, 0) = 53 <0.07>
> send(10, "B\0\0\0\17\0S_2\0\0\0\0\0\0\0E\0\0\0\t\0\0\0\0\1S\0\0\0\4", 31, 0) 
> = 31 <0.11>
> recv(10, "2\0\0\0\4C\0\0\0\vCOMMIT\0Z\0\0\0\5I", 8192, 0) = 23 <0.000211>
> 
> PC2:
> 
> send(8, "B\\0<\0INSERT INTO performance_test VAL"..., 175, 0) = 175 
> <0.14>
> recv(8, "2\0\17INSERT 0 1\0Z\0\0\0\5T", 8192, 0) = 53 <0.05>
> send(8, "B\0\0\0\17\0S_2\0\0\0\0\0\0\0E\0\0\0\t\0\0\0\0\1S\0\0\0\4", 31, 0) = 
> 31 <0.09>
> recv(8, "2\0\0\0\4C\0\0\0\vCOMMIT\0Z\0\0\0\5I", 8192, 0) = 23 <0.0253>
> 
> Every command is a bit faster on PC2 except the last one which is many times 
> slower.
> Any help or hint where to look at would be highly appreciated because I'm 
> running out of ideas ;-).
> 
> 
> regards,
> Christian
> 
> 

[PERFORM] Savepoint performance

2006-07-27 Thread Mark Lewis
All,

I support a system that runs on several databases including PostgreSQL.
I've noticed that the other DB's always put an implicit savepoint before
each statement executed, and roll back to that savepoint if the
statement fails for some reason.  PG does not, so unless you manually
specify a savepoint you lose all previous work in the transaction.

So my question is, how expensive is setting a savepoint in PG?  If it's
not too expensive, I'm wondering if it would be feasible to add a config
parameter to psql or other client interfaces (thinking specifically of
jdbc here) to do it automatically.  Doing so would make it a little
easier to work with PG in a multi-db environment.

My main reason for wanting this is so that I can more easily import,
say, 50 new 'objects' (related rows stored across several tables) in a
transaction instead of only one at a time without fear that an error in
one object would invalidate the whole batch.  I could do this now by
manually setting savepoints, but if it's not a big deal performance-wise
to modify the JDBC driver to start an anonymous savepoint with each
statement, then I'd prefer that approach as it seems that it would make
life easier for other folks too.
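
In plain SQL, the manual version looks something like this (table name and
values are made up, showing two "objects"):

  BEGIN;

  SAVEPOINT obj_1;
  INSERT INTO widgets VALUES (1, 'first object');
  RELEASE SAVEPOINT obj_1;

  SAVEPOINT obj_2;
  INSERT INTO widgets VALUES (2, 'second object');
  -- if this object's insert had failed, we could have done
  -- ROLLBACK TO SAVEPOINT obj_2; and carried on with the rest
  RELEASE SAVEPOINT obj_2;

  COMMIT;

What I'm asking about is the cost of having the driver wrap every statement
in that SAVEPOINT/RELEASE pair automatically.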

Thanks in advance for any feedback :)

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Performance with 2 AMD/Opteron 2.6Ghz and 8gig

2006-07-28 Thread Mark Lewis
This isn't all that surprising.  The main weaknesses of RAID-5 are poor
write performance and stupid hardware controllers that make the write
performance even worse than it needs to be.  Your numbers bear that out.
Reads off RAID-5 are usually pretty good.

Your 'dd' test is going to be a little misleading though.  Most DB
access isn't usually purely sequential; while it's easy to see why HW
RAID-5 might outperform HW-RAID-10 in large sequential reads (the RAID
controller would need to be smarter than most to make RAID-10 as fast as
RAID-5), I would expect that HW RAID-5 and RAID-10 random reads would be
about equal or else maybe give a slight edge to RAID-10. 

-- Mark Lewis


On Fri, 2006-07-28 at 13:31 -0400, Jeff Trout wrote:
> I too have a DL385 with a single DC Opteron 270.
> It claims to have a smart array 6i controller and over the last  
> couple of days I've been runnign some tests on it, which have been  
> yielding some suprising results.
> 
> I've got 6 10k U320 disks in it. 2 are in a mirror set.  We'll not  
> pay any attention to them.
> The remaining 4 disks I've been toying with to see what config works  
> best, using hardware raid and software raid.
> 
> system info:
> dl dl385 - 1 opteron 270 - 5GB ram - smart array 6i
> cciss0: HP Smart Array 6i Controller
> Firmware Version: 2.58
> Linux db03 2.6.17-1.2157_FC5 #1 SMP Tue Jul 11 22:53:56 EDT 2006  
> x86_64 x86_64 x86_64 GNU/Linux
> using xfs
> 
> Each drive can sustain 80MB/sec read (dd, straight off device)
> 
> So here are the results I have so far.  (averaged)
> 
> 
> hardware raid 5:
> dd - write 20GB file - 48MB/sec
> dd - read 20GB file - 247MB/sec
> [ didn't do a bonnie run on this yet ]
> pretty terrible write performance. good read.
> 
> hardware raid 10
> dd - write 20GB - 104MB/sec
> dd - read 20GB - 196MB/sec
> bonnie++
> Version  1.03   --Sequential Output-- --Sequential Input-  
> --Random-
>  -Per Chr- --Block-- -Rewrite- -Per Chr- -- 
> Block-- --Seeks--
> MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec % 
> CP  /sec %CP
> db03  9592M 45830  97 129501  31 62981  14 48524  99 185818   
> 19 949.0   1
> 
> software raid 5
> dd - write 20gb - 85MB/sec
> dd - read 20gb - 135MB/sec
> 
> I was very suprised at those results. I was sort of expecting it to  
> smoke the hardware. I repeated the test many times, and kept getting
> these numbers.
> 
> bonnie++:
> Version  1.03   --Sequential Output-- --Sequential Input-  
> --Random-
>  -Per Chr- --Block-- -Rewrite- -Per Chr- -- 
> Block-- --Seeks--
> MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec % 
> CP  /sec %CP
> db03  9592M 44110  97 81481  23 34604  10 44495  95 157063   
> 28 919.3   1
> 
> software 10:
> dd - write - 20GB - 108MB/sec
> dd - read - 20GB - 86MB/sec( WTF? - this is repeatable!!)
> bonnie++
> Version  1.03   --Sequential Output-- --Sequential Input-  
> --Random-
>  -Per Chr- --Block-- -Rewrite- -Per Chr- -- 
> Block-- --Seeks--
> MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec % 
> CP  /sec %CP
> db03  9592M 44539  98 105444  20 34127   8 39830  83 100374   
> 10  1072   1
> 
> 
> so I'm going to be going with hw r5, which went against what I  
> thought going in - read perf is more important for my usage than write.
> 
> I'm still not sure about that software 10 read number. something is  
> not right there...
> 
> --
> Jeff Trout <[EMAIL PROTECTED]>
> http://www.dellsmartexitin.com/
> http://www.stuarthamm.net/
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] index usage

2006-07-28 Thread Mark Lewis
A volatile function may return a different result for each row;
think of the random() or nextval() functions, for example.  You wouldn't
want them to return the same value for each row returned.
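
If the function's value can't actually change within a single statement,
then declaring it STABLE (or IMMUTABLE, if it never changes at all) lets
the planner treat the call as a single comparison value and use the index.
A made-up sketch:

  CREATE FUNCTION lookup_target() RETURNS integer AS $$
      SELECT 42;   -- stand-in for whatever the real function computes
  $$ LANGUAGE sql STABLE;

  -- with STABLE this can use the index on the integer column:
  SELECT * FROM mytable WHERE the_column = lookup_target();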

-- Mark Lewis

On Fri, 2006-07-28 at 13:59 -0700, Ben wrote:
> It's volatile, but it will always return an integer.
> 
> On Fri, 28 Jul 2006, Daniel Caune wrote:
> 
> >> De : [EMAIL PROTECTED] [mailto:pgsql-performance-
> >> [EMAIL PROTECTED] De la part de Ben
> >> Envoyé : vendredi, juillet 28, 2006 15:21
> >> À : pgsql-performance@postgresql.org
> >> Objet : [PERFORM] index usage
> >>
> >> I have a table with 37000 rows, an integer column, and an index on that
> >> column. I've got a function that returns an integer. When I do a select
> >> where I restrict that column to being equal to a static number, explain
> >> tells me the index will be used. When I do the same thing but use the
> >> function instead of a static number, explain shows me a full scan on the
> >> table.
> >>
> >> I must be missing something, because my understanding is that the function
> >> will be evaluated once for the statement and then collapsed into a static
> >> number for the filtering. But the results of the explain seem to imply
> >> that's not the case?
> >>
> >
> > Is your function IMMUTABLE, STABLE or VOLATILE?
> >
> > --
> > Daniel
> >
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] directory tree query with big planner variation

2006-07-31 Thread Mark Lewis
It seems like you might be able to avoid the expensive directory lookups
entirely without changing the schema by defining an immutable function
dir_depth(path), which would just count the number of forward slashes.
Then create a functional index on dir_depth(path) and in the query do a
check for directories with a given prefix and the expected dir_depth.

In 8.1 and later, this kind of query might be able to use a bitmap scan to
combine the existing path index with the new functional index (a BitmapAnd).

This is just a hunch and I haven't looked too carefully at the
schema/query requirements to see if it's feasible, but it seems like a
reasonable approach.
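
Roughly like this, untested (the path table and the LIKE prefix are from
the plans you posted; the function and index names are invented):

  CREATE FUNCTION dir_depth(text) RETURNS integer AS $$
      SELECT length($1) - length(replace($1, '/', ''));
  $$ LANGUAGE sql IMMUTABLE;

  CREATE INDEX path_dir_depth_idx ON path (dir_depth(path));

  -- direct children of a directory: right prefix, exactly one level deeper
  SELECT * FROM path
   WHERE path LIKE '/Users/axel/%'
     AND dir_depth(path) = dir_depth('/Users/axel/') + 1;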

-- Mark Lewis


On Mon, 2006-07-31 at 09:30 -0400, Michael Stone wrote:
> On Mon, Jul 31, 2006 at 01:54:24PM +0200, Axel Rau wrote:
> >Am 31.07.2006 um 13:15 schrieb Michael Stone:
> >>On Mon, Jul 31, 2006 at 12:48:11PM +0200, Axel Rau wrote:
> >>>   WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC
> >>
> >>This can't be indexed. You might try something like WHERE P.path  
> >>LIKE '[EMAIL PROTECTED]' AND P.path ~ '^%@/[^/]*/$'
> 
> Ignore that, I wasn't awake yet.
> 
> >Why does it quite well in this case:
> >---
> >->  Index Scan using path_name_idx on path p  (cost=0.00..3.02 rows=1  
> >width=97) (actual time=15.480..56.935 rows=27 loops=1)
> >   Index Cond: ((path >= '/Users/axel/Library/ 
> >Preferences/'::text) AND (path < '/Users/axel/Library/ 
> >Preferences0'::text))
> >   Filter: ((path ~ '^/Users/axel/Library/Preferences/[^/]*/ 
> >$'::text) AND (rtrim("replace"(path, '/Users/axel/Library/ 
> >Preferences/'::text, ''::text), '/'::text) <> ''::text))
> >---
> >as compared to this case(ignoring the index on path):
> >---
> >->  Index Scan using path_pkey on path p  (cost=0.00..2567.57  
> >rows=1941 width=97) (actual time=527.805..1521.911 rows=69 loops=1)
> >   Filter: ((path ~ '^/Users/axel/[^/]*/$'::text) AND (rtrim 
> >("replace"(path, '/Users/axel/'::text, ''::text), '/'::text) <>  
> >''::text))
> >---
> >? With all longer path names, I get the above (good)result.
> >Should I put the rtrim/replace on the client side?
> 
> That's not the slow part. The slow part is retrieving every single file 
> for each of the subdirectories in order to determine whether there are 
> any files in the subdirectories. 
> 
> >>The schema could be a lot more intelligent here. (E.g., store path  
> >>seperately from file/directory name, store type (file or directory)  
> >>seperately, etc.) Without improving the schema I don't think this  
> >>will ever be a speed demon.
> 
> >PATH holds complete pathnames of directories, FILENAME holds  
> >filenames and pathname components.
> >Currently the schema is the lowest common denominator between SQLite,  
> >MySQL and pg and the bacula people will stay with that (-;).
> 
> Nothing I suggested raises the bar for the "lowest common denominator". 
> If I understand the intend of this SQL, you're pulling all the entries
> in a directory in two parts. The first part (files) is fairly 
> straightforward. The second part (directories) consists of pulling any 
> file whose parent is a subdirectory of the directory you're looking for 
> (this is *all* children of the directory, since you have to retrieve 
> every element that begins with the directory, then discard those that 
> have an additional / in their name), counting how many of these there 
> are for each subdirectory, and discarding those results except for a 
> binary (yes there are children or no there aren't). This is a lot of 
> useless work to go through, and is going to be slow if you've got a lot 
> of stuff in a subdirectory. An alternative approach would be, for each 
> directory, to store all its children (files and subdirectories) along 
> with a flag indicating which it is. This would allow you to create the 
> collapsed tree view without walking all the children of a subdirectory.
> 
> Assuming you can't make changes to the schema, what about the query?
> You've got this:
> 
> explain analyze  SELECT X.name AS name, COUNT(CH) > 1 AS children
>   FROM
> ( SELECT  RTRIM( REPLACE( NLPC.path, '%@/', ''),'/') AS name,
>  FN.name AS CH

Re: [PERFORM] Inner Join of the same table

2006-08-15 Thread Mark Lewis
Can you provide an EXPLAIN ANALYZE of the query in PG?  Have you
analyzed the PG database?  How many rows is this query expected to
return?  Which version of PG are you running?  What indexes have you
defined?

-- Mark

On Tue, 2006-08-15 at 14:38 +, Sebastián Baioni wrote:
> Hello, I'm migrating from MS SQL Server to PostgreSQL 8.1 and I have a
> serious problem:
> Table: APORTES - Rows: 9,000,000 (9 million)
> *cuiT (char 11)
> *cuiL (char 11)
> *PERI (char 6)
> FAMI (numeric 6)
> 
> I need all the cuiLs whose max(PERI) are from a cuiT, and the Max
> (FAMI) of those cuiLs, so the sentence is:
> 
> SELECT DISTINCT T.cuiT, T.cuiL. U.MAXPERI, U.MAXFAMI
>FROM APORTES T
>INNER JOIN
>(SELECT cuiL, MAX(PERI) AS MAXPERI,
>MAX(FAMI) AS MAXFAMI
> FROM APORTES
> GROUP BY cuiL) AS U
>ON T.cuiL = U.cuiL AND T.PERI=U.MAXPERI
> WHERE T.cuiT='12345678901'
> 
> In MS SQL Server it lasts 1minute, in PostgreSQL for Windows it lasts
> 40minutes and in PostgreSQL for Linux (FreeBSD) it lasts 20minuts.
> 
> Do you know if there is any way to tune the server or optimize this
> sentence?
> 
> Thanks
>  Sebastián Baioni
> 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and

2006-08-17 Thread Mark Lewis
MSSQL can give either a graphical query plan or a text-based one similar
to PG.  There's no way that I've found to get the equivalent of an
EXPLAIN ANALYZE, but I'm by no means an MSSQL guru.

To get a neat-looking but not very useful graphical query plan from the
Query Analyzer tool, hit .

To get the text-based one, execute "SET SHOWPLAN_ALL ON" which toggles
diagnostic mode on, and each query that you run will return the explain
plan instead of actually running until you execute "SET SHOWPLAN_ALL
OFF".

-- Mark Lewis

On Thu, 2006-08-17 at 09:11 -0400, Tom Lane wrote:
> "Peter Hardman" <[EMAIL PROTECTED]> writes:
> > I wonder whether Paradox and MySQL are just not doing the sort (this
> > seems to be what eats up the time), since the output of the subquery
> > is in fact already in the proper order.
> 
> MSSQL (from the other thread).  I feel fairly safe in assuming that
> MySQL's query optimizer is not nearly in the league to do this query
> effectively.  (I like the theory Arjen mentioned that what you are
> measuring there is the effects of their query cache rather than a
> smart fundamental implementation.)  I wonder whether MSSQL has an
> EXPLAIN equivalent ...
> 
> Anywy, your point about the sort being redundant is a good one, and
> offhand I'd have expected PG to catch that; I'll have to look into
> why it didn't.  But that's not going to explain a 10x speed
> difference, because the sort isn't 90% of the runtime.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] How to get higher tps

2006-08-21 Thread Mark Lewis
Not much we can do unless you give us more info about how you're testing
(pgbench setup), and what you've done with the parameters you listed
below.  It would also be useful if you told us more about your drive
array than just "3Par".  We need to know the RAID level, number/speed of
disks, whether it's got a battery-backed write cache that's turned on,
things like this.

Like Jeff just said, it's likely that you're waiting for rotational
latency, which would limit your maximum tps for sequential jobs based on
the number of disks in your array.  For example, a 2-disk array of 10k
RPM disks is going to max out somewhere around 333 tps (2 * 10000/60).

-- Mark Lewis

 

On Mon, 2006-08-21 at 16:45 -0400, Marty Jia wrote:
> I'm exhausted to try all performance tuning ideas, like following
> parameters
> 
> shared_buffers 
> fsync 
> max_fsm_pages
> max_connections 
> shared_buffers  
> work_mem 
> max_fsm_pages
> effective_cache_size 
> random_page_cost 
> 
> I believe all above have right size and values, but I just can not get
> higher tps more than 300 testd by pgbench
> 
> Here is our hardware
> 
> 
> Dual Intel Xeon 2.8GHz
> 6GB RAM
> Linux 2.4 kernel
> RedHat Enterprise Linux AS 3
> 200GB for PGDATA on 3Par, ext3
> 50GB for WAL on 3Par, ext3
> 
> With PostgreSql 8.1.4
> 
> We don't have i/o bottle neck. 
> 
> Whatelse I can try to better tps? Someone told me I can should get tps
> over 1500, it is hard to believe.
> 
> Thanks
> 
> Marty
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] How to get higher tps

2006-08-22 Thread Mark Lewis
Well, at least on my test machines running gnome-terminal, my pgbench
runs tend to get throttled by gnome-terminal's lousy performance to no
more than 300 tps or so.  Running with 2>/dev/null to throw away all the
detailed logging gives me 2-3x improvement in scores.  Caveat: in my
case the db is on the local machine, so who knows what all the
interactions are.

Also, when you initialized the pgbench db what scaling factor did you
use?  And does running pgbench with -v improve performance at all?

-- Mark

On Tue, 2006-08-22 at 09:19 -0400, Marty Jia wrote:
> Joshua,
> 
> Here is 
> 
> shared_buffers = 8
> fsync = on
> max_fsm_pages = 35
> max_connections = 1000
> work_mem = 65536
> effective_cache_size = 61
> random_page_cost = 3
>  
> Here is pgbench I used:
> 
> pgbench -c 10 -t 1 -d HQDB
> 
> Thanks
> 
> Marty
> 
> -Original Message-
> From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
> Sent: Monday, August 21, 2006 6:09 PM
> To: Marty Jia
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] How to get higher tps
> 
> Marty Jia wrote:
> > I'm exhausted to try all performance tuning ideas, like following 
> > parameters
> > 
> > shared_buffers
> > fsync
> > max_fsm_pages
> > max_connections
> > shared_buffers
> > work_mem
> > max_fsm_pages
> > effective_cache_size
> > random_page_cost
> > 
> > I believe all above have right size and values, but I just can not get
> 
> > higher tps more than 300 testd by pgbench
> 
> What values did you use?
> 
> > 
> > Here is our hardware
> > 
> > 
> > Dual Intel Xeon 2.8GHz
> > 6GB RAM
> > Linux 2.4 kernel
> > RedHat Enterprise Linux AS 3
> > 200GB for PGDATA on 3Par, ext3
> > 50GB for WAL on 3Par, ext3
> > 
> > With PostgreSql 8.1.4
> > 
> > We don't have i/o bottle neck. 
> 
> Are you sure? What does iostat say during a pgbench? What parameters are
> you passing to pgbench?
> 
> Well in theory, upgrading to 2.6 kernel will help as well as making your
> WAL ext2 instead of ext3.
> 
> > Whatelse I can try to better tps? Someone told me I can should get tps
> 
> > over 1500, it is hard to believe.
> 
> 1500? Hmmm... I don't know about that, I can get 470tps or so on my
> measily dual core 3800 with 2gig of ram though.
> 
> Joshua D. Drake
> 
> 
> > 
> > Thanks
> > 
> > Marty
> > 
> > ---(end of 
> > broadcast)---
> > TIP 2: Don't 'kill -9' the postmaster
> > 
> 
> 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Is this way of testing a bad idea?

2006-08-24 Thread Mark Lewis
> Monitoring the processes using top reveals that the total amount of
> memory used slowly increases during the test. When reaching insert
> number 4, or somewhere around that, memory is exhausted, and the the
> systems begins to swap. Each of the postmaster processes seem to use a
> constant amount of memory, but the total memory usage increases all the
> same.

So . . . . what's using the memory?  It doesn't sound like PG is using
it, so is it your Java app?

If it's the Java app, then it could be that your code isn't remembering
to do things like close statements, or perhaps the max heap size is set
too large for your hardware.  With early RHEL3 kernels there was also a
quirky interaction with Sun's JVM where the system swaps itself to death
even when less than half the physical memory is in use.

If its neither PG nor Java, then perhaps you're misinterpreting the
results of top.  Remember that the "free" memory on a properly running
Unix box that's been running for a while should hover just a bit above
zero due to normal caching; read up on the 'free' command to see the
actual memory utilization.

-- Mark

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] PowerEdge 2950 questions

2006-08-24 Thread Mark Lewis
> it's worse than that.  if you need to read something that is not in
> the o/s cache, all the disks except for one need to be sent to a
> physical location in order to get the data.  Thats the basic rule with
> striping: it optimizes for sequential i/o in expense of random i/o.
> There are some optimizations that can help, but not much.  caching by
> the controller can increase performance on writes because it can
> optimize the movement across the disks by instituting a delay between
> the write request and the actual write.
> 
> raid 1 (or 1+x) is the opposite.  It allows the drive heads to move
> independantly on reads when combined with some smart algorithms.
> writes however must involve all the disk heads however.  Many
> controllers do not to seem to optimze raid 1 properly although linux
> software raid seems to.
> 
> A 4 disk raid 1, for example, could deliver four times the seek
> performance which would make it feel much faster than a 4 disk raid 0
> under certain conditions.

I understand random mid-sized seeks (seek to x and read 512k) being slow
on RAID5, but if the read size is small enough not to cross a stripe
boundary, this could be optimized to only one seek on one drive.  Do
most controllers just not do this, or is there some other reason that
I'm not thinking of that would force all disks to seek?

-- Mark

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] sql-bench

2006-09-13 Thread Mark Lewis
The last I checked (years ago), sql-bench was very synthetic (i.e.
reflecting no realistic use case).  It's the sort of test suite that's
useful for database developers when testing the effects of a particular
code change or optimization, but not so applicable to real-world uses.  

Historically the test was also bad for PG because it did nasty things
like 10,000 inserts each in separate transactions because the test was
written for MySQL which at the time didn't support transactions.  Not
sure if that's been fixed yet or not.

Can you provide details about the schema and the queries that are slow?

-- Mark

On Wed, 2006-09-13 at 05:24 -0700, yoav x wrote:
> Hi
> 
> I am trying to run sql-bench against PostgreSQL 8.1.4 on Linux.
> Some of the insert tests seems to be ver slow 
> 
> For example: select_join_in
> 
> Are there any tuning parameters that can be changed to speed these queries? 
> Or are these queries
> especially tuned to show MySQL's stgrenths?
> 
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Optimizing DELETE

2006-09-19 Thread Mark Lewis
You do not have indexes on all of the columns which are linked by
foreign key constraints.

For example, let's say that I had a "scientist" table with a single
column "scientist_name" and another table "discovery" which had
"scientist_name" as a column with a foreign key constraint to the
"scientist" table.

If the system were to try to delete a row from the scientist table, then
it would need to scan the discovery table for any row which referenced
that scientist_name.

If there is an index on the scientist_name column in the discovery
table, this is a fast operation.  In your case however, there most
likely isn't an index on that column, so it needs to do a full table
scan of the discovery table for each row deleted from the scientist
table.

If the discovery table has 100,000 rows, and there are 100 scientists,
then deleting those 100 scientists would require scanning 100,000 * 100
= 10M records, so this sort of thing can quickly become a very expensive
operation.
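
In that hypothetical example the fix is simply (the index name is
arbitrary):

  CREATE INDEX discovery_scientist_name_idx ON discovery (scientist_name);

With that index in place, the integrity check for each deleted scientist
row becomes an index lookup instead of a full scan of discovery.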

Because of this potential for truly atrocious update/delete behavior,
some database systems (SQL Server at least, and IIRC Oracle as well)
either automatically create the index on discovery.scientist_name when
the foreign key constraint is created, or refuse to create the foreign
key constraint if there isn't already an index.

PG doesn't force you to have an index, which can be desirable for
performance reasons in some situations if you know what you're doing,
but allows you to royally shoot yourself in the foot on deletes/updates
to the parent table if you're not careful.

If you have a lot of constraints and want to track down which one is
unindexed, then doing an EXPLAIN ANALYZE of deleting a single row from
the parent table will tell you how long each of the referential
integrity checks takes, so you can figure out which indexes are missing.
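
For example, something like this (wrapped in a transaction so the test
delete doesn't stick; the row value is obviously made up):

  BEGIN;
  EXPLAIN ANALYZE DELETE FROM scientist WHERE scientist_name = 'Curie';
  ROLLBACK;

The per-trigger times in the output point straight at the constraint whose
referencing column is missing an index.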

-- Mark Lewis

On Tue, 2006-09-19 at 15:22 +0200, Ivan Voras wrote:
> I've just fired off a "DELETE FROM table" command (i.e. unfiltered 
> DELETE) on a trivially small table but with many foreign key references 
> (on similar-sized tables), and I'm waiting for it to finish. It's been 
> 10 minutes now, which seems very excessive for a table of 9000 rows on a 
> 3 GHz desktop machine.
> 
> 'top' says it's all spent in USER time, and there's a ~~500KB/s write 
> rate going on. Just before this DELETE, I've deleted data from a larger 
> table (5 rows) using the same method and it finished in couple of 
> seconds - maybe it's a PostgreSQL bug?
> 
> My question is: assuming it's not a bug, how to optimize DELETEs? 
> Increasing work_mem maybe?
> 
> (I'm using PostgreSQL 8.1.4 on FreeBSD 6- amd64)
> 
> (I know about TRUNCATE; I need those foreign key references to cascade)
> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL and sql-bench

2006-09-21 Thread Mark Lewis
Not to offend, but since most of us are PG users, we're not all that
familiar with what the different tests in MySQL's sql-bench benchmark
do.  So you won't get very far by saying "PG is slow on benchmark X, can
I make it faster?", because that doesn't include any of the information
we need in order to help.

Specifics would be nice, including at least the following:

1. Which specific test case(s) would you like to try to make faster?
What do the table schema look like, including indexes and constraints?

2. What strategy did you settle on for handling VACUUM and ANALYZE
during the test?  Have you confirmed that you aren't suffering from
table bloat?

3. What are the actual results you got from the PG run in question?

4. What is the size of the data set referenced in the test run?

-- Mark Lewis

On Thu, 2006-09-21 at 07:52 -0700, yoav x wrote:
> Hi
> 
> After upgrading DBI and DBD::Pg, this benchmark still picks MySQL as the 
> winner (at least on Linux
> RH3 on a Dell 1875 server with 2 hyperthreaded 3.6GHz CPUs and 4GB RAM).
> I've applied the following parameters to postgres.conf:
> 
> max_connections = 500
> shared_buffers = 3000
> work_mem = 10
> effective_cache_size = 30
> 
> Most queries still perform slower than with MySQL. 
> Is there anything else that can be tweaked or is this a limitation of PG or 
> the benchmark?
> 
> Thanks.
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Large tables (was: RAID 0 not as fast as

2006-09-21 Thread Mark Lewis
> So this might be a dumb question, but the above statements apply to the
> cluster (e.g. postmaster) as a whole, not per postgres
> process/transaction correct? So each transaction is blocked waiting for
> the main postmaster to retrieve the data in the order it was requested
> (i.e. not multiple scanners/aio)?

Each connection runs its own separate back-end process, so these
statements apply per PG connection (=process).

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Decreasing BLKSZ

2006-09-25 Thread Mark Lewis
I'm not sure if decreasing BLKSZ is the way to go.  It would allow you
to have more smaller blocks in memory, but the actual coverage of the
index would remain the same; if only 33% of the index fits in memory
with the 8K BLKSZ then only 33% would fit in memory with a 4k BLKSZ.  I
can see where you're going if the tree nodes for all 15 million key
entries fit in memory as well as the most recent nodes for the logtime
nodes lower down in the index; basically trying to make sure that the
"right" 33% of the index is in memory. 

But it seems like it might be more useful to have two indexes, one on
logtime and one on key.  Inserts into the logtime index would be
correlated with your insert order and as such be cache-friendly so
that's not an issue.  The index on just the key column would be at least
as small as the active subset of a combined index, so performance should
be at least as good as you could possibly achieve by reducing BLKSIZE.

PG 8.1 is smart enough to use a bitmap index scan to combine the two
indexes at query time; if that gives you adequate performance then it
would be simpler than reducing BLKSIZE.
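
In other words, something like this (guessing at the table name; key and
logtime are the columns you described):

  CREATE INDEX mytable_logtime_idx ON mytable (logtime);
  CREATE INDEX mytable_key_idx     ON mytable (key);

  -- 8.1 can BitmapAnd those two for the typical query shape:
  SELECT * FROM mytable
   WHERE key = 12345
     AND logtime >= '2006-09-01'
     AND logtime <  '2006-09-08';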

-- Mark Lewis

On Mon, 2006-09-25 at 17:54 -0400, Marc Morin wrote:
> > Would it be possible to change the primary key to 
> > (logtime,key)? This could help keeping the "working window" small.
> 
> No, the application accessing the data wants all the rows between start
> and end time for a particular key value.  
> 
> > 
> > Secondly, the real working set is smaller, as the rows are 
> > all inserted at the end of the table, filling each page until 
> > it's full, so only the last pages are accessed. There's no 
> > relation between the index order, and the order of data on 
> > disk, unless you CLUSTER.
> 
> I'd theorizing that my problem is in updating the index itself and not
> the heap.  Insert order
> Refers to the order by which the applications are inserting the rows and
> as such, the order by
> Which the index is being updated.  This in turn, is causing the b-tree
> to be traverse.  Problem
> Is the working set of blocks at the bottom of the btree is too big for
> my cache.
> 
> > 
> > > Any comment on other affects or gotchas with lowering the size of 
> > > BLKSZ?  Currently, our database is thrashing its cache of blocks we 
> > > we're getting only ~100 inserts/second, every insert results in a 
> > > evict-read-modify operation.
> > 
> > I'm not shure that's the correct diagnosis.
> > 
> > Do you have one transaction per insert? Every transaction 
> > means a forced sync to the disk, so you won't get more than 
> > about 100-200 commits per second, depending on your actual 
> > disk rotation speed.
> 
> No, an insert consists of roughly 10,000+ rows per transaction block.  
> 
> > 
> > To improve concurrency of the number of inserters running in 
> > parallel, try to tweak the config variables commit_delay and 
> > commit_siblings, so you get a higher overall throughput at the 
> > cost of an increased delay per connection, and increase the 
> > number of inserters. With sensible tweaking, the throughput 
> > should scale nearly linearly with the number of backends. :-)
> 
> I don't think this will help us here due to large transactions already.
> 
> > 
> > If feasible for your application, you can also bundle several 
> > log entries into a single transaction. If you're CPU bound, 
> > you can use COPY instead of INSERT or (if you can wait for 
> > 8.2) the new multi-row INSERT to further improve performance, 
> > but I doubt that you're CPU bound.
> 
> > 
> > The only way to "really" get over the sync limit is to have 
> > (at least) the WAL on a battery backed ram / SSD media that 
> > has no "spinning disk"
> > physical limit, or abandon crash safety by turning fsync off.
> 
> Again, the problem is not with WAL writing, which is already on its own
> raid1 disk pair.  The I/O pattern we see is about 1-2% load on WAL and
> 100% load on the array holding the indexes and tables. Throughput is
> very low, something like 150K-200K bytes/second of real rows being
> deposited on the disk.
> 
> The disks are busy seeking all over the disk platter to fetch a block,
> add a single row, then seek to another spot and write back a previously
> dirty buffer.
> 
> > 
> > Thanks,
> > Markus.
> > --
> > Markus Schaber | Logical Tracking&Tracing International AG
> > Dipl. Inf. | Software Development GIS
> > 
> > Fight against software patents in Europe! www.ffii.org 
> > www.nosoftwarepatents.org
> > 
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] BUG #2658: Query not using index

2006-10-03 Thread Mark Lewis
Have you looked into a materialized view sort of approach?  You could
create a table which had assetid as a primary key, and max_ts as a
column.  Then use triggers to keep that table up to date as rows are
added/updated/removed from the main table.

This approach would only make sense if there were far fewer distinct
assetid values than rows in the main table, and would get slow if you
commonly delete rows from the main table or decrease the value for ts in
the row with the highest ts for a given assetid.

-- Mark Lewis

On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
> Thanks Tom, that explains it and makes sense.  I guess I will have to 
> accept this query taking 40 seconds, unless I can figure out another way 
> to write it so it can use indexes.  If there are any more syntax 
> suggestions, please pass them on.  Thanks for the help everyone.
> 
> Graham.
> 
> 
> Tom Lane wrote:
> 
> >Graham Davis <[EMAIL PROTECTED]> writes:
> >  
> >
> >>How come an aggregate like that has to use a sequential scan?  I know 
> >>that PostgreSQL used to have to do a sequential scan for all aggregates, 
> >>but there was support added in version 8 so that aggregates would take 
> >>advantage of indexes.
> >>
> >>
> >
> >Not in a GROUP BY context, only for the simple case.  Per the comment in
> >planagg.c:
> >
> >  * We don't handle GROUP BY, because our current implementations of
> >  * grouping require looking at all the rows anyway, and so there's not
> >  * much point in optimizing MIN/MAX.
> >
> >The problem is that using an index to obtain the maximum value of ts for
> >a given value of assetid is not the same thing as finding out what all
> >the distinct values of assetid are.
> >
> >This could possibly be improved but it would take a considerable amount
> >more work.  It's definitely not in the category of "bug fix".
> >
> > regards, tom lane
> >  
> >
> 
> 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] BUG #2658: Query not using index

2006-10-03 Thread Mark Lewis
Hmmm.  How many distinct assetids are there?
-- Mark Lewis

On Tue, 2006-10-03 at 14:23 -0700, Graham Davis wrote:
> The "summary table" approach maintained by triggers is something we are 
> considering, but it becomes a bit more complicated to implement.  
> Currently we have groups of new positions coming in every few seconds or 
> less.  They are not guaranteed to be in order.  So for instance, a group 
> of positions from today could come in and be inserted, then a group of 
> positions that got lost from yesterday could come in and be inserted 
> afterwards. 
> 
> This means the triggers would have to do some sort of logic to figure 
> out if the newly inserted position is actually the most recent by 
> timestamp.  If positions are ever deleted or updated, the same sort of 
> query that is currently running slow will need to be executed in order 
> to get the new most recent position.  So there is the possibility that 
> new positions can be inserted faster than the triggers can calculate 
> and  maintain the summary table.  There are some other complications 
> with maintaining such a summary table in our system too, but I won't get 
> into those.
> 
> Right now I'm just trying to see if I can get the query itself running 
> faster, which would be the easiest solution for now.
> 
> Graham.
> 
> 
> Mark Lewis wrote:
> 
> >Have you looked into a materialized view sort of approach?  You could
> >create a table which had assetid as a primary key, and max_ts as a
> >column.  Then use triggers to keep that table up to date as rows are
> >added/updated/removed from the main table.
> >
> >This approach would only make sense if there were far fewer distinct
> >assetid values than rows in the main table, and would get slow if you
> >commonly delete rows from the main table or decrease the value for ts in
> >the row with the highest ts for a given assetid.
> >
> >-- Mark Lewis
> >
> >On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
> >  
> >
> >>Thanks Tom, that explains it and makes sense.  I guess I will have to 
> >>accept this query taking 40 seconds, unless I can figure out another way 
> >>to write it so it can use indexes.  If there are any more syntax 
> >>suggestions, please pass them on.  Thanks for the help everyone.
> >>
> >>Graham.
> >>
> >>
> >>Tom Lane wrote:
> >>
> >>
> >>
> >>>Graham Davis <[EMAIL PROTECTED]> writes:
> >>> 
> >>>
> >>>  
> >>>
> >>>>How come an aggregate like that has to use a sequential scan?  I know 
> >>>>that PostgreSQL used to have to do a sequential scan for all aggregates, 
> >>>>but there was support added in version 8 so that aggregates would take 
> >>>>advantage of indexes.
> >>>>   
> >>>>
> >>>>
> >>>>
> >>>Not in a GROUP BY context, only for the simple case.  Per the comment in
> >>>planagg.c:
> >>>
> >>>* We don't handle GROUP BY, because our current implementations of
> >>>* grouping require looking at all the rows anyway, and so there's not
> >>>* much point in optimizing MIN/MAX.
> >>>
> >>>The problem is that using an index to obtain the maximum value of ts for
> >>>a given value of assetid is not the same thing as finding out what all
> >>>the distinct values of assetid are.
> >>>
> >>>This could possibly be improved but it would take a considerable amount
> >>>more work.  It's definitely not in the category of "bug fix".
> >>>
> >>>   regards, tom lane
> >>> 
> >>>
> >>>  
> >>>
> >>
> >>
> 
> 

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] BUG #2658: Query not using index

2006-10-03 Thread Mark Lewis
A few hundred is quite a lot for the next proposal and it's kind of an
ugly one, but might as well throw the idea out since you never know.

Have you considered creating one partial index per assetid?  Something
along the lines of "CREATE INDEX asset_index_N ON asset_positions(ts)
WHERE assetid=N"?  I'd guess that the planner probably wouldn't be smart
enough to use the partial indexes unless you issued a separate query for
each assetid, but each one of those queries should be really fast.  Of
course, this is all assuming that PG knows how to use partial indexes to
satisfy MAX queries; I'm not sure if it does.
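
For the record, the kind of thing I'm picturing (42 is a made-up
assetid value):

  CREATE INDEX asset_positions_ts_42 ON asset_positions (ts)
      WHERE assetid = 42;

  -- one query per assetid; if the planner won't use the partial index
  -- for MAX(), the ORDER BY ... LIMIT 1 form below should:
  SELECT max(ts) FROM asset_positions WHERE assetid = 42;

  SELECT ts
    FROM asset_positions
   WHERE assetid = 42
   ORDER BY ts DESC
   LIMIT 1;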

-- Mark Lewis

On Tue, 2006-10-03 at 14:35 -0700, Graham Davis wrote:
> Not many.  It fluctuates, but there are usually only ever a few hundred 
> at most.  Each assetid has multi-millions of positions though.
> 
> Mark Lewis wrote:
> 
> >Hmmm.  How many distinct assetids are there?
> >-- Mark Lewis
> >
> >On Tue, 2006-10-03 at 14:23 -0700, Graham Davis wrote:
> >  
> >
> >>The "summary table" approach maintained by triggers is something we are 
> >>considering, but it becomes a bit more complicated to implement.  
> >>Currently we have groups of new positions coming in every few seconds or 
> >>less.  They are not guaranteed to be in order.  So for instance, a group 
> >>of positions from today could come in and be inserted, then a group of 
> >>positions that got lost from yesterday could come in and be inserted 
> >>afterwards. 
> >>
> >>This means the triggers would have to do some sort of logic to figure 
> >>out if the newly inserted position is actually the most recent by 
> >>timestamp.  If positions are ever deleted or updated, the same sort of 
> >>query that is currently running slow will need to be executed in order 
> >>to get the new most recent position.  So there is the possibility that 
> >>new positions can be inserted faster than the triggers can calculate 
> >>and  maintain the summary table.  There are some other complications 
> >>with maintaining such a summary table in our system too, but I won't get 
> >>into those.
> >>
> >>Right now I'm just trying to see if I can get the query itself running 
> >>faster, which would be the easiest solution for now.
> >>
> >>Graham.
> >>
> >>
> >>Mark Lewis wrote:
> >>
> >>
> >>
> >>>Have you looked into a materialized view sort of approach?  You could
> >>>create a table which had assetid as a primary key, and max_ts as a
> >>>column.  Then use triggers to keep that table up to date as rows are
> >>>added/updated/removed from the main table.
> >>>
> >>>This approach would only make sense if there were far fewer distinct
> >>>assetid values than rows in the main table, and would get slow if you
> >>>commonly delete rows from the main table or decrease the value for ts in
> >>>the row with the highest ts for a given assetid.
> >>>
> >>>-- Mark Lewis
> >>>
> >>>On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
> >>> 
> >>>
> >>>  
> >>>
> >>>>Thanks Tom, that explains it and makes sense.  I guess I will have to 
> >>>>accept this query taking 40 seconds, unless I can figure out another way 
> >>>>to write it so it can use indexes.  If there are any more syntax 
> >>>>suggestions, please pass them on.  Thanks for the help everyone.
> >>>>
> >>>>Graham.
> >>>>
> >>>>
> >>>>Tom Lane wrote:
> >>>>
> >>>>   
> >>>>
> >>>>
> >>>>
> >>>>>Graham Davis <[EMAIL PROTECTED]> writes:
> >>>>>
> >>>>>
> >>>>> 
> >>>>>
> >>>>>  
> >>>>>
> >>>>>>How come an aggregate like that has to use a sequential scan?  I know 
> >>>>>>that PostgreSQL used to have to do a sequential scan for all aggregates, 
> >>>>>>but there was support added in version 8 so that aggregates would take 
> >>>>>>advantage of indexes.
> >>>>>>  
> >>>>>>
> >>>>>>   
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>Not in a GROUP BY context, only for the simple case.  Per the comment in
> >>>>>planagg.c:
> >>>>>
> >>>>>  * We don't handle GROUP BY, because our current implementations of
> >>>>>  * grouping require looking at all the rows anyway, and so there's not
> >>>>>  * much point in optimizing MIN/MAX.
> >>>>>
> >>>>>The problem is that using an index to obtain the maximum value of ts for
> >>>>>a given value of assetid is not the same thing as finding out what all
> >>>>>the distinct values of assetid are.
> >>>>>
> >>>>>This could possibly be improved but it would take a considerable amount
> >>>>>more work.  It's definitely not in the category of "bug fix".
> >>>>>
> >>>>> regards, tom lane
> >>>>>
> >>>>>
> >>>>> 
> >>>>>
> >>>>>  
> >>>>>
> >>>>   
> >>>>
> >>>>
> >>>>
> >>
> >>
> 
> 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Unsubscribe

2006-10-04 Thread Mark Lewis
I'd prefer to have a short footer link called something like "Mailing
List Page" which would take you to a page where you could subscribe,
unsubscribe, or view the archives.  I think that making the link short
and also making it a quick shortcut away from the archives tips the
scales in terms of utility vs. annoyance.  One of the tips that shows up
in the footers today is just a link to the archives anyway.

-- Mark Lewis

On Wed, 2006-10-04 at 11:28 -0500, Bruno Wolff III wrote:
> On Wed, Oct 04, 2006 at 08:30:03 -0700,
>   "Joshua D. Drake" <[EMAIL PROTECTED]> wrote:
> > 
> > Although I 100% agree with you Bruno, it should be noted that our lists
> > are a closed box for most people. They don't follow what is largely
> > considered standard amongst lists which is to have list information at
> > the bottom of each e-mail.
> 
> There are reasons you don't want to do that. Footers work OK for single
> part email messages. They don't make so much sense in multipart messages.
> You can probably take a crapshoot and add the footer to the first
> text/plain part and not break things. This won't work so well for
> multipart/alternative messages that have text/plain and text/html
> parts. You could also try to insert a footer into the html part, but
> that's a bit trickier since you can't just put it at the end.
> 
> However, since the postgres lists are mostly just using text/plain parts
> for message bodies and there are already footers being used to distribute
> tips, it wouldn't make things significantly worse to add unsubscribe
> information as well.
> 
> I would prefer just making the unsubscribe instructions easy to find on
> the web.
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Simple join optimized badly?

2006-10-11 Thread Mark Lewis
Tom,

I'm interested in the problem of cross-column statistics from a
theoretical perspective.  It would be interesting to sit down and try to
reason out a useful solution, or at very least to understand the problem
better so I can anticipate when it might come and eat me.

From my understanding, the main problem is that if PG knows the
selectivity of n conditions C1,C2,...,Cn then it doesn't know whether
the combined selectivity will be C1*C2*...*Cn (conditions are
independent) or min(C1,C2,...,Cn) (conditions are strictly dependent),
or somewhere in the middle. Therefore, row estimates could be orders of
magnitude off.

I suppose a common example would be a table with a serial primary key
column and a timestamp value which is always inserted as
CURRENT_TIMESTAMP, so the two columns are strongly correlated.  If the
planner guesses that 1% of the rows of the table will match pk>100,
and 1% of the rows of the table will match timestamp > X, then it would
be nice for it to know that if you specify both "pk>100 AND
timestamp>X" that the combined selectivity is still only 1% and not 1% *
1% = 0.01%.
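
A contrived sketch of that case (the table name and constants are made
up):

  CREATE TABLE events (
      pk      serial PRIMARY KEY,
      created timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
  );

  -- pk and created are strongly correlated, so these two predicates
  -- select (nearly) the same rows; the planner multiplies the two
  -- selectivities and estimates far fewer rows than it will really get:
  EXPLAIN ANALYZE
  SELECT * FROM events
   WHERE pk > 990000
     AND created > '2006-10-01';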

As long as I'm sitting down and reasoning about the problem anyway, are
there any other types of cases you're aware of where some form of cross-
column statistics would be useful?  In the unlikely event that I
actually come up with a brilliant and simple solution, I'd at least like
to make sure that I'm solving the right problem :)

Thanks,
Mark Lewis



On Tue, 2006-10-10 at 22:38 -0400, Tom Lane wrote:
> Brian Herlihy <[EMAIL PROTECTED]> writes:
> > What would it take for hints to be added to postgres?
> 
> A *whole lot* more thought and effort than has been expended on the
> subject to date.
> 
> Personally I have no use for the idea of "force the planner to do
> exactly X given a query of exactly Y".  You don't have exactly Y
> today, tomorrow, and the day after (if you do, you don't need a
> hint mechanism at all, you need a mysql-style query cache).
> IMHO most of the planner mistakes we see that could be fixed via
> hinting are really statistical estimation errors, and so the right
> level to be fixing them at is hints about how to estimate the number
> of rows produced for given conditions.  Mind you that's still a plenty
> hard problem, but you could at least hope that a hint of that form
> would be useful for more than one query.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-24 Thread Mark Lewis
Tom,

I've got a similar problem with deletes taking a very long time.  I know
that there are lots of foreign keys referencing this table, and other
foreign keys referencing those tables, etc.  I've been curious, is there
a way to find out how long the foreign key checks take for each
dependent table?

-- Mark Lewis

On Thu, 2005-03-24 at 16:52, Tom Lane wrote:
> Karim Nassar <[EMAIL PROTECTED]> writes:
> > Here is the statement:
> 
> > orfs=# explain analyze DELETE FROM int_sensor_meas_type WHERE
> > id_meas_type IN (SELECT * FROM meas_type_ids);
> >  QUERY PLAN 
> > -
> >  Hash Join  (cost=11.53..42.06 rows=200 width=6) (actual
> > time=1.564..2.840 rows=552 loops=1)
> > ...
> >  Total runtime: 2499616.216 ms
> > (7 rows)
> 
> Notice that the actual join is taking 2.8 ms.  The other ~40 minutes is
> in operations that we cannot see in this plan, but we can surmise are ON
> DELETE triggers.
> 
> > Where do I go from here?
> 
> Look at what your triggers are doing.  My private bet is that you have
> unindexed foreign keys referencing this table, and so each deletion
> forces a seqscan of some other, evidently very large, table(s).
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 7: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Benchmark Data requested --- pgloader CE design ideas

2008-02-07 Thread Mark Lewis
> > I was thinking of not even reading the file content from the controller
> > thread, just decide splitting points in bytes (0..ST_SIZE/4 -
> > ST_SIZE/4+1..2*ST_SIZE/4 etc) and let the reading thread fine-tune by
> > beginning to process input after having read first newline, etc.
> 
> The problem I was pointing out is that if chunk#2 moved forward a few bytes 
> before it started reading in search of a newline, how will chunk#1 know 
> that it's supposed to read up to that further point?  You have to stop #1 
> from reading further when it catches up with where #2 started.  Since the 
> start of #2 is fuzzy until some reading is done, what you're describing 
> will need #2 to send some feedback to #1 after they've both started, and 
> that sounds bad to me.  I like designs where the boundaries between 
> threads are clearly defined before any of them start and none of them ever 
> talk to the others.

I don't think that any communication is needed beyond the beginning of
the threads.  Each thread knows that it should start at byte offset X
and end at byte offset Y, but if Y happens to be in the middle of a
record then just keep going until the end of the record.  As long as the
algorithm for reading past the end marker is the same as the algorithm
for skipping past the beginning marker, all is well.

-- Mark Lewis

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Query slows after offset of 100K

2008-02-14 Thread Mark Lewis
Michael,

Our application had a similar problem, and what we did to avoid having
people click into the middle of 750k records was to show the first page
with forward/back links but no link to go to the middle.  So people
could manually page forward as far as they want, but nobody is going to
sit there clicking next 37k times.  We have several thousand users and
none of them complained about the change.  Maybe it's because at the
same time as we made that change we also improved the rest of the
searching/filtering interface.  But I think that really people don't
need to jump to the middle of the records anyway as long as you have
decent search abilities.

If you wanted to keep your same GUI, one workaround would be to
periodically update a table which maps "page number" to "first unique
key on page".  That decouples the expensive work to generate the page
offsets from the query itself, so if your data changes fairly
infrequently it might be appropriate.  Sort of a materialized-view type
approach.
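
The shape of the idea, reusing the query from your post (the marker
table name is made up, and in practice you'd probably key it by account
and sort column too, refreshed from a periodic job):

  CREATE TABLE object_page_markers (
      page_number      integer PRIMARY KEY,
      first_objectname text NOT NULL   -- sort key of the first row on the page
  );

  -- page 5000 becomes an index range scan instead of a huge OFFSET:
  SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
    FROM object o, object_page_markers m
   WHERE m.page_number = 5000
     AND ( o.deleted = false OR o.deleted IS NULL )
     AND o.accountid = 111
     AND o.objectname >= m.first_objectname
   ORDER BY o.objectname
   LIMIT 20;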

If you can be approximate in your GUI you can do a lot more with this
optimization.  If people don't necessarily need to be able to go
directly to page 372898, but would instead be satisfied with a page
roughly 47% of the way into the massive result set (think of a GUI
slider), then you wouldn't need to update the lookup table very often
even if the data changed frequently, because adding a few thousand
records to a 750k row result set is statistically insignificant.  You
also wouldn't need to store a marker for each page; maybe only 100
markers spread evenly across the result set would be sufficient.

-- Mark Lewis


On Thu, 2008-02-14 at 19:49 +, Michael Lorenz wrote:
> Fair enough, and I did think of this as well.  However, I didn't think this 
> was a viable option in my case, since we're currently allowing the user to 
> randomly access the pages (so $lastkey wouldn't really have any meaning).  
> The user can choose to sort on object ID, name or modification time, and then 
> go straight to any page in the list.  With 750K records, that's around 37K 
> pages.
> 
> Maybe a better way to phrase my question is:  how can I paginate my data on 3 
> different keys which allow random access to any given page, and still get 
> reasonable performance?  Should I just force the user to limit their result 
> set to some given number of records before allowing any paginated access?  Or 
> is it just not practical, period?
> 
> Thanks,
> Michael Lorenz
> 
> 
> > To: [EMAIL PROTECTED]
> > CC: pgsql-performance@postgresql.org
> > Subject: Re: [PERFORM] Query slows after offset of 100K 
> > Date: Thu, 14 Feb 2008 14:08:15 -0500
> > From: [EMAIL PROTECTED]
> > 
> > Michael Lorenz  writes:
> >> My query is as follows:
> >> SELECT o.objectid, o.objectname, o.isactive, o.modificationtime 
> >> FROMobject o 
> >> WHERE  ( o.deleted = false OR o.deleted IS NULL ) 
> >> AND  o.accountid = 111 
> >> ORDER BY 2 
> >> LIMIT 20 OFFSET 1;
> > 
> > This is guaranteed to lose --- huge OFFSET values are never a good idea
> > (hint: the database still has to fetch those rows it's skipping over).
> > 
> > A saner way to do pagination is to remember the last key you displayed
> > and do something like "WHERE key> $lastkey ORDER BY key LIMIT 20",
> > which will allow the database to go directly to the desired rows,
> > as long as you have an index on the key.  You do need a unique ordering
> > key for this to work, though.
> > 
> > regards, tom lane
> > 
> > ---(end of broadcast)---
> > TIP 1: if posting/reading through Usenet, please send an appropriate
> >subscribe-nomail command to [EMAIL PROTECTED] so that your
> >message can get through to the mailing list cleanly
> 
> ---(end of broadcast)---
> TIP 7: You can help support the PostgreSQL project by donating at
> 
> http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] count * performance issue

2008-03-06 Thread Mark Lewis
On Thu, 2008-03-06 at 07:28 -0800, Craig James wrote:
...
> My question is: What do the other databases do that Postgres can't do, and 
> why not?
> 
> Count() on Oracle and MySQL is almost instantaneous, even for very large 
> tables. So why can't Postgres do what they do?
...

I can vouch that Oracle can still take linear time to perform a
count(*), at least in some cases.

I have also seen count(*) be fast in some cases... my understanding is
that they maintain a list of "interested transactions" on a per-relation
basis.  Perhaps they do an optimization based on the index size if there
are no pending DML transactions?

-- Mark



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance


Re: [PERFORM] Hardware question for a DB server

2008-03-12 Thread Mark Lewis
What type of usage does it need to scale for?  How many concurrent
connections?  What size database?  Data warehousing or OLTP-type
workloads?  Ratio of reads/writes?  Do you care about losing data?

One question that's likely going to be important depending on your
answers above is whether or not you're getting a battery-backed write
cache for that ServeRAID-8K.

-- Mark Lewis

On Wed, 2008-03-12 at 19:58 +0100, Pascal Cohen wrote:
> Hello, we plan to buy a dedicated server to host our database.
> Here is the proposal I was given (with a second identical server fro 
> backup using log shipping):
> =
> IBM X3650 (This is a 2U server, can hold 8 Drives)
> 2 x QC Xeon E5450 (3.0GHz 12MB L2 1333MHz 80W)
> 8 x 2GB RAM (16GB total)
> 2.5" SAS Hotswap
> ServeRAID-8K SAS Controller
> 8 x 73GB 15K 2.5" SAS Drive
> CD/DVD Drive
> Remote Supervisor Adapter II Slimline
> Redundant Power
> 4 Year, 24x7 2hour support/warranty
> 
> =
> 
> I would like specialists advices.
> 
> If you need additional details, please let me know.
> 
> Thanks in advance for your help
> 
> Thank you
> 
> Pascal
> 

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL+Hibernate Performance

2008-08-20 Thread Mark Lewis
On Wed, 2008-08-20 at 17:55 +0530, Kranti K K Parisa™ wrote:
> Hi,
> 
> Can anyone suggest the performance tips for PostgreSQL using
> Hibernate.
> 
> One of the queries:
> 
> - PostgreSQL has INDEX concept and Hibernate also has Column INDEXes.
> Which is better among them? or creating either of them is enough? or
> need to create both of them?
> 
> and any more performace aspects ?

Hibernate is a library for accessing a database such as PostgreSQL.  It
does not offer any add-on capabilities to the storage layer itself.  So
when you tell Hibernate that a column should be indexed, all that it
does is create the associated PostgreSQL index when you ask Hibernate to
build the DB tables for you.  This is part of Hibernate's effort to
protect you from the implementation details of the underlying database,
in order to make supporting multiple databases with the same application
code easier.

So there is no performance difference between a PG index and a Hibernate
column index, because they are the same thing.

The most useful Hibernate performance-tuning advice isn't PG-specific at
all; there are just things that you need to keep in mind when developing
for any database to avoid pathologically bad performance.  Those tips are
really beyond the scope of this mailing list; Google is your friend
here.

I've been the architect for an enterprise-class application for a few
years now using PostgreSQL and Hibernate together in a
performance-critical context, and honestly I can't think of one time
that I've been bitten by a PG-specific performance issue (I've hit a lot
of performance issues with Hibernate that affected all databases,
though; you need to know what you're doing to make Hibernate apps run
fast).  If you do run into problems, you can figure out the actual SQL
that Hibernate is issuing and run the normal PostgreSQL EXPLAIN ANALYZE
on it; in my experience the culprit is usually a missing index.
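
For example, once you have the generated SQL (hibernate.show_sql, or
statement logging on the PostgreSQL side), just run it through EXPLAIN
ANALYZE with representative values; the table and column names below
are made up:

  EXPLAIN ANALYZE
  SELECT u.id, u.name, u.email
    FROM users u
   WHERE u.email = 'someone@example.com';

  -- a sequential scan on a big table here usually just means this:
  CREATE INDEX users_email_idx ON users (email);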

-- Mark



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL+Hibernate Performance

2008-08-20 Thread Mark Lewis
The tradeoffs for multiple indexes are more or less as follows:

1. Having the right indexes makes queries faster, often dramatically so.

2. But more indexes make inserts/updates slower, although generally not
dramatically slower.

3. Each index requires disk space.  With several indexes, you can easily
have more of your disk taken up by indexes than by actual data.

I would be careful to only create the indexes you need, but it's
probably worse to have too few indexes than too many.  Depends on your
app though.
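
One more thing worth knowing when picking the set (the table name in
the sketch is made up): a btree index on (column1, column2, column3)
can already satisfy lookups on column1 alone and on column1 + column2,
so separate indexes on just column1 and on (column1, column2) are
probably redundant.  Something like this may be enough:

  CREATE INDEX t_c1_c2_c3_idx ON mytable (column1, column2, column3);
  CREATE INDEX t_c2_idx       ON mytable (column2);
  CREATE INDEX t_c3_idx       ON mytable (column3);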

-- Mark

On Wed, 2008-08-20 at 20:40 +0530, Kranti K K Parisa™ wrote:
> will creating multiple indexes on the same columns affect performance?
>  for example:
> 
> index1 : column1, column2, column3
> index2: column1
> index3: column2,
> index4: column3
> index5: column1,column2
> 
> which means, I am trying to fire SQL queries keeping those columns in the
> where conditions, and the possibilities are like the above.
> 
> if we create such indexes, will it affect performance?
> and what is the best way to go in this case?
> 
> 
> On Wed, Aug 20, 2008 at 8:10 PM, Mark Lewis <[EMAIL PROTECTED]>
> wrote:
> On Wed, 2008-08-20 at 17:55 +0530, Kranti K K Parisa™ wrote:
> 
> 
> > Hi,
> >
> > Can anyone suggest the performance tips for PostgreSQL using
> > Hibernate.
> >
> > One of the queries:
> >
> > - PostgreSQL has INDEX concept and Hibernate also has Column
> INDEXes.
> > Which is better among them? or creating either of them is
> enough? or
> > need to create both of them?
> >
> > and any more performace aspects ?
> 
> 
> 
> Hibernate is a library for accessing a database such as
> PostgreSQL.  It
> does not offer any add-on capabilities to the storage layer
> itself.  So
> when you tell Hibernate that a column should be indexed, all
> that it
> does create the associated PostgreSQL index when you ask
> Hibernate to
> build the DB tables for you.  This is part of Hibernate's
> effort to
> protect you from the implementation details of the underlying
> database,
> in order to make supporting multiple databases with the same
> application
> code easier.
> 
> So there is no performance difference between a PG index and a
> Hibernate
> column index, because they are the same thing.
> 
> The most useful Hibernate performance-tuning advice isn't
> PG-specific at
> all, there are just things that you need to keep in mind when
> developing
> for any database to avoid pathologically bad performance;
> those tips are
> really beyond the scope of this mailing list, Google is your
> friend
> here.
> 
> I've been the architect for an enterprise-class application
> for a few
> years now using PostgreSQL and Hibernate together in a
> performance-critical context, and honestly I can't think of
> one time
> that I've been bitten by a PG-specific performance issue (a
> lot of
> performance issues with Hibernate that affected all databases
> though;
> you need to know what you're doing to make Hibernate apps that
> run fast.
> If you do run into problems, you can figure out the actual SQL
> that
> Hibernate is issuing and do the normal PostgreSQL explain
> analyze on it;
> usually caused by a missing index.
> 
> -- Mark
> 
> 
> 
> 
> --
> Sent via pgsql-performance mailing list
> (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
> 
> 
> 
> -- 
> 
> Best Regards
> Kranti Kiran Kumar Parisa
> M: +91 - 9391 - 438 - 738
> +91 - 9849 - 625 - 625
> 
> 

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL+Hibernate Performance

2008-08-20 Thread Mark Lewis
Yes, we use connection pooling.  As I recall Hibernate ships with c3p0
connection pooling built-in, which is what we use.  We were happy enough
with c3p0 that we ended up moving our other non-hibernate apps over to
it, away from DBCP.

pgpool does connection pooling at the socket level rather than at the
local library level, so it's really a very different thing.  If your app is
the only thing talking to this database, and you don't have a
multi-database configuration, then it will be easier for you to use a
Java-based connection pooling library like c3p0 or DBCP than to use
pgpool.

-- Mark

On Wed, 2008-08-20 at 20:32 +0530, Kranti K K Parisa™ wrote:
> Hi Mark,
> 
> Thank you very much for the information. I will analyse the DB
> structure and create indexes on PG directly.
> Are you using any connection pooling like DBCP? or PG POOL?
> 
> Regards, KP
> 
> 
> On Wed, Aug 20, 2008 at 8:05 PM, Mark Lewis <[EMAIL PROTECTED]>
> wrote:
> 
> On Wed, 2008-08-20 at 17:55 +0530, Kranti K K Parisa™ wrote:
> > Hi,
> >
> > Can anyone suggest the performance tips for PostgreSQL using
> > Hibernate.
> >
> > One of the queries:
> >
> > - PostgreSQL has INDEX concept and Hibernate also has Column
> INDEXes.
> > Which is better among them? or creating either of them is
> enough? or
> > need to create both of them?
> >
> > and any more performace aspects ?
> 
> 
> Hibernate is a library for accessing a database such as
> PostgreSQL.  It
> does not offer any add-on capabilities to the storage layer
> itself.  So
> when you tell Hibernate that a column should be indexed, all
> that it
> does create the associated PostgreSQL index when you ask
> Hibernate to
> build the DB tables for you.  This is part of Hibernate's
> effort to
> protect you from the implementation details of the underlying
> database,
> in order to make supporting multiple databases with the same
> application
> code easier.
> 
> So there is no performance difference between a PG index and a
> Hibernate
> column index, because they are the same thing.
> 
> The most useful Hibernate performance-tuning advice isn't
> PG-specific at
> all, there are just things that you need to keep in mind when
> developing
> for any database to avoid pathologically bad performance;
> those tips are
> really beyond the scope of this mailing list, Google is your
> friend
> here.
> 
> I've been the architect for an enterprise-class application
> for a few
> years now using PostgreSQL and Hibernate together in a
> performance-critical context, and honestly I can't think of
> one time
> that I've been bitten by a PG-specific performance issue (a
> lot of
> performance issues with Hibernate that affected all databases
> though;
> you need to know what you're doing to make Hibernate apps that
> run fast.
> If you do run into problems, you can figure out the actual SQL
> that
> Hibernate is issuing and do the normal PostgreSQL explain
> analyze on it;
> usually caused by a missing index.
> 
> -- Mark
> 
> 
> 
> -- 
> 
> Best Regards
> Kranti Kiran Kumar Parisa
> M: +91 - 9391 - 438 - 738
> +91 - 9849 - 625 - 625
> 
> 

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


  1   2   >