Quoting Greg Sabino Mullane <[EMAIL PROTECTED]>:
> > Using COPY ... FROM STDIN via the Perl DBI (DBD::Pg) interface,
> > I accidentally strung together several \n-terminated input lines,
> > and sent them to the server with a single "putline".
> ...
> > So. Is it a feechur? Worth stress-testing?
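(As far as I know, it's expected behaviour: the server parses the COPY data
stream on newlines and doesn't care how the client chunks it into putline
calls. At the SQL level the same batch looks like the sketch below; table and
columns are hypothetical, fields tab-separated.)

    COPY loadtest (id, name) FROM STDIN;
    1	alpha
    2	beta
    3	gamma
    \.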
I have a performance problem; I'd like any suggestions on where to continue
investigation.
A set of insert-only processes seems to serialize itself. :-(
The processes appear to be blocked on disk IO, probably against the table
drive rather than the pg_xlog drive.
Each process is inserting a block
Quoting Tom Lane <[EMAIL PROTECTED]>:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > A friend of mine has an application where he's copying in 4000 rows at a
> > time into a table that has about 4M rows. Each row is 40-50 bytes. This
> > is taking 25 seconds on a dual PIII-1GHz with 1G of RAM
Quoting Bill Chandler <[EMAIL PROTECTED]>:
> ... The normal activity is to delete 3-5% of the rows per day,
> followed by a VACUUM ANALYZE.
...
> However, on occasion, deleting 75% of rows is a
> legitimate action for the client to take.
> > In case nobody else has asked: is your max_fsm_pages
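(For anyone following along: on releases of that era, a database-wide
VACUUM VERBOSE ends with a free-space-map summary you can check against the
setting; the exact wording varies by version, but it is roughly:)

    VACUUM VERBOSE;
    -- ...
    -- INFO:  free space map: 408 relations, 51305 pages stored;
    --        64000 total pages needed
    SHOW max_fsm_pages;

If the "pages needed" figure exceeds max_fsm_pages, dead space leaks past
VACUUM and the table bloats.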
Quoting Alvaro Herrera <[EMAIL PROTECTED]>:
> One further question is: is this really a meaningful test? I mean, in
> production are you going to query 30 rows regularly? And is the
> system always going to be used by only one user? I guess the question
> is if this big select is representative
Quoting Andrew Dunstan <[EMAIL PROTECTED]>:
> After some more experimentation, I'm wondering about some sort of
> adaptive algorithm, a bit along the lines suggested by Marko
> Ristola, but limited to 2 rounds.
>
> The idea would be that we take a sample (either of fixed size, or
> some sm
Quoting Richard Rowell <[EMAIL PROTECTED]>:
> I've ported enough of my company's database to Postgres to make
> warehousing on PG a real possibility. I thought I would toss my
> data
> migration architecture ideas out for the list to shoot apart..
>
[...]
Not much feedback required.
Yes, droppi
Quoting Josh Berkus:
> > >Perhaps I can save you some time (yes, I have a degree in Math). If I
> > >understand correctly, you're trying extrapolate from the correlation
> > >between a tiny sample and a larger sample. Introducing the tiny sample
> > >into any decision can only produce a less accu
> Steven Rosenstein <[EMAIL PROTECTED]> writes:
> > My question is, are there any advantages, drawbacks, or outright
> > restrictions to using multiple simultaneous COPY commands to load
> data into
> > the same table?
Do you mean, multiple COPY commands (connections) being putline'd from
the same
Quoting Markus Schaber <[EMAIL PROTECTED]>:
> Hi, Josh,
>
> Josh Berkus wrote:
>
> > Yes, actually. We need 3 different estimation methods:
> > 1 for tables where we can sample a large % of pages (say, >= 0.1)
> > 1 for tables where we sample a small % of pages but are "easily
> estimated"
> >
Quoting Josh Berkus:
> Mischa,
>
> > Okay, although given the track record of page-based sampling for
> > n-distinct, it's a bit like looking for your keys under the
> streetlight,
> > rather than in the alley where you dropped them :-)
>
> Bad analogy, but funny.
Bad analogy? Page-
Quoting David Roussel <[EMAIL PROTECTED]>:
> > COPY invokes all the same logic as INSERT on the server side
> > (rowexclusive locking, transaction log, updating indexes, rules).
> > The difference is that all the rows are inserted as a single
> > transaction. This reduces the number of fsync's on
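(Concretely, a sketch -- table hypothetical: each standalone INSERT commits,
and therefore fsyncs, on its own, while the batched forms pay one commit for
the lot.)

    INSERT INTO t VALUES (1);   -- fsync
    INSERT INTO t VALUES (2);   -- fsync

    BEGIN;
    INSERT INTO t VALUES (1);
    INSERT INTO t VALUES (2);
    COMMIT;                     -- one fsync for both

    COPY t FROM STDIN;          -- behaves like the single transaction
    1
    2
    \.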
Quoting Kris Jurka <[EMAIL PROTECTED]>:
> On Wed, 4 May 2005, Mischa Sandberg wrote:
>
> > Copy makes better use of the TCP connection for transmission. COPY
> uses
> > the TCP connection like a one-way pipe. INSERT is like an RPC: the
> > sender has
After reading the comparisons between Opteron and Xeon processors for Linux,
I'd like to add an Opteron box to our stable of Dells and Sparcs, for
comparison.
IBM, Sun and HP have their fairly pricey Opteron systems.
The IT people are not thrilled about unsupported purchases off eBay.
Anyone care to
Thanks to everyone for their pointers to suppliers of Opteron systems.
The system I'm pricing is under a tighter budget than a production machine,
because it will be for perftests. Our customers tend to run on Dells but
occasionally run on (Sun) Opterons.
Quoting Russell Smith <[EMAIL PROTECTED]>:
> On Mon, 9 May 2005 11:49 am, Dan Harris wrote:
> > On May 8, 2005, at 6:51 PM, Russell Smith wrote:
> [snip]
> > select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat
> > FROM em
> > JOIN ea ON em.incidentid = ea.incidentid --- sligh
Quoting "Jim C. Nasby" <[EMAIL PROTECTED]>:
> I'm not really familiar enough with hash indexes to know if this
> would
> work, but if the maximum bucket size was known you could use that to
> determine a maximum range of buckets to look at. In some cases, that
> range would include only one bucket
Quoting [EMAIL PROTECTED]:
> > exploring the option of buying 10 cheapass
> > machines for $300 each. At the moment, that $300 buys you, from
> Dell, a
> > 2.5Ghz Pentium 4
>
> Buy cheaper ass Dells with an AMD 64 3000+. Beats the crap out of
> the 2.5
> GHz Pentium, especially for PostgreSQL.
Quoting "Jim C. Nasby" <[EMAIL PROTECTED]>:
> Well, in a hash-join right now you normally end up feeding at least
> one
> side of the join with a seqscan. Wouldn't it speed things up
> considerably if you could look up hashes in the hash index instead?
You might want to google on "grace hash" and
Quoting Alex Stapleton <[EMAIL PROTECTED]>:
> This is why I mention partitioning. It solves this issue by storing
> different data sets on different machines under the same schema.
> These separate chunks of the table can then be replicated as well for
> data redundancy and so on. MySQL are wor
Quoting Tom Lane <[EMAIL PROTECTED]>:
> Mischa Sandberg <[EMAIL PROTECTED]> writes:
> > The PG hash join is the simplest possible: build a hash table in
> memory, and match an input stream against it.
>
> [ raised eyebrow... ] Apparently you've not read the
Quoting Christopher Kings-Lynne <[EMAIL PROTECTED]>:
> > This is why I mention partitioning. It solves this issue by storing
> > different data sets on different machines under the same schema.
> > These separate chunks of the table can then be replicated as well for
> > data redundancy and so o
Quoting Christopher Kings-Lynne <[EMAIL PROTECTED]>:
>
> >>*laff*
> >>Yeah, like they've been working on views for the last 5 years, and
> >>still haven't released them :D :D :D
> >
> > ?
> > http://dev.mysql.com/doc/mysql/en/create-view.html
> > ...for MySQL 5.0.1+ ?
>
> Give me a call when i
Quoting "Jim C. Nasby" <[EMAIL PROTECTED]>:
> To the best of my knowledge no such work has been done. There is a
> project (whose name escapes me) that lets you run queries against a
> remote postgresql server from a postgresql connection to a different
> server, which could serve as the basis for
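(The project being reached for is presumably contrib/dblink; a minimal
sketch, with the connection string, table, and result columns all
hypothetical:)

    -- dblink ships in contrib; load it into the database first
    SELECT *
    FROM dblink('host=remotebox dbname=otherdb',
                'SELECT id, name FROM remote_table')
         AS t(id integer, name text);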
Quoting Christopher Kings-Lynne <[EMAIL PROTECTED]>:
> > Another trick you can use with large data sets like this when you
> want
> > results
> > back in seconds is to have regularly updated tables that aggregate
> the data
> > along each column normally aggregated against the main data set.
>
>
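(A minimal sketch of such a regularly updated aggregate table, all names
hypothetical; a cron job re-runs the refresh:)

    CREATE TABLE sales_daily AS
      SELECT day, region, sum(amount) AS total, count(*) AS n
      FROM sales GROUP BY day, region;
    -- periodic refresh:
    TRUNCATE sales_daily;
    INSERT INTO sales_daily
      SELECT day, region, sum(amount), count(*)
      FROM sales GROUP BY day, region;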
Was curious why you pointed out SQL-MED as a SQL-standard approach to
federated servers. Always thought of it as covering access to non-SQL
data, the way the lo_* interface works; as opposed to meshing compatible
(to say nothing of identical) SQL servers. Just checked Jim Melton's
last word on that
Quoting Guillaume Smet <[EMAIL PROTECTED]>:
> Hi,
>
> We have a performance problem with a particular query.
...
I have to say it: this was the best-laid-out set of details behind a
problem I've ever seen on this list; I'm going to try to live up to it the
next time I have a problem of my own.
Quoting Tom Lane <[EMAIL PROTECTED]>:
> "Mindaugas Riauba" <[EMAIL PROTECTED]> writes:
> > ... So contents of database changes very fast. Problem is that
> when
> > pg_autovacuum does vacuum those changes slows down too much.
>
> The "vacuum cost" parameters can be adjusted to make vacuums fired
Quoting Bernd Jagla <[EMAIL PROTECTED]>:
> ... the speed of the head of the HD is actually
> limiting. Also, I only experimented with RAID5, and heard that
> RAID10 will be good for reading but not writing.
Au contraire. RAID5 is worse than RAID10 for writing, because it has the
extra implicit reads of old data and old parity on every small write.
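(The standard back-of-envelope for a single random block update:)

    RAID5:  read old data + read old parity + write data + write parity = 4 I/Os
    RAID10: write the block to both mirrors = 2 I/Os, issued in parallel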
> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:pgsql-performance-
> [EMAIL PROTECTED] On Behalf Of John Parnefjord
> Sent: Tuesday, January 30, 2007 2:05 AM
> Subject: Re: [PERFORM] Tuning
> EnterpriseDB advocates: 250 KB + 8.2 KB * shared_buffers + 14.2 kB *
> max_connections up t
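(Plugging illustrative values into that formula -- shared_buffers = 50000
and max_connections = 100, both assumed for the arithmetic, not from the
original post:)

    250 KB + 8.2 KB * 50000 + 14.2 KB * 100
      = 250 KB + 410000 KB + 1420 KB
      ~ 402 MB of shared memory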
Dan Gorman wrote:
All,
I might be completely crazy here, but it seems every other database
exposes select query stats. Postgres only exposes
updates/deletes/inserts. Is there something I am missing here?
Perhaps.
You can EXPLAIN ANALYZE a SELECT, just like i/u/d -- but then you
don't get th
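(The per-query timing route looks like this; table and predicate
hypothetical:)

    EXPLAIN ANALYZE
    SELECT count(*) FROM orders
    WHERE placed > now() - interval '1 day';

That reports actual runtime and row counts per plan node for that one
execution; what it does not give you is a running counter of SELECTs, which
seems to be what Dan is after.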
Dan Gorman wrote:
What I am looking for is: our DB is doing X selects a minute.
What specifically would you like to measure?
Duration for specific queries?
Queries in an app for which you have no source?
There may be a way to get what you want by other means ...
Details?
I gather you cannot j
Alvaro Herrera wrote:
Yeah, the problem seems underspecified.
So, Dan, the question is, what are you trying to measure?
This might be a statistic that management has always been given,
for Oracle, and you need to produce the "same" number for PostgreSQL.
If not, it's hard to figure out what a
Andrew Sullivan wrote:
On Thu, Jun 01, 2006 at 01:54:08PM +0200, Antoine wrote:
Hi,
We just don't seem to be getting much benefit from autovacuum. Running
a manual vacuum seems to still be doing a LOT, which suggests to me
that I should either run a cron job and disable autovacuum, or just
run a
Joshua D. Drake wrote:
- in our env, clients occasionally hit max_connections. This is a
known and (sort of) desired pushback on load. However, that sometimes
knocks pg_autovacuum out.
That is when you use:
superuser_reserved_connections
Blush. Good point. Though, when we hit max_connections
Jim C. Nasby wrote:
...
Actually, in 8.1.x I've seen some big wins from greatly increasing the
amount of shared_buffers, even as high as 50% of memory, thanks to the
changes made to the buffer management code. ...
Anyone else run into a gotcha that one of our customers ran into?
PG 7.4.8 runnin
Tom Lane wrote:
Mischa Sandberg <[EMAIL PROTECTED]> writes:
vmstat showed that it was swapping like crazy.
Dropped shared_buffers back down again.
Swapping stopped.
Does Solaris have any call that allows locking a shmem segment in RAM?
Yes, mlock(). But want to understand what'
Jim C. Nasby wrote:
On Tue, Jun 13, 2006 at 03:21:34PM -0700, Mischa Sandberg wrote:
Raised shared_buffers to 16000 (128MB).
DB server dropped to a CRAWL.
vmstat showed that it was swapping like crazy.
Dropped shared_buffers back down again.
Swapping stopped.
What's sort_mem set
Jim C. Nasby wrote:
On Tue, Jun 13, 2006 at 04:20:34PM -0700, Mischa Sandberg wrote:
Jim C. Nasby wrote:
What's sort_mem set to? I suspect you simply ran the machine out of
memory.
8192 (8MB). No issue when shared_buffers was 2000; same apps always.
So if all 50 backends were running a
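(Back-of-envelope, assuming one active sort per backend:)

    50 backends * 8 MB sort_mem         = 400 MB private sort memory
    shared_buffers 16000 * 8 KB pages   = 128 MB shared
                                          ~ 528 MB before the OS cache gets any

And a single complex query can run several sort/hash steps at once, each
entitled to its own sort_mem, so the true ceiling is higher still.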
Tom Lane wrote:
Mischa Sandberg <[EMAIL PROTECTED]> writes:
Tom Lane wrote:
Does Solaris have any call that allows locking a shmem segment in RAM?
Yes, mlock(). But want to understand what's going on before patching.
Sure, but testing it with mlock() might help you understand w
Mark Lewis wrote:
On Thu, 2006-06-15 at 14:05 -0400, John Vincent wrote:
Now I've been told by our DBA that we should have been able to wholly
satisfy that query via the indexes.
DB2 can satisfy the query using only indexes because DB2 doesn't do
MVCC.
You can get pretty much the same effect
PG 8.0.3 is choosing a bad plan for a query.
I'm going to force the plan (by making one join into a function).
I'd like to know if this is unexpected; in general,
can PG see that a join on a grouped-by field
can be pushed down into the query as an indexable filter?
The query below joins a t
Tom Lane wrote:
Mischa Sandberg <[EMAIL PROTECTED]> writes:
can PG see that a join on a grouped-by field
can be pushed down into the query as an indexable filter?
No. The GROUP BY serves as a partial optimization fence. If you're
concerned about the speed of this query, I recom
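(One manual workaround, as a sketch with hypothetical tables: push the
restriction inside the grouped subquery yourself, since the planner won't.)

    -- planner scans and aggregates all of u before joining:
    SELECT t.k, s.cnt
    FROM t JOIN (SELECT k, count(*) AS cnt
                 FROM u GROUP BY k) s ON s.k = t.k
    WHERE t.k = 42;

    -- rewritten so the aggregate touches only the one group:
    SELECT t.k, s.cnt
    FROM t JOIN (SELECT k, count(*) AS cnt
                 FROM u WHERE k = 42 GROUP BY k) s ON s.k = t.k
    WHERE t.k = 42;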
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bucky
Jordan
Sent: Thursday, October 12, 2006 2:19 PM
To: josh@agliodbs.com; Jim C. Nasby
Cc: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org
Subject: Re: [HACKERS] [PERFORM] Hints proposal
>
Don't know about Oracle, but select-distinct in MSSQL2K will indeed throw
away duplicates, which chops the CPU time. Very easy to see in the graphic
query plan, both in terms of CPU and the number of rows retrieved from a
single-node or nested-loop subtree. Definitely a worthwhile optimization.
"T
I'm trying to make a (qua-technical, qua-business) case for switching from
MS SQL, and one of the types of query that really doesn't sit well with MS
SQL2K is:
-- All fields integers or equivalent.
-- Table T(k, x: nonkey fields...)
-- Table U(k, a, z: m)-- for each value of (k) a set of non-i
Usually, when you post a request like this, you should provide
something a little more concrete (the CREATE TABLE statement for that table,
with
Since you didn't, I'll posit something that sounds like what
you're using, and take a stab at your problem.
TABLE Prices (
stock
VARCHAR(
""Andy Ballingall"" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> On another thread, (not in this mailing list), someone mentioned that
there
> are a class of databases which, rather than caching bits of database file
> (be it in the OS buffer cache or the postmaster workspace), co
Regarding Raid5 at all, you might want to look at http://www.baarf.com
""Stephane Tessier"" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> I think with your help guys I'll do it!
>
> I'm working on it!
>
> I'll work on theses issues:
>
> we have space for more RAM (we use 2 gigs on p
Coming from the MSSQL world, I'm used to the first step in optimization
to be, choose your clustered index and choose it well.
I see that PG has a one-shot CLUSTER command, but doesn't support
continuously-updated clustered indexes.
What I infer from newsgroup browsing is, such an index is impossible
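(The one-shot command, for reference -- names hypothetical; the syntax of
that era put the index first, while newer releases spell it
CLUSTER orders USING orders_by_date:)

    CLUSTER orders_by_date ON orders;  -- physically reorder once
    ANALYZE orders;
    -- the ordering decays as rows change; re-run periodically:
    CLUSTER orders;                    -- reuses the remembered index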
Ummm ... not quite. In MSSQL/Sybase/Oracle, a clustered index maintains
its space saturation as part of each update operation. High activity
does indeed result in less-full pages (typically 60-80% full for tables
with heavy deletions or rowsize changes). To bring the percentage back
up, you run
I think you've probably fingered the kicker of why PG doesn't have this
kind of clustering already. Hence perhaps the need for other approaches
to the issue (the disk-IO efficiency of reading groups of rows related
by a common key) that other DB's (with in-place update) address with
synchronous c
This discussion is starting to sound like the split in HEAP memory
management evolution, into garbage-collecting (e.g. Java) and
non-garbage-collecting (e.g. C++).
Reclamation by GC's these days has become seriously sophisticated.
CLUSTER resembles the first generation of GC's, which were
singl
Sheer nitpick here...
A B-tree is where the records (data) live at all levels of the tree;
a B+ tree is where the records are only at the leaf level.
That's what Knuth calls them, anyway.
Clustered indexes for all known dbs are true B+ trees.
Nonclustered indexes could be B-trees (probably aren't),
s
J. Andrew Rogers wrote:
On Thu, 2004-08-26 at 12:30, Magnus Hagander wrote:
IIRC, SQL Server always creates clustered indexes by default for primary
keys.
That would surprise me actually.
Yaz, it should. It doesn't ALWAYS create a clustered (unique) index for
primary keys, but clustered is the defau
Mischa Sandberg wrote:
Coming from the MSSQL world, I'm used to the first step in optimization
to be, choose your clustered index and choose it well.
I see that PG has a one-shot CLUSTER command, but doesn't support
continuously-updated clustered indexes.
What I infer from newsgroup b
Mark Cotner wrote:
Hi all,
I had a difficult time deciding which list to post
this to, so please forgive me if this list doesn't
perfectly match my questions. My decision will not
solely be based on performance, but it is the primary
concern. I would be very appreciative if you all
could comment
Damien Dougan wrote:
Basically we have a number of tables, which are exposed as 2 public
views (say PvA and PvB). For each row in PvA, there are a number of
related rows in PvB (this number is arbitrary, which is one of the
> reasons why it can't be expressed as additional columns in PvA - so we
reall
Simon Riggs wrote:
Jim C. Nasby
On Mon, Sep 13, 2004 at 11:07:35PM +0100, Simon Riggs wrote:
PostgreSQL's functionality is in many ways similar to Oracle
Partitioning.
Loading up your data in many similar tables, then creating a view like:
CREATE VIEW BIGTABLE (idate, col1, col2, col3...) AS
SELECT
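(The truncated view presumably continues as a UNION ALL over the per-period
tables; a minimal sketch with hypothetical table names:)

    CREATE VIEW bigtable (idate, col1, col2, col3) AS
      SELECT idate, col1, col2, col3 FROM table_2004q1
      UNION ALL
      SELECT idate, col1, col2, col3 FROM table_2004q2
      UNION ALL
      SELECT idate, col1, col2, col3 FROM table_2004q3;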
Googling 'upsert' (an Oraclism, I believe) will get you hits on Oracle
and DB2's implementation of MERGE, which amounts to what is described
below (one mass UPDATE...FROM, one mass INSERT...WHERE NOT EXISTS).
No, you shouldn't iterate row-by-row through the temp table.
Whenever possibl
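(The set-oriented version as a sketch, all names hypothetical; note it is
not safe against concurrent writers without locking:)

    BEGIN;
    UPDATE target t SET val = s.val
      FROM staging s WHERE s.k = t.k;
    INSERT INTO target (k, val)
      SELECT s.k, s.val FROM staging s
      WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.k = s.k);
    COMMIT;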
Our product (Sophos PureMessage) runs on a Postgres database.
Some of our Solaris customers have Oracle licenses, and they've
commented on the performance difference between Oracle and Postgresql
on such boxes. In-house, we've noticed the 2:1 (sometimes 5:1)
performance difference in inserting rows
Tom Lane wrote:
Ole Langbehn <[EMAIL PROTECTED]> writes:
What do you think about the idea of a "UniqueSort" which would do
sort+unique in one pass?
This is what Oracle does, and it is quite fast with it...
Hashing is at least as fast, if not faster.
regards, tom lane
I got good mileage in a
This may sound more elaborate than it's worth, but I don't know of
a better way to avoid a table scan.
You want to index on a computed value that is a common prefix of your
FROM and TO fields.
The next step is to search on a fixed SET of prefixes of different
lengths. For example, some of your ranges
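(A sketch of the technique, every name hypothetical: keep an expression
index on a fixed-length prefix of the FROM field, probe each candidate
prefix length with equality, then verify containment. A prefix of a given
length only helps for ranges whose FROM and TO actually share it, which is
why several lengths get probed.)

    CREATE INDEX ranges_pfx8 ON ranges (substr(from_addr, 1, 8));
    -- probe the 8-character prefix for a key, then verify:
    SELECT * FROM ranges
    WHERE substr(from_addr, 1, 8) = substr('203.000.113.077', 1, 8)
      AND '203.000.113.077' BETWEEN from_addr AND to_addr;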