Tom Lane <[EMAIL PROTECTED]> writes:
> Gregory Stark <[EMAIL PROTECTED]> writes:
> > There are various attempts at providing better timing infrastructure at low
> > overhead but I'm not sure what's out there currently. I expect to do this
> >
rhaps it's possible to finesse that issue by providing
either the Limit or Sort node with pointers to other nodes.
(Incidentally I'm not sure where 2-5x comes from. It's entirely dependent on
your data distribution. It's not hard to come up with distributions where it's
1000x
Tom Lane <[EMAIL PROTECTED]> writes:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Gregory Stark wrote:
> >> (Incidentally I'm not sure where 2-5x comes from. It's entirely dependent
> >> on
> >> your data distribution. It's not
case I would 100% agree with your conclusion and
strongly recommend using varchar. The only reason I would think of using char
is when the data should always be the same length, like a SSN or md5hash or
something like that. In which case it's purely for the self-documenting
notational convenience, n
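That rule of thumb can be sketched as follows (table and column names are
invented for illustration):

```sql
-- Illustrative only: reserve char(n) for genuinely fixed-length values
CREATE TABLE person (
    full_name varchar(200),  -- variable-length data: use varchar
    ssn       char(9),       -- a US SSN is always 9 digits: char documents that
    pw_md5    char(32)       -- an md5 hash is always 32 hex characters
);
```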
he
filesystem the flexibility to place the chunks efficiently.
In the case of a performance-critical file like the WAL that's always read
sequentially it may be to our advantage to defeat this technique and force it
to be allocated sequentially. I'm not sure whether any filesystems provi
rst n records according to some sort key without actually sorting
the records. That might make it more worthwhile.
In short. Yes, there are a lot of optimizations possible around partitioned
tables that we don't do either because it's not clear how to tell when they'r
t all the
constraints were mutually exclusive and covered ascending ranges then it could
avoid doing the extra sort. Hm...
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
possible to add features like this in the
future.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
an type is for joins which isn't going to be what you need.
But the queries as written here would be just as fast or faster to do one big
sort as they would be to do separate sorts and merge the results.
You might want to do it the way you describe if there were selective WHERE
clauses that you
actually be a problem. It would cause transactions that started
before the cluster (but didn't access the table before the cluster) to not see
any records after the cluster.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
ck table succeeds, you truncate it and
commit, then the old transaction gets around to looking at the table.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
't be me the one to shoot you.
>
> There have been noises towards making the ANALYZE portion use the same
> scan that VACUUM already does, but nobody has written the code (it would
> be useful for some kinds of stats).
I think it does for the count of total records in the table.
But
only if you're counting transactions/sec in a single
session. You can get much more if you have many sessions since they can all
commit together in a single disk i/o.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
R upper(b.imie) = ?)
AND (? = '' OR b.pesel = ?)
AND (? = '' OR upper(trim(b.downseria)) = ?)
AND (? = '' OR b.dowosnr = ?)
AND (? = 0 OR b.typkred = ?)
AND k.datazwrot IS NULL
If this is the only query or a particularly important query you could consi
10291 loops=1)
> -> Index Scan using m_pkey on m (cost=0.00..0.80 rows=1 width=7)
> (actual time=0.009..0.009 rows=0 loops=10291)
That's not discounting the nested loop for cache effect at all!
What is your effective_cache_size for this?
--
Gregory Stark
E
performance. So even with 10 clients
you should expect to see 5*120 tps on a 7200 rpm drive and 5*250 tps on a
15krpm drive.
Heikki posted a patch that experimented with fixing this. Hopefully it'll be
fixed for 8.4.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
governs perceptions, not average case.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
"Greg Smith" <[EMAIL PROTECTED]> writes:
> On Tue, 22 May 2007, Gregory Stark wrote:
>
>> However as mentioned a while back in practice it doesn't work quite right and
>> you should expect to get 1/2 the expected performance. So even with 10
>> clien
ber for each length of
> search-string (afaik).
I don't think that's true. Postgres calculates the lower and upper bound
implied by the search pattern and then uses the histogram to estimate how
selective that range is. It's sometimes surprisingly good but obviously it
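The range implied by a prefix pattern can be sketched like this (table and
column names are hypothetical):

```sql
-- For  col LIKE 'abc%'  the planner derives bounds roughly equivalent to
-- the range below, and estimates its selectivity from the histogram:
SELECT count(*) FROM t WHERE col >= 'abc' AND col < 'abd';
```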
l batch
of short transactions and committing each one as a separate transaction. In
that case you would want a drive that can fsync fast which either means a
battery backed cache or 15kRPM drive. It doesn't necessarily mean you need a
big raid array though.
--
Gregory Stark
EnterpriseDB
e reads in idle cycles.
I don't think you normally do it for performance though since there's more to
be gained by using larger stripes. In theory you should get the same boost on
reads as widening your stripes but of course you get no benefit on writes. And
I'm not sure raid con
ler cache -- that was hell to track down).
Checksums aren't even enough in that case as you'll happily generate a
checksum for the bad data before storing it...
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
move duplicates which often involves gathering all the
records and performing a big sort and lots of extra work. UNION ALL is much
faster and can start returning records right away.
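A minimal sketch of the difference (table names invented):

```sql
-- UNION must gather and de-duplicate all rows before returning any;
-- UNION ALL can stream rows as each branch produces them.
SELECT id FROM orders_2006
UNION ALL                     -- duplicates kept, no big sort needed
SELECT id FROM orders_2007;
```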
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
from table bloat. But if you never delete or update records
then that's irrelevant.
Does reindexing or clustering the table make a marked difference?
I would suggest you post your schema and the results of "vacuum verbose".
--
Gregory Stark
EnterpriseDB http://
query
plans. It's possible you have some plans that are processing many more records
than they need to in order to do their work, because they're using indexes or
combinations of indexes that aren't ideal.
specific enough
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.c
Those plans look like they have a lot of casts to text in them. How have you
defined your indexes? Are your id columns really text?
And you don't have a 7.4 install around to compare the plans do you?
--
Gregory Stark
EnterpriseDB http://www.enterprised
width=0) (actual time=1.721..1.721 rows=1279 loops=1)
When's the last time you analyzed your tables? Postgres is guessing it'll find
60 rows and instead finding over a thousand rows...
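A sketch of the usual remedy (table and column names are illustrative):

```sql
ANALYZE mytable;  -- refresh the planner's row-count estimates
-- If estimates are still far off, collect a finer histogram for the column:
ALTER TABLE mytable ALTER COLUMN some_col SET STATISTICS 100;
ANALYZE mytable;
```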
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
nd load entire partitions in O(1) makes it feasible to
manage data on a scale that would simply be impossible without partitioned
tables.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
aps where share=1
(the latter might take a while)
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
's not "explain analyze", that's just plain "explain".
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
nerally by increasing cpu
usage. Usually they slow things down by causing queries to require more i/o.
It's only UPDATES and DELETES that create garbage tuples that need to be
vacuumed. If some of your tables are mostly insert-only they might not need
to be vacuumed as frequently, or at all.
-
est information would be to do vacuum verbose and report the data it
prints out.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
5 rows out
>> after the filter. It's using this plan anyway because it has no better
>> alternative, but you should think about whether a different index
>> definition would help.
Another index won't help if the reason the cost is so high isn't because the
index i
"Gregory Stark" <[EMAIL PROTECTED]> writes:
> "Karl Wright" <[EMAIL PROTECTED]> writes:
>
>>> In this case it looks like the planner is afraid that that's exactly
>>> what will happen --- a cost of 14177 suggests that several thous
hat would mean loading one of your polls into the small
bits of space freed up in every page. For most tables like this you want to do
large bulk loads and want your loads stored quickly in contiguous space so it
can be accessed quickly, not spread throughout the table.
--
Gregory Stark
you're going to get garbage. Even if Postgres wasn't writing
anything the OS might still choose to flush blocks during that time, possibly
not even Postgres data blocks but filesystem meta-information blocks.
--
Gregory Stark
EnterpriseDB http://www.e
too good for your needs.
You do not want to be reconnecting to the database for each page fetch.
Replanning queries is the least of the problems with that approach.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
.
I concur that anything that doesn't allow concurrent i/o while the
snapshotting is happening is worthless. It sounds like you're just dd'ing from
the device which is pretty much guaranteed not to work.
Even if Postgres didn't do any i/o there's nothing stopping the OS and
00:39:43 sfmedstorageha001 postgres[3506]: [9-1] 2007-06-21 00:39:43
> PDTLOG: redo done at 71/99870670
> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [10-1] 2007-06-21 00:39:43
> PDTWARNING: page 28905 of relation 1663/16384/76718 was uninitialized
What version of Postgr
t could explain missing pages at the end
of a file like this too. And it would explain how you could have two written
in the midst of others that are missing.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
nt: 24MB of shared
> buffers
> equates to a machine with 128MB of memory, give or take).
I think it's more that the stock configuration has to assume it's not a dedicated
box. Picture someone installing Postgres on their debian box because it's
required for Gnucash. Even having
lowered it just as an experiment to test if it was checkpoint causing the
problems not as a permanent measure.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
nly way I can see that happening would be if you had
an index on "my_molkey(version_id)" and "my_rownum(version_id) WHERE row_num
between 10 and 20". Then it could do a merge join between two index
scans.
Note that even then I'm surprised the optimizer is bothering
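The index pair described above might be defined something like this (the index
names are my invention; the table and column names come from the quoted query):

```sql
CREATE INDEX my_molkey_ver_idx ON my_molkey (version_id);
-- partial index matching the WHERE clause, usable only for that range
CREATE INDEX my_rownum_ver_idx ON my_rownum (version_id)
    WHERE row_num BETWEEN 10 AND 20;
```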
good. I'm curious how it performs though.
Actually it seems like in that configuration fsync should be basically
zero-cost. In other words, you should be able to leave fsync=on and get the
same performance (whatever that is) and not have to worry about any risks.
--
Gregory Stark
Ent
serts in autocommit mode? What kind of transaction rate do you get
with both sync mode on and fsync=on in Postgres?
And did you say this was with a battery backed cache? In theory fsync=on/off
shouldn't make much difference at all with a battery backed cache. Stranger
and
t. Just throwing
them all into a big raid might work just as well.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
hat doesn't matter much but if
it's a cpu-bound query it can.
>> ...can I use \timing??? I don't get any time when using the
>> \timing option...
Yes, you can use \timing. You'll have to provide more information about what
you're doing before
ully slow, but
> I didn't do a detailed analysis of that issue yet.
That's strange. Deleting should be the *quickest* operation in Postgres. Do
you perchance have foreign key references referencing this table? Do you have
any triggers?
--
Gregory Stark
EnterpriseDB http://www.e
during that fsync so you would need at least 6 concurrently busy
connections. If you have a more cpu-bound system then that number might be
higher but 100+ connections ought to be enough and in any case I would expect
a benchmark to be mostly disk-bound.
--
Gregory Stark
EnterpriseDB
t it's
faster at deleting a single 1G file than ext3.
On mythtv the experience is that if you use ext3 and delete a large file while
recording another program you can expect the new recording to stutter at
that point. The large delete will lock out the recording from writing to the
files
st of data loss on a system crash or power failure. With
a BBU there's no advantage to fsync=off so that temptation to risk data loss
is removed.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
pplication is different.
On an exotic machine like this you're going to run into unique problems that
nobody here can anticipate with certainty.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
if you ran out of lock
> space entirely
I assume you've checked the server logs and are sure that you aren't in fact
getting errors. I could, for example, envision a situation where a fraction of
the transactions are getting some error and those transactions are therefore
not being
p
PGPool or PGBouncer or some other connection aggregating tool to handle the
connections. This is a pretty low-impact change which shouldn't require making
any application changes aside from changing the database connection string.
Effectively this is just a connection pool that lives
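For example, a minimal transaction-pooling PgBouncer setup might look like
this (all names and values are illustrative):

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_port = 6432        ; the application connects here instead of 5432
pool_mode = transaction   ; a server connection is held only per transaction
max_client_conn = 500
default_pool_size = 20
```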
each of the SPI queries and the locks. I have
some ideas for tackling the SPI queries which would help the batch loading
case but I'm not sure how much resources it makes sense to expend to save 5%
in the OLTP case.
ri-profiling-v2.patch.gz
Description: Binary data
--
Gregory Star
s=208701 loops=1)
>-> Index Scan using "mytable(test_col)" on mytable (cost=0.00..14160.38
> rows=306925 width=4) (actual time=0.140..575.580 rows=306925 loops=1)
> Total runtime: 1013.657 ms
> (3 rows)
I assume you have random_page_cost dialled way down? The cost
-> Seq Scan on tbl_filetype_suffix
(cost=1.00..10001.34 rows=14 width=8) (actual time=0.133..0.176
rows=14 loops=1)"
" Filter: (filetype_suffix_index IS
TRUE)"
--
Gregory Stark
EnterpriseDB
fully the line breaks are gone. I couldn't find any
> in my sent mail.
No, the double-quotes are gone but the lines are still wrapped. It's become
quite a hassle recently to get mailers to do anything reasonable with code.
--
Gregory Stark
EnterpriseDB
hind the buffer will fill and throttle the sender.
If your Postgres data is on the same device as the syslogs those fsyncs will
probably cause a big slowdown directly on Postgres's I/O as well.
You can turn off the fsyncs in syslog by putting a - before the filename.
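With a traditional syslogd that looks like the following (the log path is
illustrative):

```
# /etc/syslog.conf: a leading '-' before the filename disables the
# fsync after every logged line
local0.*    -/var/log/postgresql.log
```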
--
Gregory
this index. If we created a new index and then tried to drop this one the drop
would fail because of the foreign key which needs it. It's possible these
problems could all be worked out but it would still take quite a bit of work
to do so.
--
Gregory Stark
EnterpriseDB http://ww
"Steven Flatt" <[EMAIL PROTECTED]> writes:
> On 8/22/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>
> Interestingly enough, the example you've given does not work for me either.
> The select count(*) from test blocks until the reindex completes. Are we
eg, for REINDEX to report the new
> relfilenode) can have the following behavior:
Should reindex be doing an in-place update? Don't we have to do in-place
updates for other system catalogs which are read in snapshotnow for precisely
the same reasons?
Alternatively, why does the planner
y that 75-line
plans push the bounds of my assisting-you-pleasure. Have you experimented with
simplifying this query?
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
ger (or
smaller but that wouldn't happen just due to deletes unless you run vacuum)
then recent versions of Postgres will notice even if you don't run analyze and
take that into account.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
-
ually use your index but the latter can't
unless you create one for it specifically (which is not so easy -- it'll be
easier in 8.3 though). Worse, I'm not really sure it'll be any faster than the
query you already have.
--
Gregory Stark
EnterpriseDB http://www.enterp
"Gregory Stark" <[EMAIL PROTECTED]> writes:
> "JS Ubei" <[EMAIL PROTECTED]> writes:
>
>> I need to improve a query like :
>>
>> SELECT id, min(the_date), max(the_date) FROM my_table GROUP BY id;
>...
> I don't think you'll
"hubert depesz lubaczewski" <[EMAIL PROTECTED]> writes:
> On Wed, Sep 05, 2007 at 12:30:21PM +0100, Gregory Stark wrote:
>> SELECT DISTINCT ON (id) id, the_date AS min_date FROM my_table ORDER BY id,
>> the_date ASC
>> SELECT DISTINCT ON (id) id, the_date
was it earlier this will use an index should a reasonable one
> exist.
That's not true for this query. In fact that was precisely the original query
he was looking to optimize.
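Spelled out in full, the DISTINCT ON pair quoted above would be something like:

```sql
-- earliest date per id
SELECT DISTINCT ON (id) id, the_date AS min_date
  FROM my_table ORDER BY id, the_date ASC;

-- latest date per id
SELECT DISTINCT ON (id) id, the_date AS max_date
  FROM my_table ORDER BY id, the_date DESC;
```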
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
sonably efficient source of the
distinct ids. Also, it may or may not be faster than simply scanning the whole
table as above, and simulating it with subqueries makes it impossible for the
planner to choose the best plan.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
minfo say? What does it say when this is happening?
You might also tweak /proc/sys/vm/overcommit_memory but I don't remember what
the values are; you can search to find them.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> You're right, but the distinction is a small one. What are the chances
> of losing two independent servers within a few milliseconds of each
> other?
If they're on the same power bus?
--
Gregory
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> That chance is minuscule or at least should be. Of course we are
> assuming some level of conditioned power that is independent of the
> power bus, e.g; a UPS.
I find your faith in UPSes charmingly quaint.
--
Gre
t; You would have to have lightning handed by God to your server to have a
> total power failure without proper shutdown in the above scenario.
Which happens a couple times a year to various trusting souls. I suppose
you're not a regular reader of Risks? Or a regular user of Livejournal fo
uot;. Even with AIO your
seek times are not going to be improved by wide raid stripes. And you can't
possibly find the page at level n+1 before you've looked at the page at level
n. Do you mean to be able to probe multiple index keys simultaneously? How
does that work out?
--
Gregory
sequential and random without prefetch would be even higher.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
"Jean-David Beyer" <[EMAIL PROTECTED]> writes:
> Gregory Stark wrote (in part):
>
>> The extra spindles speed up sequential i/o too so the ratio between
>> sequential
>> and random with prefetch would still be about 4.0. But the ratio between
>> s
rick is:
echo 1 > /proc/sys/vm/drop_caches
Also, it helps to run a "vmstat 1" in another window and watch the bi and bo
columns.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
"Gregory Stark" <[EMAIL PROTECTED]> writes:
> "Luke Lonergan" <[EMAIL PROTECTED]> writes:
>
>> Right now the pattern for index scan goes like this:
>>
>> - Find qualifying TID in index
>> - Seek to TID location in relfile
>>
n't
committing for a long time. This can stop vacuum from being able to clean up
dead space and if it's in the middle of a query can actually cause vacuum to
get stuck waiting for the query to finish using the page it's using.
--
Gregory Stark
EnterpriseDB http://www.e
ut are precisely those that Postgres will likely
choose shortly as victim buffers, forcing Linux to page them back in just so
Postgres can overwrite them.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
"Csaba Nagy" <[EMAIL PROTECTED]> writes:
> On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote:
>> >> Mem: 32945280k total, 32871832k used,73448k free, 247432k buffers
>> >> Swap: 1951888k total,42308k used, 1909580k free, 30294300k cache
ain any
rows.
> and since the total amount of reading from the disk exceeds the amount of
> physical memory, then the valid tuples are "pushed out" of memory.
That's right.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
of 10? 100? 1000?
Not really. It's a big enough difference for the planner to make a bad
decision or it isn't. But if you pressed me I would say a factor of 10 is bad.
A factor of 2 is inevitable in some cases.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
best way to log the SQL that is being
>> executed?
>
> Take a look at statement_timeout and log_statement configuration variables.
I suspect he meant log_min_duration_statement which lets you log only queries
which take too long and not statement_timeout which would actually kill your
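A sketch of the corresponding postgresql.conf setting (the threshold is
arbitrary):

```
# log any statement that runs longer than 500 ms
# (-1 disables; 0 logs every statement with its duration)
log_min_duration_statement = 500
```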
than trying to get to run fast enough to
fit in the off-peak period.
> deadlock_timeout = 1
I would not suggest having this quite this high. Raising it from the default
is fine but having a value larger than your patience is likely to give you the
false impression that something is hung if y
he common leaf pages
and heap pages will in fact be cached.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
ogs. In principle we could
> track per-database xmin values as well, but the distributed overhead
> that'd be added to *every* GetSnapshotData call is a bit worrisome.
Don't we do that now in CVS (ie, in 8.2)?
--
Gregory Stark
EnterpriseDB http://www.enterprisedb
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Gregory Stark wrote:
> > Tom Lane <[EMAIL PROTECTED]> writes:
> >
> > > Matteo Sgalaberni <[EMAIL PROTECTED]> writes:
> > > > Good to know this...but why this behaviour? it'is lovely...:)
Florian Weimer <[EMAIL PROTECTED]> writes:
> I've done that. Fortunately, ANALYZE time didn't increase by that
> much, compared to the default (by just a factor of 10).
With really high stats targets you also have to keep an eye on planning time.
The extra data in the stats table can cause plan
tivity of WHERE clause
>> > -- Incorrect selectivity of JOIN
>> > -- Wrong estimate of rows returned from SRF
>> > -- Incorrect cost estimate for index use
>> >
>> > Can you think of any others?
-- Incorrect estimate for resu
> Gaetano,
>
> > don't you think the best statistic target for a boolean
> > column is something like 2? Or in general the is useless
> > have a statistics target > data type cardinality ?
>
> It depends, really, on the proportionality of the boolean values; if they're
> about equal, I certain
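Per-column targets can be set individually; a sketch with invented names:

```sql
-- two entries are plenty of statistics for a two-valued column
ALTER TABLE t ALTER COLUMN is_active SET STATISTICS 2;
ANALYZE t;
```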
Bruce Momjian wrote:
> Agreed. I think we should reduce it at least to 3.
Note that changing it from 4 to 3 or even 2 is unlikely to really change much.
Many of the plans people complain about turn out to have critical points
closer to 1.2 or 1.1.
The only reason things work out better with
che configuration which has a larger number
of connections, limit the number of connections for the cpu-bound dynamic
content server, and have a 1-1 ratio between apache dynamic content
connections and postgres backends. The alternative is to use connection
pooling. Often a combination of the two
xid patch
going in. Perhaps calling some function which forces an xid to be allocated
and seeing how much it slows down the benchmark would be a good substitute.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
the value and
there would always be a ceiling to bump into so just raising the number of
buffers isn't particularly interesting unless there are some magic numbers we're
trying to hit.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
to this bottleneck at all.
Raising NUM_CLOG_BUFFERS just moves around the arbitrary bottleneck. This
benchmark is useful in that it gives us an idea where the bottleneck lies for
various values of NUM_CLOG_BUFFERS but it doesn't tell us what value realistic
users are likely to bump into.
--
Gr
of plain UNION.
Finally you should consider removing all the intermediate GROUP BYs and just
group the entire result. In theory it should be faster but in practice I'm not
sure it works out that way.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
ry available
for cache. Also, work_mem could be larger at least for large batch queries
like this.
None of this is relevant for this query though. Actually I think a larger
work_mem can avoid problems with hash joins so you might try that but I don't
think it would be choosing it estimated that m
y to relevant bits of context :P
But the original post didn't include any foreign key constraints. I suspect
you've guessed it right though. In fact I suspect what's happening is he
doesn't have an index on the referencing column so the foreign key checks ar
n them up yet.
In theory if we can preserve ordering across append nodes there's no good
reason to prune them. But generally I think simplifying the plan is good if
only to present simpler plans to the user.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com