Jesper,
the whole idea of a bitmap index scan is to optimize heap access, so it
destroys any ordering returned by the index. That's why our new KNNGist,
which returns ordered index tuples, doesn't support bitmap index scans
(note, this is only for knn search).
Oleg
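A minimal sketch of the kind of query KNNGist targets, with illustrative
table and column names; the <-> distance operator is the syntax the patch
proposes:

CREATE INDEX places_coord_idx ON places USING gist (coord);

-- the index returns tuples already ordered by distance,
-- so there is no bitmap phase and no separate sort step
SELECT id, coord
FROM places
ORDER BY coord <-> point '(13.4, 52.5)'
LIMIT 10;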
On Wed, 6 Jan 2010, Robert Haas wrote:
On Thu, 7 Jan 2010, Jesper Krogh wrote:
If disk seeks are killing you, a kinda crazy idea would be to
duplicate the table - clustering one by (id1) and
the other one by an index on (id2) and unioning the
results of each.
That's doubling the disk space needed for the table. What are the odds
that
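A hedged sketch of that duplicate-and-union idea, with hypothetical names;
each copy is clustered on the key its queries filter by:

-- copy clustered by id1
CREATE TABLE t_by_id1 AS SELECT * FROM t;
CREATE INDEX t_by_id1_idx ON t_by_id1 (id1);
CLUSTER t_by_id1 USING t_by_id1_idx;

-- copy clustered by id2
CREATE TABLE t_by_id2 AS SELECT * FROM t;
CREATE INDEX t_by_id2_idx ON t_by_id2 (id2);
CLUSTER t_by_id2 USING t_by_id2_idx;

-- each arm scans the copy whose physical order matches its predicate
SELECT * FROM t_by_id1 WHERE id1 = 42
UNION
SELECT * FROM t_by_id2 WHERE id2 = 42;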
Hi all,
following the simple but interesting air-traffic benchmark published at:
http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/
I decided to run the benchmark over postgres to get some more
experience and insights. Unfortunately, the
Hello
- Lefteris lsi...@gmail.com wrote:
Hi all,
following the simple but interesting air-traffic benchmark published
at:
http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/
Quite an interesting test, if you have the time to
In response to Lefteris:
airtraffic=# EXPLAIN ANALYZE SELECT DayOfWeek, count(*) AS c FROM
ontime WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER
BY c DESC;
QUERY PLAN
Carlo Stonebanks wrote:
Our DB has an audit table which is 500M rows and growing. (FYI the
objects being audited are grouped semantically, not individual field
values).
Recently we wanted to add a new feature and we altered the table to add
a new column. We are backfilling this varchar(255)
Thank you all for your answers!
Andrea, I see the other way around what you are saying:
Sort  (cost=7407754.12..7407754.13 rows=4 width=2) (actual time=371188.821..371188.823 rows=7 loops=1)
  ->  Seq Scan on ontime  (cost=0.00..7143875.40 rows=52775727 width=2) (actual time=190938.959..346180.079
On 7-1-2010 13:38 Lefteris wrote:
I decided to run the benchmark over postgres to get some more
experience and insights. Unfortunately, the query times I got from
postgres were not the expected ones:
Why were they not expected? In the given scenario, column databases
have a huge
Hi Arjen,
so I understand from all of you that you don't consider the use of 25k
for sorting to be the cause of the slowdown? Probably I am missing
something about the specific sort algorithm used by PG. My RAM does fill
up, mainly with file buffers from linux, but the postgres process remains
at 0.1%
In response to Lefteris:
Thank you all for your answers!
Andrea, I see the other way around what you are saying:
Sort  (cost=7407754.12..7407754.13 rows=4 width=2) (actual time=371188.821..371188.823 rows=7 loops=1)
  ->  Seq Scan on ontime  (cost=0.00..7143875.40 rows=52775727 width=2)
In response to Lefteris:
Hi Arjen,
so I understand from all of you that you don't consider the use of 25k
for sorting to be the cause of the slowdown? Probably I am missing
something about the specific sort algorithm used by PG. My RAM does fill
up, mainly with file buffers from linux, but
Yes, I am reading the plan wrong! I thought that each row from the
plan reported the total time for the operation but it actually reports
the starting and ending point.
So we all agree that the problem is on the scans:)
So the next question is: why would changing shared memory buffers fix
that? I
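For reference, here is how the two numbers in each "actual time" pair read,
using the Sort node from the plan above:

Sort (...) (actual time=371188.821..371188.823 rows=7 loops=1)
-- 371188.821 = ms elapsed before the node emitted its FIRST row
--              (a sort must consume all of its input before emitting anything)
-- 371188.823 = ms elapsed when the node emitted its LAST row
-- so the sort itself took ~0.002 ms; nearly all the time was spent
-- in the Seq Scan feeding it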
Lefteris wrote:
Yes, I am reading the plan wrong! I thought that each row from the
plan reported the total time for the operation but it actually reports
the starting and ending point.
So we all agree that the problem is on the scans:)
So the next question is: why would changing shared memory
Alvaro Herrera wrote:
No amount of tinkering is going to change the fact that a seqscan is the
fastest way to execute these queries. Even if you got it to be all in
memory, it would still be much slower than the other systems which, I
gather, are using columnar storage and thus are
On 7.1.2010 15:23, Lefteris wrote:
I think what you all said was very helpful and clear! The only part
that I still disagree/don't understand is the shared_buffer option:))
Did you ever try increasing shared_buffers to what was suggested (around
4 GB) and see what happens (I didn't see it in
On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras ivo...@freebsd.org wrote:
On 7.1.2010 15:23, Lefteris wrote:
I think what you all said was very helpful and clear! The only part
that I still disagree/don't understand is the shared_buffer option:))
Did you ever try increasing shared_buffers to what
On Thu, Jan 7, 2010 at 3:05 PM, Lefteris lsi...@gmail.com wrote:
On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras ivo...@freebsd.org wrote:
On 7.1.2010 15:23, Lefteris wrote:
I think what you all said was very helpful and clear! The only part
that I still disagree/don't understand is the
Leo Mannhart leo.mannh...@beecom.ch wrote:
You could also try to just update the whole table in one go; it is
probably faster than you expect.
That would, of course, bloat the table and indexes horribly. One
advantage of the incremental approach is that there is a chance for
autovacuum or
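A sketch of the incremental approach under discussion, reusing the statement
from this thread with a hypothetical 10,000-id window; committing between
chunks gives autovacuum a chance to reclaim the dead tuples:

UPDATE mdx_core.audit_impt
   SET source_table = 'mdx_import.'||impt_name
 WHERE audit_impt_id >= 319400001
   AND audit_impt_id <  319410001
   AND coalesce(source_table, '') = '';
-- COMMIT, then repeat with the next id window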
- Lefteris lsi...@gmail.com wrote:
Did you ever try increasing shared_buffers to what was suggested
(around
4 GB) and see what happens (I didn't see it in your posts)?
No, I did not do that yet, mainly because I need the admin of the
machine to change the shmmax of the kernel and
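For reference, the two settings involved, with illustrative values; the
sysctl step is needed because PostgreSQL of this era allocates
shared_buffers from System V shared memory:

# /etc/sysctl.conf (Linux) - allow a shared memory segment a bit over 4 GB
kernel.shmmax = 4500000000
kernel.shmall = 1100000      # in 4 kB pages

# postgresql.conf
shared_buffers = 4GB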
Kevin Grittner wrote:
Leo Mannhart leo.mannh...@beecom.ch wrote:
You could also try to just update the whole table in one go; it is
probably faster than you expect.
That would, of course, bloat the table and indexes horribly. One
advantage of the incremental approach is that there is a
Hello,
I've been lurking on this list for a couple of weeks now, and have asked
some side questions of some of the list members, who have been gracious
and helpful and encouraged me to just dive in and participate on the list.
I'll not tell you the whole story right off the bat, but let me give
2010/1/7 Lefteris lsi...@gmail.com:
On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras ivo...@freebsd.org wrote:
On 7.1.2010 15:23, Lefteris wrote:
I think what you all said was very helpful and clear! The only part
that I still disagree/don't understand is the shared_buffer option:))
Did you ever
On Thu, Jan 7, 2010 at 4:57 PM, Ivan Voras ivo...@freebsd.org wrote:
2010/1/7 Lefteris lsi...@gmail.com:
On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras ivo...@freebsd.org wrote:
On 7.1.2010 15:23, Lefteris wrote:
I think what you all said was very helpful and clear! The only part
that I still
On Thu, 7 Jan 2010, Gurgel, Flavio wrote:
If a single query execution had a step that brought a page into the
buffer cache, that's enough to speed up another step and change the
execution plan, since data access in memory is (usually) faster than
disk.
Postgres does not change a query
Alvaro Herrera wrote:
No amount of tinkering is going to change the fact that a seqscan is the
fastest way to execute these queries. Even if you got it to be all in
memory, it would still be much slower than the other systems which, I
gather, are using columnar storage and thus are perfectly
I would suggest:
1. turn off autovacuum
1a. eventually tune db for better performance for this kind of operation
(can't help here)
2. restart database
3. drop all indexes
4. update
5. vacuum full table
6. create indexes
7. turn on autovacuum
Ludwik
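A hedged sketch of those steps in SQL form, with a hypothetical index name
and the table from this thread; steps 1, 2 and 7 happen in postgresql.conf
plus a restart:

-- step 3: repeat for every index on the table
DROP INDEX mdx_core.audit_impt_source_idx;

-- step 4: one pass over the whole table
UPDATE mdx_core.audit_impt
   SET source_table = 'mdx_import.'||impt_name;

-- step 5: reclaim the dead tuples the update left behind
VACUUM FULL mdx_core.audit_impt;

-- step 6
CREATE INDEX audit_impt_source_idx
    ON mdx_core.audit_impt (source_table);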
2010/1/7 Leo Mannhart leo.mannh...@beecom.ch
Ludwik Dylag ldy...@gmail.com wrote:
I would suggest:
1. turn off autovacuum
1a. eventually tune db for better performance for this kind of
operation (can't help here)
2. restart database
3. drop all indexes
4. update
5. vacuum full table
6. create indexes
7. turn on autovacuum
Welcome out of the shadows, Gary! ;-)
Gary Warner g...@cis.uab.edu wrote:
My biggest question mark there really has to do with how many
users I have and how that might alter the results.
In benchmarks I've run with our software load, I've found that I get
best throughput when I use a
On Thursday 07 January 2010 09:57:59 Kevin Grittner wrote:
Ludwik Dylag ldy...@gmail.com wrote:
I would suggest:
1. turn off autovacuum
1a. eventually tune db for better performance for this kind of
operation (can't help here)
2. restart database
3. drop all indexes
4. update
On Thu, Jan 7, 2010 at 10:57 AM, Gurgel, Flavio fla...@4linux.com.br wrote:
- Matthew Wakeling matt...@flymine.org wrote:
On Thu, 7 Jan 2010, Gurgel, Flavio wrote:
Postgres does not change a query plan according to the shared_buffers
setting. It does not anticipate one step
- Scott Marlowe scott.marl...@gmail.com wrote:
You do know that indexes in postgresql are not covering, right? I.e.
after hitting the index, the db then has to hit the table to see if
those rows are in fact visible. So there's no such thing in pgsql, at
the moment, as an index only
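A quick way to observe this, assuming any table t with an index on id; even
though the query touches only the indexed column, the executor of this era
still fetches every matching heap page to check tuple visibility:

EXPLAIN ANALYZE SELECT id FROM t WHERE id BETWEEN 1 AND 1000;
-- expect an Index Scan node: each matching index entry triggers a heap
-- fetch for the visibility check ("Index Only Scan" arrived only in 9.2)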
Got an explain analyze of the delete query?
UPDATE mdx_core.audit_impt
SET source_table = 'mdx_import.'||impt_name
WHERE audit_impt_id >= 319400001 AND audit_impt_id <= 319400010
AND coalesce(source_table, '') = ''
Index Scan using audit_impt_pkey on audit_impt (cost=0.00..92.63 rows=1
If it is possible to lock this audit table exclusively (maybe during
off-peak hours) I would look into
- create new_audit_table as select col1, col2, col3 ... col9,
'new_col_value' from old_audit_table;
- create all indexes
- drop old_audit_table
- rename new_audit_table to old_audit_table
That
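Spelled out as SQL, under the assumption that the exclusive lock can be held
for the whole rebuild:

BEGIN;
LOCK TABLE old_audit_table IN ACCESS EXCLUSIVE MODE;
CREATE TABLE new_audit_table AS
  SELECT col1, col2, col3, col4, col5, col6, col7, col8, col9,
         'new_col_value' AS source_table
    FROM old_audit_table;
-- create all indexes on new_audit_table here
DROP TABLE old_audit_table;
ALTER TABLE new_audit_table RENAME TO old_audit_table;
COMMIT;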
On 01/05/2010 08:34 PM, Robert Haas [robertmh...@gmail.com] wrote:
- If you have other queries where this index helps (even though it is
hurting this one), then you're going to have to find a way to execute
the query without using bound parameters - i.e. with the actual values
in there instead
Joining via a tinyint or something will make your large table smaller,
which is nice. Smaller tables = faster tables.
On Wed, Jan 6, 2010 at 11:21 PM, Radhika S rs88...@gmail.com wrote:
Hi,
I am going to test this out but would be good to know anyways. A large
table is joined to a tiny table
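A sketch of that design, with hypothetical names; PostgreSQL has no tinyint,
so smallint (2 bytes) is the nearest equivalent:

CREATE TABLE status_lookup (
    status_id smallint PRIMARY KEY,
    label     text NOT NULL
);

-- the big table carries only the 2-byte key instead of repeated text
CREATE TABLE events (
    event_id  bigint PRIMARY KEY,
    status_id smallint NOT NULL REFERENCES status_lookup
);

SELECT e.event_id, s.label
  FROM events e
  JOIN status_lookup s USING (status_id);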
Oops, I meant to mention this too: virtually all GigE and/or server-class
NICs do TCP checksum offload.
Dimitri - it's unlikely that you have a hardware issue on the NIC; it's more
likely to be a cable problem or network congestion. What you want to look
for in the tcpdump capture is things
On Thu, Jan 7, 2010 at 1:43 PM, Brian Cox brian@ca.com wrote:
On 01/05/2010 08:34 PM, Robert Haas [robertmh...@gmail.com] wrote:
- If you have other queries where this index helps (even though it is
hurting this one), then you're going to have to find a way to execute
the query without
On Wed, Jan 6, 2010 at 6:53 PM, Zintrigue zintri...@gmail.com wrote:
I'm wondering if there's any performance penalty here
There definitely is. Your design sounds pretty painful to me...
adding a column referencing a side-table will be much nicer.
If anyone can offer any insight as to how
every update is a UPDATE ... WHERE id >= x AND id < x+10 and a commit is
performed after every 1000 update statements, i.e. every 10,000 rows.
What is the rationale behind this? How about doing 10k rows in 1
update, and committing every time?
You could try making the condition on the ctid column,
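A hedged sketch of the ctid idea, with a hypothetical batch size; the
subquery picks the next batch of not-yet-updated rows in roughly physical
order:

UPDATE mdx_core.audit_impt
   SET source_table = 'mdx_import.'||impt_name
 WHERE ctid = ANY (ARRAY(
         SELECT ctid FROM mdx_core.audit_impt
          WHERE coalesce(source_table, '') = ''
          LIMIT 1000));
-- COMMIT and repeat until the update touches zero rows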
What is the rationale behind this? How about doing 10k rows in 1
update, and committing every time?
When we did 10K updates, the application would sometimes appear to have
frozen, and we were concerned that there was a deadlock condition because of
the number of locked rows. While we may have
Lefteris wrote:
So we all agree that the problem is on the scans:)
So the next question is: why would changing shared memory buffers fix
that? I only have one session with one connection, do I have like many
reader workers or something?
I wouldn't expect it to. Large sequential scans like
Hi Greg,
thank you for your help. The only change I made to the dataset was
removing the last comma from the CSV files, as it was interpreted by pg
as an extra column. The schema I used, the load script and queries can
be found at:
http://homepages.cwi.nl/~lsidir/postgres/
(I understood that if
On Thu, Jan 7, 2010 at 11:57 PM, Lefteris lsi...@gmail.com wrote:
Hi Greg,
thank you for your help. The only change I made to the dataset was
removing the last comma from the CSV files, as it was interpreted by pg
as an extra column. The schema I used, the load script and queries can
be found
On Thu, Jan 07, 2010 at 01:38:41PM +0100, Lefteris wrote:
airtraffic=# EXPLAIN ANALYZE SELECT DayOfWeek, count(*) AS c FROM
ontime WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER
BY c DESC;
Well, this query basically has to be slow. The correct approach to this
problem is to add
On Thu, Jan 7, 2010 at 2:48 PM, Carlo Stonebanks
stonec.regis...@sympatico.ca wrote:
Doing the updates in smaller chunks resolved these apparent freezes - or,
more specifically, when the application DID freeze, it didn't do it for more
than 30 seconds. In all likelihood, this is the OS and the
Craig Ringer cr...@postnewspapers.com.au writes:
Can Pg even read partial records? I thought it all operated on a page
level, where if an index indicates that a particular value is present on
a page the whole page gets read in and all records on the page are
checked for the value of
Hi all,
I want to compare two arrays in SQL. The below example is wrong,
but I want to know how to do that in postgres 8.2.5:
SELECT 'x' WHERE string_to_array('the,quick,ram,fox', ',') = any
(string_to_array('the,quick,lazy ,fox', ','))
Regards,
Ram
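For what it's worth, 8.2 already ships array comparison operators; a sketch
of the usual ways to compare two arrays:

-- true if the two arrays share at least one element (overlap)
SELECT string_to_array('the,quick,ram,fox', ',')
    && string_to_array('the,quick,lazy,fox', ',');

-- true if a single value appears somewhere in an array
SELECT 'fox' = ANY (string_to_array('the,quick,lazy,fox', ','));

-- plain = is true only when both arrays are exactly equal
SELECT string_to_array('a,b', ',') = string_to_array('a,b', ',');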
It might well be checkpoints. Have you tried cranking up checkpoint
segments to something like 100 or more and seeing how it behaves then?
No I haven't, although it certainly makes sense - watching the process run,
you get this sense that the system occasionally pauses to take a deep, long
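For reference, the knobs being suggested, with illustrative values
(checkpoint_segments defaulted to 3, i.e. only 48 MB of WAL between
checkpoints; it was later replaced by max_wal_size in 9.5):

# postgresql.conf
checkpoint_segments = 100            # each segment is 16 MB of WAL
checkpoint_completion_target = 0.9   # spread checkpoint I/O out (8.3+)
checkpoint_warning = 30s             # complain in the log if checkpoints
                                     # come more often than this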
Already done in an earlier post, Kevin - I have included it again below. As
you can see, it's pretty well what you would expect: an index scan plus a
filter.
One note: updates where no rows qualify run appreciably faster than the ones
that do. That is, the update itself appears to be consuming
On Thu, Jan 7, 2010 at 11:14 PM, Carlo Stonebanks
stonec.regis...@sympatico.ca wrote:
It might well be checkpoints. Have you tried cranking up checkpoint
segments to something like 100 or more and seeing how it behaves then?
No I haven't, although it certainly makes sense - watching the process