Am 21.08.2008 um 19:08 schrieb Merlin Moncure:
On Thu, Aug 21, 2008 at 11:07 AM, Moritz Onken
<[EMAIL PROTECTED]> wrote:
Am 21.08.2008 um 16:39 schrieb Scott Carey:
It looks to me like the work_mem did have an effect.
Your earlier queries had a sort followed by group aggregate
Am 21.08.2008 um 16:39 schrieb Scott Carey:
It looks to me like the work_mem did have an effect.
Your earlier queries had a sort followed by group aggregate at the
top, and now its a hash-aggregate. So the query plan DID change.
That is likely where the first 10x performance gain came fr
Am 21.08.2008 um 09:04 schrieb Moritz Onken:
Am 20.08.2008 um 20:28 schrieb Tom Lane:
"Scott Carey" <[EMAIL PROTECTED]> writes:
The planner actually thinks there will only be 28704 rows returned
of width
12. But it chooses to sort 53 million rows before aggregating.
Th
Am 20.08.2008 um 20:28 schrieb Tom Lane:
"Scott Carey" <[EMAIL PROTECTED]> writes:
The planner actually thinks there will only be 28704 rows returned
of width
12. But it chooses to sort 53 million rows before aggregating.
Thats
either a bug or there's something else wrong here. That is
Am 20.08.2008 um 20:06 schrieb Scott Carey:
Ok, so the problem boils down to the sort at the end.
The query up through the merge join on domain is as fast as its
going to get. The sort at the end however, should not happen
ideally. There are not that many rows returned, and it should
h
Am 19.08.2008 um 17:23 schrieb Moritz Onken:
Am 19.08.2008 um 16:49 schrieb Scott Carey:
What is your work_mem set to? The default?
Try increasing it significantly if you have the RAM and seeing if
that affects the explain plan. You may even want to set it to a
number larger than the
Am 19.08.2008 um 16:49 schrieb Scott Carey:
What is your work_mem set to? The default?
Try increasing it significantly if you have the RAM and seeing if
that affects the explain plan. You may even want to set it to a
number larger than the RAM you have just to see what happens. In
all
Am 19.08.2008 um 14:17 schrieb Matthew Wakeling:
On Tue, 19 Aug 2008, Moritz Onken wrote:
tablename| attname | n_distinct | correlation
result | domain | 1642 | 1
Well, the important thing is the correlation, which is 1, indicating
that
As far as I can tell, it should. If it is clustered on an index on
domain, and then analysed, it should no longer have to sort on domain.
Could you post here the results of running:
select * from pg_stats where attname = 'domain';
schemaname | tablename| attname | null_fra
Am 18.08.2008 um 18:05 schrieb Matthew Wakeling:
On Mon, 18 Aug 2008, Moritz Onken wrote:
Running the query for more than one user is indeed not much slower.
That's what I need. I'm clustering the results table on domain
right now. But why is this better than clustering it on &qu
Well, you're getting the database to read the entire contents of the
domain_categories table in order. That's 12 million rows - a fair
amount of work.
You may find that removing the "user = 1337" constraint doesn't make
the query much slower - that's where you get a big win by clusterin
Am 18.08.2008 um 16:30 schrieb Matthew Wakeling:
On Mon, 18 Aug 2008, Moritz Onken wrote:
I have indexes on result.domain, domain_categories.domain,
result.user, domain_categories.category. Clustered result on user
and domain_categories on domain.
"-> Materializ
Hi,
I run this query:
select max(a."user"), b.category, count(1) from result a,
domain_categories b where a."domain" = b."domain" group by b.category;
the table result contains all websites a user visited. And the table
domain_categories contains all categories a domain is in.
result has 2
Am 12.08.2008 um 17:21 schrieb Bill Moran:
In response to Moritz Onken <[EMAIL PROTECTED]>:
Am 12.08.2008 um 17:04 schrieb Bill Moran:
In response to Moritz Onken <[EMAIL PROTECTED]>:
We chose UUID as PK because there is still some information in an
integer key.
You can s
Am 12.08.2008 um 17:04 schrieb Bill Moran:
In response to Moritz Onken <[EMAIL PROTECTED]>:
We chose UUID as PK because there is still some information in an
integer key.
You can see if a user has registered before someone else (user1.id <
user2.id)
or you can see how many
We chose UUID as PK because there is still some information in an
integer key.
You can see if a user has registered before someone else (user1.id <
user2.id)
or you can see how many new users registered in a specific period of
time
(compare the id of the newest user to the id a week ago). Thi
The thing here is that you are effectively causing Postgres to run a
sub-select for each row of the "result" table, each time generating
either an empty list or a list with one or more identical URLs. This
is effectively forcing a nested loop. In a way, you have two
constraints where you
Am 30.06.2008 um 16:59 schrieb Steinar H. Gunderson:
On Mon, Jun 30, 2008 at 09:16:06AM +0200, Moritz Onken wrote:
the result table has 20.000.000 records and the item table 5.000.000.
The query
select count(1) from result where url in (select shorturl from item
where shorturl = result.url
However there's a lot more scope for improving a query along these
lines, like adding indexes, or CLUSTERing on an index. It depends
what other queries you are wanting to run.
I don't know how much update/insert activity there will be on your
database. However, if you were to add an inde
Am 30.06.2008 um 12:19 schrieb Matthew Wakeling:
select count(1) from result where url in (select shorturl from item
where shorturl = result.url);
What on earth is wrong with writing it like this?
SELECT COUNT(*) FROM (SELECT DISTINCT result.url FROM result, item
WHERE
item.shorturl =
Am 28.06.2008 um 21:19 schrieb Steinar H. Gunderson:
On Sat, Jun 28, 2008 at 06:24:42PM +0200, Moritz Onken wrote:
SELECT distinct url from item where url like 'http://www.micro%'
limit
10;
Here, the planner knows the pattern beforehand, and can see that
it's a
simple
Anfang der weitergeleiteten E-Mail:
Von: Moritz Onken <[EMAIL PROTECTED]>
Datum: 30. Juni 2008 09:16:06 MESZ
An: Steinar H. Gunderson <[EMAIL PROTECTED]>
Betreff: Re: [PERFORM] Planner should use index on a LIKE 'foo%' query
Am 28.06.2008 um 21:19 schrieb Steinar H. Gu
Hi,
I have a query
select count(*)
from result
where exists
(select * from item where item.url LIKE result.url || '%' limit 1);
which basically returns the number of items which exist in table
result and match a URL in table item by its prefix.
I read all about idexes (http://www.postgre
We have very good experiences with openVZ as virtualizer.
Since it's not a para virtualization like xen it's very fast. Almost
as fast as the host.
www.openvz.org
Am 04.03.2008 um 16:43 schrieb Theo Kramer:
Hi
We are thinking of running a PostgreSQL instance on a virtual host
under
Xen.
I am also wondering about the ordering and whether that matters.
Can you have "he", "can", "drink" as well as "drink", "he", "can"
and should they be considered the same? If so you will need a
different tactic.
ordering matters. So the 3-column tactic should work.
Thanks for your advice!
Hi,
I need to store a lot of 3-tuples of words (e.g. "he", "can",
"drink"), order matters!
The source is about 4 GB of these 3-tuples.
I need to store them in a table and check whether one of them is
already stored, and if that's the case to increment a column named
"count" (or something).
26 matches
Mail list logo