Ron Mayer [EMAIL PROTECTED] writes:
Would another possible condition for considering
Cartesian joins be:
* Consider Cartesian joins when a unique constraint can prove
that at most one row will be pulled from one of the tables
that would be part of this join?
What for? That
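Ron's suggestion can be sketched in SQL; the schema below is invented for illustration and does not come from the thread:

```sql
-- Hypothetical schema: "dim" has a unique constraint on its key.
CREATE TABLE dim (k integer UNIQUE, label text);
CREATE TABLE fact (id bigint, payload text);

-- No join clause links fact and dim, so this is a Cartesian join; but
-- because dim.k is unique, "k = 42" can match at most one dim row,
-- bounding the product at (rows in fact) x 1.
SELECT f.*, d.label
FROM fact f, dim d
WHERE d.k = 42;
```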
I have _not_ added a TODO for this item. Let me know if one is needed.
---
Bruce Momjian [EMAIL PROTECTED] writes:
I have _not_ added a TODO for this item. Let me know if one is needed.
Please do, I think it's an open issue.
* Consider Cartesian joins when both relations are needed to form an
indexscan qualification for a third relation
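The TODO item above can be illustrated with a small sketch (schema invented for illustration): two small tables each supply one key column of a composite index on a large third table.

```sql
CREATE TABLE small_a (a integer);
CREATE TABLE small_b (b integer);
CREATE TABLE big (a integer, b integer, payload text);
CREATE INDEX big_ab ON big (a, b);

-- No clause connects small_a and small_b, so joining them first is a
-- Cartesian join -- but doing so makes both key columns available for a
-- single two-column probe of the big_ab index.
SELECT big.*
FROM small_a, small_b, big
WHERE big.a = small_a.a
  AND big.b = small_b.b;
```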
Anything that helps pgsql perform well in a TB-sized database
environment is a Good Thing (r) :D
Pablo
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
On Fri, 2007-12-07 at 12:45 -0500, Robert Treat wrote:
On Thursday 06 December 2007 04:38, Simon Riggs wrote:
I think you're completely overlooking the effect disk latency has on
query times. We run queries all the time that can vary from 4 hours to
12 hours in time based solely on
Tom Lane wrote:
Ron Mayer [EMAIL PROTECTED] writes:
Tom Lane wrote:
There's something fishy about this --- given that that plan has a lower
cost estimate, it should've picked it without any artificial
constraints.
One final thing I find curious about this is that the estimated
number of rows
Robert,
On Wed, 2007-12-05 at 15:07 -0500, Robert Treat wrote:
If the whole performance of your system depends upon indexed access, then
maybe you need a database that gives you a way to force index access at the
query level?
That sounds like a request for hints, which is OT here, ISTM.
On Thu, Dec 06, 2007 at 09:38:16AM +, Simon Riggs wrote:
The issue is that if somebody issues a large query then it will be a
problem whichever plan the query takes. Forcing index scans can make a
plan more expensive than a seq scan in many cases.
OTOH, the planner can really screw up
Michael Stone [EMAIL PROTECTED] writes:
OTOH, the planner can really screw up queries on really large databases.
IIRC, the planner can use things like unique constraints to get some
idea, e.g., of how many rows will result from a join. Unfortunately,
the planner can't apply those techniques
Matthew [EMAIL PROTECTED] writes:
On Thu, 6 Dec 2007, Tom Lane wrote:
Hmm. IIRC, there are smarts in there about whether a mergejoin can
terminate early because of disparate ranges of the two join variables.
Very cool. Would that be a planner cost estimate fix (so it avoids the
merge join),
On Thu, 6 Dec 2007, Tom Lane wrote:
Matthew [EMAIL PROTECTED] writes:
... For this query, Postgres would perform a nested loop,
iterating over all rows in the small table, and doing a hundred index
lookups in the big table. This completed very quickly. However, adding the
LIMIT meant that
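A minimal way to reproduce the shape of that case (table names are invented here) is to compare the plans with and without the LIMIT:

```sql
-- Without LIMIT: expect a nested loop over the small table with index
-- probes into the big one. Adding LIMIT may make the planner switch
-- join methods, which is the regression described above.
EXPLAIN SELECT * FROM small s JOIN big b ON b.small_id = s.id;
EXPLAIN SELECT * FROM small s JOIN big b ON b.small_id = s.id LIMIT 100;
```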
On Thu, Dec 06, 2007 at 11:13:18AM -0500, Tom Lane wrote:
Indeed, and if you've got examples where it's that far off, you should
report them.
Yeah, the trick is to get it to a digestible test case. The basic
scenario (there are more tables/columns in the actual case) is a set
of tables
Tom Lane wrote:
Michael Stone [EMAIL PROTECTED] writes:
OTOH, the planner can really screw up queries on really large databases.
... I've got some queries that the
planner thinks will return on the order of 10^30 rows for that sort of
reason. In practice, the query may return 10^3 rows
Ron Mayer [EMAIL PROTECTED] writes:
Tom Lane wrote:
There's something fishy about this --- given that that plan has a lower
cost estimate, it should've picked it without any artificial
constraints.
I think the reason it's not picking it was discussed back in this thread
too.
On Nov 28, 2007, at 7:27 AM, Bill Moran wrote:
Is there something wrong with:
set enable_seqscan = off
Note that in cases of very heavy skew, that won't work. It only adds
10M to the cost estimate for a seqscan, and it's definitely possible
to have an index scan that looks even more
Hi Peter,
If you run into a scaling issue with PG (you will at those scales 1TB+), you
can deploy Greenplum DB which is PG 8.2.5 compatible. A large internet
company (look for press soon) is in production with a 150TB database on a
system capable of doing 400TB and we have others in production
Isn't that what statement_timeout is for? Since this is entirely based
on estimates, using arbitrary fuzzy numbers for this seems fine to me;
precision isn't really the goal.
There's an important difference to statement_timeout: this proposal
would avoid completely taking any resources if it
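The distinction can be shown concretely. statement_timeout is a real GUC; statement_cost_limit below is only the proposal under discussion and was never committed:

```sql
-- Real: cancel a statement after it has already run for 60 seconds
-- (and consumed resources for that long).
SET statement_timeout = '60s';

-- Proposed (hypothetical, never committed): reject the query at plan
-- time if its estimated cost exceeds the threshold, before execution
-- starts at all.
-- SET statement_cost_limit = 100000000;
```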
Gregory Stark [EMAIL PROTECTED] writes:
Simon Riggs [EMAIL PROTECTED] writes:
Tom's previous concerns were along the lines of "How would you know what to
set it to?", given that the planner costs are mostly arbitrary numbers.
Hm, that's only kind of true.
The units are not the problem. The
* Simon Riggs ([EMAIL PROTECTED]) wrote:
On Thu, 2007-11-29 at 10:45 -0500, Tom Lane wrote:
Given that this list spends all day every day discussing cases where the
planner is wrong, I'd have to think that that's a bet I wouldn't take.
I think you have a point, but the alternative is often
On Thu, 2007-11-29 at 10:45 -0500, Tom Lane wrote:
Given that this list spends all day every day discussing cases where the
planner is wrong, I'd have to think that that's a bet I wouldn't take.
You could probably avoid this risk by setting the cutoff at something
like 100 or 1000 times what
On Tue, 27 Nov 2007, Pablo Alcaraz wrote:
it would be nice to do something with selects so we can recover a rowset
on huge tables using a criteria with indexes without falling back to
running a full scan.
You mean: Be able to tell Postgres "Don't ever do a sequential scan of
this table". It's silly. I
On Wed, 2007-11-28 at 08:27 -0500, Bill Moran wrote:
Is there something wrong with:
set enable_seqscan = off
?
Nothing wrong with enable_seqscan = off except it is all or nothing type
of thing... if you want the big table to never use seqscan, but a medium
table which is joined in should use
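For what it's worth, enable_seqscan can be scoped tighter than a whole session, though it still cannot be set per table:

```sql
BEGIN;
SET LOCAL enable_seqscan = off;  -- applies only inside this transaction
-- queries here strongly prefer index scans (seqscans are penalized,
-- not forbidden)
COMMIT;                          -- the setting reverts automatically
```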
On Wed, 2007-11-28 at 08:54 -0500, Bill Moran wrote:
Nothing wrong with enable_seqscan = off except it is all or nothing type
of thing...
If that's true, then I have a bug report to file:
[snip]
It looks to me to be session-alterable.
I didn't mean that it can't be set per session, I
On Wed, 28 Nov 2007, Gregory Stark wrote:
Is there something wrong with:
set enable_seqscan = off
?
This does kind of the opposite of what you would actually want here. What you
want is that if you give it a query which would be best satisfied by a
sequential scan it should throw an
Pablo Alcaraz wrote:
In my opinion there are queries that will need to be tuned for
huge databases (huge databases = a database whose relevant
tables/indexes are, or will be, far bigger than all the RAM available):
-- example table
CREATE TABLE homes (
id bigserial,
On Wed, 2007-11-28 at 14:48 +0100, Csaba Nagy wrote:
In fact an even more useful option would be to ask the planner to throw
error if the expected cost exceeds a certain threshold...
Well, I've suggested it before:
statement_cost_limit on pgsql-hackers, 1 March 2006
Would people like me to
On Wed, 28 Nov 2007, Simon Riggs wrote:
statement_cost_limit on pgsql-hackers, 1 March 2006
Would people like me to re-write and resubmit this patch for 8.4?
Yes please. The more options, the better.
Tom's previous concerns were along the lines of "How would you know what to
set it to?", given
Thanks all. This is just what I needed.
On Nov 26, 2007 1:16 PM, Stephen Cook [EMAIL PROTECTED] wrote:
I think either would work; both PostgreSQL and MS SQL Server have
success stories out there running VLDBs. It really depends on what you
know and what you have. If you have a lot of
On Tue, 2007-11-27 at 14:18 -0600, Peter Koczan wrote:
Thanks all. This is just what I needed.
All of those responses have cooked up quite a few topics into one. Large
databases might mean text warehouses, XML message stores, relational
archives and fact-based business data warehouses.
The
Hi all,
I have a user who is looking to store 500+ GB of data in a database
(and when all the indexes and metadata are factored in, it's going to
be more like 3-4 TB). He is wondering how well PostgreSQL scales with
TB-sized databases and what can be done to help optimize them (mostly
hardware
We have a several-TB database in production and it works well on an
HP rx1620 dual Itanium2, MSA 20, running Linux. It's read-only storage for
astronomical catalogs with about 4 billion objects. We have a custom
index for spherical coordinates which provides great performance.
Oleg
I had a client that tried to use MS SQL Server to run a 500GB+ database.
The database simply collapsed. They switched to Teradata and it is
running well. This database is now 1.5TB+.
Currently I have clients using huge PostgreSQL databases and they are
happy. In one client's database the
I think either would work; both PostgreSQL and MS SQL Server have
success stories out there running VLDBs. It really depends on what you
know and what you have. If you have a lot of experience with Postgres
running on Linux, and not much with SQL Server on Windows, of course the
former would