Re: [HACKERS] I feel the need for speed. What am I doing wrong?
"Dann Corbit" <[EMAIL PROTECTED]> writes: > Yikes! Five times slower! But then I took Tom's incredibly helpful > suggestion to disable the sequential scan: Ideally, you shouldn't have to do that. Now that you have the correct indexes in place, could you show us the EXPLAIN ANALYZE output for both cases (enable_seqscan = on and off)? Also, you might try leaving enable_seqscan = on, and seeing how far you have to decrease random_page_cost to get the planner to choose indexscan. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] I feel the need for speed. What am I doing wrong?
> -Original Message- > From: Tom Lane [mailto:[EMAIL PROTECTED]] > Sent: Monday, January 06, 2003 5:26 PM > To: Nigel J. Andrews > Cc: Dann Corbit; [EMAIL PROTECTED]; > [EMAIL PROTECTED] > Subject: Re: [HACKERS] I feel the need for speed. What am I > doing wrong? > > > "Nigel J. Andrews" <[EMAIL PROTECTED]> writes: > >> select a."RT_REC_KEY", a."cnxarraycolumn", a."CRC" from > >> "CNX_DS_53_SIS_STU_OPT_FEE_TB" a left outer join > >> "CNX_DS2_53_SIS_STU_OPT_FEE_TB" b on ( a."RT_REC_KEY" = > >> b."RT_REC_KEY" and a."cnxarraycolumn" = b."cnxarraycolumn") where > >> b.oid is null ; > > > I suspect you get no results because it's unlikely b.oid > will be null. > > Try "it's impossible for b.oid to be null --- unless a dummy > b row is being provided by the LEFT JOIN". I interpret the > purpose of the query to be to look for "a" rows that have no > matching "b" row. > > Using OID for this is kind of cute, I guess, since it is > guaranteed not-null in a real row; he doesn't have to think > about whether any of his regular columns are not-null. I am very happy to report that PostgreSQL now easily beats MS Access for speed! :-) Here are the MS Access results, where I create on index and then a second: Access Outer Join 2 column index on 1st table 02:29.9 2 column index on 2nd table + 2 column index on 1st table 02:18.3 For PostgreSQL, I originally created a 3 column index (because I have also 3 column joins in other places) and got this result: Postgres Outer Join enable_seqscan = 1 3 column Index 1 12:43.9 Showing the plan showed that the indexes were being ignored. Yikes! Five times slower! But then I took Tom's incredibly helpful suggestion to disable the sequential scan: Postgres Outer Join enable_seqscan = 0 3 column Index 0 05:17.5 Changed to a 2 column index: 2 column index 0 04:58.3 Added an index to the second table: 2 column index on 2nd table + 2 column index on 1st table 0 01:53.6 PostgreSQL is now 22% faster than Access (HAPPY DAYS)! For my application, I happen to know that the data will be approximately physically clustered, and that the indexes will always be very near matches for data sets big enough to matter. Benchmarking also showed that adding the 3rd column to the index was counter productive, even when the join criteria was a.f1 = b.f1, a.f2=b.f2, a.f3 = b.f3. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] I feel the need for speed. What am I doing wrong?
"Nigel J. Andrews" <[EMAIL PROTECTED]> writes: >> select a."RT_REC_KEY", a."cnxarraycolumn", a."CRC" from >> "CNX_DS_53_SIS_STU_OPT_FEE_TB" a left outer join >> "CNX_DS2_53_SIS_STU_OPT_FEE_TB" b on ( a."RT_REC_KEY" = b."RT_REC_KEY" >> and a."cnxarraycolumn" = b."cnxarraycolumn") where b.oid is null ; > I suspect you get no results because it's unlikely b.oid will be null. Try "it's impossible for b.oid to be null --- unless a dummy b row is being provided by the LEFT JOIN". I interpret the purpose of the query to be to look for "a" rows that have no matching "b" row. Using OID for this is kind of cute, I guess, since it is guaranteed not-null in a real row; he doesn't have to think about whether any of his regular columns are not-null. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] I feel the need for speed. What am I doing wrong?
"Dann Corbit" <[EMAIL PROTECTED]> writes: > Creating the following index had no effect on performance! > create unique index i1 on "CNX_DS2_53_SIS_STU_OPT_FEE_TB" ("RT_REC_KEY", > "cnxarraycolumn", "CRC"); What does EXPLAIN ANALYZE have to say about the query? If you set enable_seqscan = 0, does the explain result change? regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] I feel the need for speed. What am I doing wrong?
> -Original Message- > From: Nigel J. Andrews [mailto:[EMAIL PROTECTED]] > Sent: Monday, January 06, 2003 4:58 PM > To: Dann Corbit > Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] > Subject: Re: [HACKERS] I feel the need for speed. What am I > doing wrong? > > Added -general list so that the next followup can remove > -hackers and everyone there will have had notice. > > > On Mon, 6 Jan 2003, Dann Corbit wrote: > > > > I have a query using two postgres tables. > > One is called "CNX_DS_53_SIS_STU_OPT_FEE_TB" and the other is called > > "CNX_DS2_53_SIS_STU_OPT_FEE_TB". > > > > I am getting 3 times slower performance than Microsoft Access when > > performing a left outer join. > > > > ... > > > > Here is the query: > > select a."RT_REC_KEY", a."cnxarraycolumn", a."CRC" from > > "CNX_DS_53_SIS_STU_OPT_FEE_TB" a left outer join > > "CNX_DS2_53_SIS_STU_OPT_FEE_TB" b on ( a."RT_REC_KEY" = > b."RT_REC_KEY" > > and a."cnxarraycolumn" = b."cnxarraycolumn") where b.oid is null ; > > > > > > Creating the following index had no effect on performance! create > > unique index i1 on "CNX_DS2_53_SIS_STU_OPT_FEE_TB" ("RT_REC_KEY", > > "cnxarraycolumn", "CRC"); > > > > Both tables had 6139062 rows of data. > > > > In this query ... all rows of data match perfectly, so no > results are > > returned. > > I suspect you get no results because it's unlikely b.oid will > be null. Are you sure the query is how it should be since you > seem to be expecting no rows to be returned and yet your > reason for that doesn't match the query as shown. Without the > oid test I'd bet you get a result set of 6139062 rows. There are supposed to be no results for this particular query. The data is unique with only RT_REC_KEY and cnxarraycolumn (I tried using that as an index also). The basic gist of it is like this: I want to know where the keys match (e.g.: RT_REC_KEY + cnxarraycolumn) but the CRC has changed (which will tell me what data has changed). This is for a data synchronization system that uses PostgreSQL as a join engine. I store primary key data together with a 64 bit CRC in data tables stored in PostgreSQL. In this particular case, there were no changes but there will be changes at other times. > > Is there a way to reformulate this query so that it will use the > > index? > > Given the above comment I'd say no since the entirety of both > tables will be tested to make the result set. > > Alternatively, if the query is right try something along the lines of: > > SELECT a.blah, a.foo, > FROM a, b > WHERE a.blah = b.blah AND a.foo = b.foo AND b.oid IS NULL OID is never null. I don't think that this query is equivalent. This Oid is just the standard PostgreSQL Oid, and not some arbitrary field. > if that doesn't use a query try pushing the null test into a > subselect like: > > SELECT a.blah, a.foo, > FROM a, (SELECT * FROM b WHERE oid IS NULL) b > WHERE a.blah = b.blah AND a.foo = b.foo > OID is never null. I don't think that this query is equivalent. > After that let's hope I haven't embarrassed myself. Probably, I did not explain myself clearly enough. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] I feel the need for speed. What am I doing wrong?
Added -general list so that the next followup can remove -hackers and everyone there will have had notice. On Mon, 6 Jan 2003, Dann Corbit wrote: > > I have a query using two postgres tables. > One is called "CNX_DS_53_SIS_STU_OPT_FEE_TB" and the other is called > "CNX_DS2_53_SIS_STU_OPT_FEE_TB". > > I am getting 3 times slower performance than Microsoft Access when > performing a left outer join. > > ... > > Here is the query: > select a."RT_REC_KEY", a."cnxarraycolumn", a."CRC" from > "CNX_DS_53_SIS_STU_OPT_FEE_TB" a left outer join > "CNX_DS2_53_SIS_STU_OPT_FEE_TB" b on ( a."RT_REC_KEY" = b."RT_REC_KEY" > and a."cnxarraycolumn" = b."cnxarraycolumn") where b.oid is null ; > > > Creating the following index had no effect on performance! > create unique index i1 on "CNX_DS2_53_SIS_STU_OPT_FEE_TB" ("RT_REC_KEY", > "cnxarraycolumn", "CRC"); > > Both tables had 6139062 rows of data. > > In this query ... all rows of data match perfectly, so no results are > returned. I suspect you get no results because it's unlikely b.oid will be null. Are you sure the query is how it should be since you seem to be expecting no rows to be returned and yet your reason for that doesn't match the query as shown. Without the oid test I'd bet you get a result set of 6139062 rows. > Is there a way to reformulate this query so that it will use the index? Given the above comment I'd say no since the entirety of both tables will be tested to make the result set. Alternatively, if the query is right try something along the lines of: SELECT a.blah, a.foo, FROM a, b WHERE a.blah = b.blah AND a.foo = b.foo AND b.oid IS NULL if that doesn't use a query try pushing the null test into a subselect like: SELECT a.blah, a.foo, FROM a, (SELECT * FROM b WHERE oid IS NULL) b WHERE a.blah = b.blah AND a.foo = b.foo After that let's hope I haven't embarrassed myself. -- Nigel J. Andrews ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster