Re: [HACKERS] ScanKey representation for RowCompare index
On Sun, 2006-01-15 at 18:03 -0500, Tom Lane wrote: There's one nontrivial decision still to make about how to implement proper per-spec row-comparison operations, namely: how a row comparison ought to be represented in the index access method API. I'm not sure I understand why. Surely a row wise comparison can take advantage of indexes on some of its columns? When would it be advantageous to create an index on the whole row anyway? Wouldn't the key likely be excessively long? So why would we support this at all? I'm currently leaning to plan B, on the grounds that: 1. It would require no changes in _bt_checkkeys(), which is the only user of the data structure that is particularly performance-critical. With plan A we'd be adding at least a few cycles to _bt_checkkeys in all cases. Plan B avoids that at the cost of an extra level of function call to do a row comparison. Given the relative frequency of the two sorts of index conditions, this has to be the better tradeoff to make. 2. There's quite a bit of logic in btree indexscan setup that would find it convenient to treat a row comparison as if it were a single condition on just its leftmost column. This may end up being nearly a wash, but I think that plan A would make the setup code a shade more complex than plan B would. In particular, the rules about valid orderings of the ScanKey entries would be complicated under plan A, whereas under plan B it's still clear where everything belongs. Any thoughts? Anyone see a good plan C, or a serious flaw that I'm missing in either of these ideas? That looks sound. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A couple of days ago I found myself wanting to aggregate 3 Billion tuples down to 100 Million tuples based on an integer key with six integer values -- six sum()'s. PostgreSQL ran out of memory with its Hash Aggregator and doing an old style Sort Sum took a fair amount of time to complete (cancelled the process after 24 hours -- small machine). Spilling to disk would be nice but I suspect the obvious method would thrash quite badly with non-sorted input. There is already hash table overflow (spill to disk) logic in HashJoins, so this should be possible by reusing that code for HashAggs. That's on my todo list, but I'd welcome any assistance. A question: Are the rows in your 3 B row table clumped together based upon the 100M row key? (or *mostly* so) We might also be able to pre-aggregate the rows using a plan like HashAgg SortedAgg or SortedAgg Sort SortedAgg The first SortedAgg seems superfluous, buy would reduce the row volume considerably if incoming rows were frequently naturally adjacent, even if the values were not actually sorted. (This could also be done during sorting, but its much easier to slot the extra executor step into the plan). That might then reduce the size of the later sort, or allow it to become a HashAgg. I could make that manually enabled using enable_pre_agg to allow us to measure the effectiveness of that technique and decide what cost model we'd use to make it automatic. Would that help? I've written something similar using a client and COPY with temporary tables. Even with the Export/Import copy I still beat the SortSum method PostgreSQL falls back to. You can get round this now by chopping the larger table into pieces with a WHERE clause and then putting them back together with a UNION. If the table is partitioned, then do this by partitions. This should also help when it comes to recalculating the sums again in the future, since you'll only need to rescan the rows that have been added since the last summation. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Coding standards? Recommendations?
On Sun, 2006-01-15 at 11:14 -0500, korry wrote: I've noticed a variety of coding styles in the PostgreSQL source code. In particular, I see a mix of naming conventions. Some variables use camelCase (or CamelCase), others use under_score_style. I just follow the style of the module I'm working in. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] source documentation tool doxygen
I've created a browsable source tree documentation, it's done with the doxygen tool. http://www.mcknight.de/pgsql-doxygen/cvshead/html/ There was a discussion about this some time ago, Jonathan Gardner proposed it here: http://archives.postgresql.org/pgsql-hackers/2004-03/msg00748.php quite a few people found it useful but it somehow got stuck. The reason was apparently that you'd have to tweak your comments in order to profit from it as much as possible. However I still think it's a nice-to-have among the online documentation and it gives people quite new to the code the possibility to easily check what is where defined and how... What do you think? doxygen can also produce a pdf but I haven't succeeded in doing that so far, pdflatex keeps bailing out. Has anybody else succeeded building this yet? Joachim -- Joachim Wieland [EMAIL PROTECTED] C/ Usandizaga 12 1°B ICQ: 37225940 20002 Donostia / San Sebastian (Spain) GPG key available ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] source documentation tool doxygen
I wish I've had this when I started working with PostgreSQL. This looks really good. Very useful indeed, even without the comments. What kind of changes are needed in order to get the comments in? Regards, Thomas Hallgren Joachim Wieland wrote: I've created a browsable source tree documentation, it's done with the doxygen tool. http://www.mcknight.de/pgsql-doxygen/cvshead/html/ There was a discussion about this some time ago, Jonathan Gardner proposed it here: http://archives.postgresql.org/pgsql-hackers/2004-03/msg00748.php quite a few people found it useful but it somehow got stuck. The reason was apparently that you'd have to tweak your comments in order to profit from it as much as possible. However I still think it's a nice-to-have among the online documentation and it gives people quite new to the code the possibility to easily check what is where defined and how... What do you think? doxygen can also produce a pdf but I haven't succeeded in doing that so far, pdflatex keeps bailing out. Has anybody else succeeded building this yet? Joachim ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] source documentation tool doxygen
Try following the link (the Doxygen icon) - it has both a tutorial and extensive doc. Regards, Kim Bisgaard Thomas Hallgren wrote: I wish I've had this when I started working with PostgreSQL. This looks really good. Very useful indeed, even without the comments. What kind of changes are needed in order to get the comments in? Regards, Thomas Hallgren Joachim Wieland wrote: I've created a browsable source tree documentation, it's done with the doxygen tool. http://www.mcknight.de/pgsql-doxygen/cvshead/html/ There was a discussion about this some time ago, Jonathan Gardner proposed it here: http://archives.postgresql.org/pgsql-hackers/2004-03/msg00748.php quite a few people found it useful but it somehow got stuck. The reason was apparently that you'd have to tweak your comments in order to profit from it as much as possible. However I still think it's a nice-to-have among the online documentation and it gives people quite new to the code the possibility to easily check what is where defined and how... What do you think? doxygen can also produce a pdf but I haven't succeeded in doing that so far, pdflatex keeps bailing out. Has anybody else succeeded building this yet? Joachim ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] source documentation tool doxygen
Thomas Hallgren said: I wish I've had this when I started working with PostgreSQL. This looks really good. Very useful indeed, even without the comments. What kind of changes are needed in order to get the comments in? I too have done this. But retrofitting Doxygen style comments to the PostgreSQL source code would be a big undertaking. Maintaining it, which would be another task for reviewers/committers, would also be a pain unless there were some automated checking tool. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] [PATCH] Better way to check for getaddrinfo function.
Title: [PATCH] Better way to check for getaddrinfo function. Just thought that the following patch might improve checking for the getaddrinfo function (in configure.in). I was forced to write it because getaddrinfo went unnoticed on Tru64 Unix. (displaying attached patch)
$ diff -r configure.in configure.in.1
920c920,944
< AC_REPLACE_FUNCS([getaddrinfo])
---
> AC_CACHE_CHECK([for getaddrinfo], ac_cv_func_getaddrinfo,
> [AC_TRY_LINK([#include <netdb.h>],
> [struct addrinfo *g,h;g=h;getaddrinfo(,,g,g);],
> AC_TRY_RUN([
> #include <assert.h>
> #include <netdb.h>
> #include <sys/types.h>
> #ifndef AF_INET
> # include <sys/socket.h>
> #endif
> #ifdef __cplusplus
> extern "C"
> #endif
> char (*f) ();
> int main(void) { f = getaddrinfo; return 0; }
> ], ac_cv_func_getaddrinfo=yes, ac_cv_func_getaddrinfo=no),
> ac_cv_func_getaddrinfo=no)])
> if test $ac_cv_func_getaddrinfo = yes; then
> AC_DEFINE(HAVE_GETADDRINFO,1,[Define if you have the getaddrinfo function])
> fi
Rajesh R -- This space intentionally left non-blank. configure-in.patch Description: configure-in.patch ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
On Mon, 2006-01-16 at 08:32 +, Simon Riggs wrote: On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A couple of days ago I found myself wanting to aggregate 3 Billion tuples down to 100 Million tuples based on an integer key with six integer values -- six sum()'s. PostgreSQL ran out of memory with its Hash Aggregator and doing an old style Sort Sum took a fair amount of time to complete (cancelled the process after 24 hours -- small machine). Spilling to disk would be nice but I suspect the obvious method would thrash quite badly with non-sorted input. There is already hash table overflow (spill to disk) logic in HashJoins, so this should be possible by reusing that code for HashAggs. That's on my todo list, but I'd welcome any assistance. A question: Are the rows in your 3 B row table clumped together based upon the 100M row key? (or *mostly* so) We might also be able to They are randomly distributed. Fully sorting the data is quite painful. pre-aggregate the rows using a plan like HashAgg SortedAgg or SortedAgg Sort SortedAgg The first SortedAgg seems superfluous, buy would reduce the row volume considerably if incoming rows were frequently naturally adjacent, even if the values were not actually sorted. (This could also be done during sorting, but its much easier to slot the extra executor step into the plan). That might then reduce the size of the later sort, or allow it to become a HashAgg. I could make that manually enabled using enable_pre_agg to allow us to measure the effectiveness of that technique and decide what cost model we'd use to make it automatic. Would that help? I don't understand how this helps. The problem isn't the 3B data source rows but rather the 100M destination keys that are being aggregated against. The memory constraints of HashAgg are a result of the large number of target keys and should be the same if it was 100M rows or 10B rows. I think I need something closer to: HashAgg - HashSort (to disk) HashSort would create a number of files on disk with similar data. Grouping all similar keys into a single temporary file which HashAgg can deal with individually (100 loops by 1M target keys instead of 1 loop by 100M target keys). The results would be the same as partitioning by keyblock and running a HashAgg on each partition, but it would be handled by the Executor rather than by client side code. I've written something similar using a client and COPY with temporary tables. Even with the Export/Import copy I still beat the SortSum method PostgreSQL falls back to. You can get round this now by chopping the larger table into pieces with a WHERE clause and then putting them back together with a UNION. If the table is partitioned, then do this by partitions. True, except this results in several sequential scans over the source data. I can extract and sort in a single pass at client side but it would be far better if I could get PostgreSQL to do the same. I could probably write a plpgsql function to do that logic but it would be quite messy. This should also help when it comes to recalculating the sums again in the future, since you'll only need to rescan the rows that have been added since the last summation. We store the aggregated results and never do this type of calculation on that dataset again. The original dataset comes from about 300 partitions (time and source) and they are removed upon completion. While this calculation is being performed additional partitions are added. 
I suppose I could store source data in 300 * 1000 partitions (Approx 300 batches times 1000 segments) but that would probably run into other problems. PostgreSQL probably has issues with that many tables. -- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] ScanKey representation for RowCompare index conditions
On Sun, Jan 15, 2006 at 06:03:12PM -0500, Tom Lane wrote: There's one nontrivial decision still to make about how to implement proper per-spec row-comparison operations, namely: how a row comparison ought to be represented in the index access method API. The current representation of index conditions in the AM API is an array of ScanKey structs, one per indexcol op constant index condition, with implicit ANDing across all the conditions. There are various bits of code that require the ScanKeys to appear in order by index column, though this isn't inherent in the data structure itself. ISTM that row-wise comparisons, as far as indexes are concerned, are actually simpler than normal scan-keys. For example, if you have the condition (a,b) >= (5,1) then once the index has found that point, every subsequent entry in the index matches (except possibly NULLs). So you really don't actually need a row-comparison at any point after that. Now, if you have bracketing conditions like (a,b) BETWEEN (5,1) and (7,0) it gets more interesting. Ideally you'd want to pass both of these to the index so it knows when to stop. This really suggests using your plan B since you then have the possibility of passing both, which you don't really have with plan A. The latter also allows you to give more auxiliary conditions like b < 4. So it seems like a good idea. No plan C I can think of... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. signature.asc Description: Digital signature
Re: [HACKERS] ScanKey representation for RowCompare index conditions
Simon Riggs [EMAIL PROTECTED] writes: On Sun, 2006-01-15 at 18:03 -0500, Tom Lane wrote: There's one nontrivial decision still to make about how to implement proper per-spec row-comparison operations, namely: how a row comparison ought to be represented in the index access method API. I'm not sure I understand why. Surely a row wise comparison can take advantage of indexes on some of its columns? Not until we write the code to let it do so. Currently, what the system can match to an index on (a,b) is conditions like WHERE a >= x AND b >= y, which is quite a different animal both semantically and representationally from WHERE (a,b) >= (x,y). (At least, assuming that one imputes the correct SQL-spec meaning to the latter, viz a > x OR (a = x AND b >= y).) Because of the semantic difference, we have to preserve the difference when passing the condition to the index AM, thus the need for my question. So why would we support this at all? Example applications here and here: http://archives.postgresql.org/pgsql-performance/2004-07/msg00188.php http://archives.postgresql.org/pgsql-performance/2005-12/msg00590.php (which demonstrate BTW the reasons for not trying to handle this by just expanding out the OR/AND equivalents). regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Docs off on ILIKE indexing?
http://www.postgresql.org/docs/8.1/static/indexes-types.html says: The optimizer can also use a B-tree index for queries involving the pattern matching operators LIKE, ILIKE, ~, and ~*, if the pattern is a constant and is anchored to the beginning of the string - for example, col LIKE 'foo%' or col ~ '^foo', but not col LIKE '%bar'. But really, does it use indexes for ILIKE? (And I assume the same holds for case-insensitive regexp matching) (If it does, can someone enlighten me on what I have to do - I have a system with C locale that refuses to do it for ILIKE, but works just fine for LIKE. My workaround for now is to create an index on lower(foo) and then use WHERE lower(foo) LIKE 'bar%' which works fine - but it does require an extra index..) So. Am I off, or are the docs? Or is it just me who can't read ;-) //Magnus ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] ScanKey representation for RowCompare index conditions
Martijn van Oosterhout kleptog@svana.org writes: ISTM that row-wise comparisons, as far as indexes are concerned, are actually simpler than normal scan-keys. For example, if you have the condition (a,b) >= (5,1) then once the index has found that point, every subsequent entry in the index matches (except possibly NULLs). So you really don't actually need a row-comparison at any point after that. Well, yeah you do, eg if the caller starts demanding backward fetch; you need to be able to tell where to stop. In any case it's certainly necessary to pass down the whole row-condition into the index AM; without that, the closest you could get would be an indexscan on a >= 5 which might have to scan an undesirably large number of rows before reaching the first one with b >= 1. Now, if you have bracketing conditions like (a,b) BETWEEN (5,1) and (7,0) it gets more interesting. Ideally you'd want to pass both of these to the index so it knows when to stop. This really suggests using your plan B since you then have the possibility of passing both, which you don't really have with plan A. The latter also allows you to give more auxiliary conditions like b < 4. No, you misunderstood my plan A --- it's not giving up any of these options. But it's paying for it in complexity. I was envisioning a case like the above as being represented this way:
sk_flags      sk_attno    sk_func    sk_datum
ROW_START     a           >=         5
ROW_END       b           >=         1
ROW_START     a           <=         7
ROW_END       b           <=         0
normal        b           <          4
Hence my comments about weird rules for the ordering of entries under plan A --- the normal and ROW_START entries would be in order by attno, but the row continuation/end entries wouldn't appear to follow this global ordering. They'd follow a local order within each row condition, instead. Since you didn't understand what I was saying, I suspect that plan A is too confusing ... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
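[Editor's illustration] To make the layout and the intended semantics concrete, here is a small self-contained toy in C. This is not PostgreSQL code; every type, flag and function name is invented for the illustration. It encodes the example above as a flat key array with ROW_START/ROW_END markers and evaluates it with the SQL-spec row-comparison semantics described earlier in the thread (the first unequal column decides, using the strict form of the operator); NULLs are ignored in this toy.

/*
 * Toy only (not PostgreSQL code; all names are made up).
 * A flat array of key entries; a row comparison spans the entries from a
 * ROW_START flag up to the matching ROW_END flag.
 */
#include <stdbool.h>
#include <stdio.h>

enum { ROW_START = 1, ROW_END = 2 };

typedef struct
{
    int  flags;                 /* 0 = ordinary single-column key */
    int  attno;                 /* 1-based column number */
    char op;                    /* '<', '>', 'l' (<=), 'g' (>=), '=' */
    int  value;
} Key;

static bool cmp(int a, char op, int b)
{
    switch (op)
    {
        case '<': return a < b;
        case '>': return a > b;
        case 'l': return a <= b;
        case 'g': return a >= b;
        default:  return a == b;
    }
}

/* Evaluate the row comparison starting at keys[i]; return the index just past it. */
static int check_row(const Key *keys, int i, const int *tup, bool *result)
{
    for (;; i++)
    {
        int  col = tup[keys[i].attno - 1];
        bool last = (keys[i].flags & ROW_END) != 0;

        if (col != keys[i].value)       /* first unequal column decides, strict form */
            *result = cmp(col, keys[i].op == 'l' ? '<' :
                               keys[i].op == 'g' ? '>' : keys[i].op,
                          keys[i].value);
        else if (last)                  /* all columns equal: apply the operator itself */
            *result = cmp(col, keys[i].op, keys[i].value);
        else
            continue;                   /* equal so far, look at the next member */

        while (!(keys[i].flags & ROW_END))
            i++;                        /* skip the rest of this row condition */
        return i + 1;
    }
}

int main(void)
{
    /* (a,b) >= (5,1) AND (a,b) <= (7,0) AND b < 4, as in the table above */
    Key  keys[] = {
        {ROW_START, 1, 'g', 5}, {ROW_END, 2, 'g', 1},
        {ROW_START, 1, 'l', 7}, {ROW_END, 2, 'l', 0},
        {0,         2, '<', 4},
    };
    int  nkeys = 5;
    int  tup[2] = {6, 3};       /* a = 6, b = 3 */
    bool ok = true, r = true;
    int  i = 0;

    while (ok && i < nkeys)
    {
        if (keys[i].flags & ROW_START)
            i = check_row(keys, i, tup, &r);
        else
        {
            r = cmp(tup[keys[i].attno - 1], keys[i].op, keys[i].value);
            i++;
        }
        ok = ok && r;
    }
    printf("tuple %s\n", ok ? "matches" : "does not match");
    return 0;
}

For instance, the tuple (a=6, b=3) passes all three conditions, while (a=5, b=0) fails because (5,0) >= (5,1) is false under the spec semantics.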
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A couple of days ago I found myself wanting to aggregate 3 Billion tuples down to 100 Million tuples based on an integer key with six integer values -- six sum()'s. There is already hash table overflow (spill to disk) logic in HashJoins, so this should be possible by reusing that code for HashAggs. That's on my todo list, but I'd welcome any assistance. Yeah, I proposed something similar awhile back in conjunction with fixing the spill logic for hash joins (which was always there, but was not adaptive until recently). I got the join part done but got distracted before fixing HashAgg :-( In principle, you just reduce the range of currently-in-memory hash codes whenever you run low on memory. The so-far-accumulated working state for aggregates that are not in the range anymore goes into a temp file, and subsequently any incoming tuples that hash outside the range go into another temp file. After you've completed the scan, you finalize and emit the aggregates that are still in memory, then you pick up the first set of dropped aggregates, rescan the associated TODO file of unprocessed tuples, lather rinse repeat till done. The tricky part is to preserve the existing guarantee that tuples are merged into their aggregate in arrival order. (This does not matter for the standard aggregates but it definitely does for custom aggregates, and there will be unhappy villagers appearing on our doorsteps if we break it.) I think this can work correctly under the above sketch but it needs to be verified. It might require different handling of the TODO files than what hashjoin does. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
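[Editor's illustration] For anyone wanting to play with the idea, here is a deliberately simplified, self-contained sketch in C of the batching scheme: a fixed number of batches chosen up front rather than the adaptive range-shrinking described above, toy data structures throughout, and none of it is executor code.

/*
 * Toy spill-to-disk aggregation: sum(val) grouped by key, holding only one
 * batch of groups in memory at a time.  Rows whose key hashes to a later
 * batch are appended to that batch's temp file and processed in a later pass.
 */
#include <stdio.h>
#include <stdlib.h>

#define NBATCH 4                 /* number of batches / extra passes */
#define NKEYS  100000            /* toy grouping-key domain */
#define NROWS  1000000

typedef struct { int key; int val; } Row;

int main(void)
{
    static long sum[NKEYS], cnt[NKEYS];   /* per-group state for ONE batch */
    FILE *todo[NBATCH] = {NULL};
    long  ngroups;
    int   b, i;

    for (b = 1; b < NBATCH; b++)
        if ((todo[b] = tmpfile()) == NULL)
            return 1;

    srand(42);

    /* pass 0: read the input once; aggregate batch 0 in memory, spill the rest */
    for (i = 0; i < NROWS; i++)
    {
        Row r = { rand() % NKEYS, rand() % 100 };
        int rb = (unsigned) r.key % NBATCH;

        if (rb == 0)
        {
            sum[r.key] += r.val;
            cnt[r.key]++;
        }
        else
            fwrite(&r, sizeof(r), 1, todo[rb]);   /* written in arrival order */
    }

    /* finalize batch 0, then re-read and aggregate each spilled batch in turn */
    for (b = 0; b < NBATCH; b++)
    {
        if (b > 0)
        {
            Row r;

            for (i = 0; i < NKEYS; i++)
                sum[i] = cnt[i] = 0;
            rewind(todo[b]);
            while (fread(&r, sizeof(r), 1, todo[b]) == 1)
            {
                sum[r.key] += r.val;
                cnt[r.key]++;
            }
        }
        ngroups = 0;
        for (i = 0; i < NKEYS; i++)
            if (cnt[i] > 0)
                ngroups++;       /* here one would emit (key, sum[key]) */
        printf("batch %d: %ld groups\n", b, ngroups);
    }
    return 0;
}

The point to notice is that each spill file is written and re-read strictly sequentially, which is what preserves arrival order within every group, the guarantee discussed above.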
Re: [HACKERS] [GENERAL] [PATCH] Better way to check for getaddrinfo function.
R, Rajesh (STSD) [EMAIL PROTECTED] writes: Just thought that the following patch might improve checking for getaddrinfo function (in configure.in) Since AC_TRY_RUN tests cannot work in cross-compilation scenarios, you need an *extremely* good reason to put one in. "I thought this might improve things" doesn't qualify. Exactly what problem are you trying to solve and why is a run-time test necessary? Why doesn't the existing coding work for you? regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Docs off on ILIKE indexing?
Magnus Hagander [EMAIL PROTECTED] writes: http://www.postgresql.org/docs/8.1/static/indexes-types.html says: The optimizer can also use a B-tree index for queries involving the pattern matching operators LIKE, ILIKE, ~, and ~*, if the pattern is a constant and is anchored to the beginning of the string - for example, col LIKE 'foo%' or col ~ '^foo', but not col LIKE '%bar'. But really, does it use indexes for ILIKE? That's pretty poorly phrased. For ILIKE it'll only work if there's a prefix of the pattern that's not letters (and hence is unaffected by the case-folding issue). regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Warm-up cache may have its virtue
On Sat, Jan 14, 2006 at 04:13:56PM -0500, Qingqing Zhou wrote: Qingqing Zhou [EMAIL PROTECTED] wrote I wonder if we should really implement file-system-cache-warmup strategy which we have discussed before. There are two natural good places to do this: (1) sequentail scan (2) bitmap index scan For the sake of memory, there is a third place a warm-up cache or pre-read is beneficial (OS won't help us): (3) xlog recovery Wouldn't it be better to improve pre-reading data instead, ie, making sure things like seqscan and bitmap scan always keep the IO system busy? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Improving N-Distinct estimation by ANALYZE
On Fri, Jan 13, 2006 at 11:37:38PM -0500, Tom Lane wrote: Josh Berkus josh@agliodbs.com writes: It's also worth mentioning that for datatypes that only have an = operator the performance of compute_minimal_stats is O(N^2) when values are unique, so increasing sample size is a very bad idea in that case. Hmmm ... does ANALYZE check for UNIQUE constraints? Our only implementation of UNIQUE constraints is btree indexes, which require more than an = operator, so this seems irrelevant. IIRC, the point was that if we know a field has to be unique, there's no sense in doing that part of the analysis on it; you'd only care about correlation. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] source documentation tool doxygen
On Mon, 2006-01-16 at 07:57 -0600, Andrew Dunstan wrote: I too have done this. But retrofitting Doxygen style comments to the PostgreSQL source code would be a big undertaking. Maintaining it, which would be another task for reviewers/committers, would also be a pain unless there were some automated checking tool. I don't think it would be all that painful. There would be no need to convert the entire source tree to use proper Doxygen-style comments in one fell swoop: individual files and modules can be converted whenever anyone gets the inclination to do so. I don't think the maintenance burden would be very substantial, either. -Neil ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] source documentation tool doxygen
Neil Conway [EMAIL PROTECTED] writes: I don't think it would be all that painful. There would be no need to convert the entire source tree to use proper Doxygen-style comments in one fell swoop: individual files and modules can be converted whenever anyone gets the inclination to do so. I don't think the maintenance burden would be very substantial, either. In the previous go-round on this topic, I seem to recall some concern about side-effects, to wit reducing the readability of the comments for ordinary non-doxygen code browsing. I'd be quite against taking any noticeable hit in that direction. A quick look through the doxygen manual doesn't make it sound too invasive, but I am worried about how well it will coexist with pgindent. It seems both tools think they can dictate the meaning of the characters immediately after /* of a comment block. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
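[Editor's illustration] For what it's worth, the per-comment delta is usually tiny. A sketch of the difference follows; the function and parameter names are invented, @param/@return are standard Doxygen tags, and the exact brief-description behaviour depends on the Doxyfile settings.

/*
 * Existing style: handled by pgindent, invisible to Doxygen.
 *
 * _bt_example -- check whether an index tuple satisfies the scan keys
 */

/**
 * Doxygen style: the second '*' turns the block into a documentation
 * comment; everything else can stay as it is.
 *
 * @param scan   the index scan descriptor (illustrative parameter names)
 * @param tuple  the index tuple to test
 * @return true if the tuple satisfies all scan keys
 */

Whether pgindent and Doxygen agree about what may follow the opening /* is exactly the coexistence question raised above.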
Re: [HACKERS] Surrogate keys (Was: enums)
On Sat, Jan 14, 2006 at 07:28:21PM +0900, Michael Glaesemann wrote: On Jan 13, 2006, at 21:42 , Leandro Guimar?es Faria Corcete DUTRA wrote: If you still declare the natural key(s) as UNIQUEs, you have just made performance worse. Now there are two keys to be checked on UPDATEs and INSERTs, two indexes to be updated, and probably a SEQUENCE too. For UPDATEs and INSERTs, the proper primary key also needs to be checked, but keys are used for more than just checking uniqueness: they're also often used in JOINs. Joining against a single integer I'd think it quite a different proposition (I'd think faster in terms of performance) than joining against, say, a text column or a composite key. a) the optimizer does a really poor job on multi-column index statistics b) If each parent record will have many children, the space savings from using a surrogate key can be quite large c) depending on how you view things, putting actual keys all over the place is denormalized Generally, I just use surrogate keys for everything unless performance dictates something else. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
On Mon, 2006-01-16 at 09:42 -0500, Rod Taylor wrote: On Mon, 2006-01-16 at 08:32 +, Simon Riggs wrote: On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A question: Are the rows in your 3 B row table clumped together based upon the 100M row key? (or *mostly* so) We might also be able to They are randomly distributed. Fully sorting the data is quite painful. ... I don't understand how this helps. It wouldn't since your rows are randomly distributed. The idea was not related to improving HashAgg, but to improving Aggregation for the case of naturally grouped data. I think I need something closer to: HashAgg - HashSort (to disk) HashSort would create a number of files on disk with similar data. Grouping all similar keys into a single temporary file which HashAgg can deal with individually (100 loops by 1M target keys instead of 1 loop by 100M target keys). The results would be the same as partitioning by keyblock and running a HashAgg on each partition, but it would be handled by the Executor rather than by client side code. I've written something similar using a client and COPY with temporary tables. Even with the Export/Import copy I still beat the SortSum method PostgreSQL falls back to. That is exactly how the spill to disk logic works for HashJoin (and incidentally, identical to an Oracle one-pass hash join since both are based upon the hybrid hash join algorithm). Multi-pass would only be required to handle very skewed hash distributions, which HJ doesn't do yet. So yes, this can be done. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
On Mon, 2006-01-16 at 12:36 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A couple of days ago I found myself wanting to aggregate 3 Billion tuples down to 100 Million tuples based on an integer key with six integer values -- six sum()'s. There is already hash table overflow (spill to disk) logic in HashJoins, so this should be possible by reusing that code for HashAggs. That's on my todo list, but I'd welcome any assistance. Yeah, I proposed something similar awhile back in conjunction with fixing the spill logic for hash joins (which was always there, but was not adaptive until recently). I got the join part done but got distracted before fixing HashAgg :-( You've done the main work. :-) The tricky part is to preserve the existing guarantee that tuples are merged into their aggregate in arrival order. (This does not matter for the standard aggregates but it definitely does for custom aggregates, and there will be unhappy villagers appearing on our doorsteps if we break it.) I think this can work correctly under the above sketch but it needs to be verified. It might require different handling of the TODO files than what hashjoin does. For HJ we write each outer tuple to its own file-per-batch in the order they arrive. Reading them back in preserves the original ordering. So yes, caution required, but I see no difficulty, just reworking the HJ code (nodeHashjoin and nodeHash). What else do you see? Best Regards, Simon Riggs ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] PostgreSQL win32 NT4
[EMAIL PROTECTED] wrote: NT4 is officially dead, IMHO no need for PostgreSQL to officially support it, let's leave place for companies offering commercial postgresql versions to work on it if they have enough customer requests. BTW Win 2000 is more or less 6 years old now ... Agreed. If they are using NT4 they certainly should be happy using PostgreSQL 8.1.X. So, we will have an NT4 solution, it just won't be our most recent major release. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] PostgreSQL win32 NT4
Bruce Momjian wrote: [EMAIL PROTECTED] wrote: NT4 is officially dead, IMHO no need for PostgreSQL to officially support it, let's leave place for companies offering commercial postgresql versions to work on it if they have enough customer requests. BTW Win 2000 is more or less 6 years old now ... I believe Microsoft has an official 10 year support policy. Although they don't backport all features (which makes sense). Agreed. If they are using NT4 they certainly should be happy using PostgreSQL 8.1.X. So, we will have an NT4 solution, it just won't be our most recent major release. -- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: PLphp, PLperl - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] PostgreSQL win32 NT4
NT4 is officially dead, IMHO no need for PostgreSQL to officially support it, let's leave place for companies offering commercial postgresql versions to work on it if they have enough customer requests. BTW Win 2000 is more or less 6 years old now ... I believe Microsoft has an official 10 year support policy. Although the don't backport all features (which makes sense). They're 10 years for extended support. Only 5 years for mainstream support, meaning it has already passed. The difference? You can no longer get a hotfix unless you have a specific hotfix agreement, that you have to buy separate. You don't get any no-charge incident support (that would be support-because-of-a-bug). You don't get to make any feature requests (though seriously, they hardly ever listen to you unless you're really big anyway). And you don't get any warranty claims, whatever that means in reality. You do, however, get paid support and security hotfixes. //Magnus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
Simon Riggs [EMAIL PROTECTED] writes: For HJ we write each outer tuple to its own file-per-batch in the order they arrive. Reading them back in preserves the original ordering. So yes, caution required, but I see no difficulty, just reworking the HJ code (nodeHashjoin and nodeHash). What else do you see? With dynamic adjustment of the hash partitioning, some tuples will go through multiple temp files before they ultimately get eaten, and different tuples destined for the same aggregate may take different paths through the temp files depending on when they arrive. It's not immediately obvious that ordering is preserved when that happens. I think it can be made to work but it may take different management of the temp files than hashjoin uses. (Worst case, we could use just a single temp file for all unprocessed tuples, but this would result in extra I/O.) regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Improving N-Distinct estimation by ANALYZE
On Mon, 2006-01-16 at 12:26 -0600, Jim C. Nasby wrote: On Fri, Jan 13, 2006 at 11:37:38PM -0500, Tom Lane wrote: Josh Berkus josh@agliodbs.com writes: It's also worth mentioning that for datatypes that only have an = operator the performance of compute_minimal_stats is O(N^2) when values are unique, so increasing sample size is a very bad idea in that case. Hmmm ... does ANALYZE check for UNIQUE constraints? Our only implementation of UNIQUE constraints is btree indexes, which require more than an = operator, so this seems irrelevant. IIRC, the point was that if we know a field has to be unique, there's no sense in doing that part of the analysis on it; you'd only care about correlation. An interesting point: if we know the cardinality by definition there is no need to ANALYZE for that column. An argument in favour of the definitional rather than the discovery approach. As a designer I very often already know the cardinality by definition, so I would appreciate the opportunity to specify that to the database. Tom has not spoken against checking for UNIQUE constraints: he is just pointing out that there never could be a constraint in the case I was identifying. For other datatypes we could check for them rather than analyzing the sample, but the effect would be the same either way. My original point was that behaviour is O(N^2) when unique; it is still O(N^2) when nearly unique, albeit O(N^2) - O(N). Reducing stats_target does not help there - the effect varies according to number of rows in the sample, not num slots in MCV list. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Improving N-Distinct estimation by ANALYZE
On Fri, 13 Jan 2006 19:18:29 +, Simon Riggs [EMAIL PROTECTED] wrote: I enclose a patch for checking out block sampling. Can't comment on the merits of block sampling and your implementation thereof. Just some nitpicking:
|! * Row Sampling: As of May 2004, we use the Vitter algorithm to create
Linking the use of the Vitter algorithm to May 2004 is not quite appropriate. We introduced two stage sampling at that time.
| * a random sample of targrows rows (or less, if there are less in the
|! * sample of blocks). In this case, targblocks is always the same as
|! * targrows, so we always read one row per block.
This is just wrong, unless you add "on average". Even then it is a bit misleading, because in most cases we *read* more tuples than we use.
| * Although every row has an equal chance of ending up in the final
| * sample, this sampling method is not perfect: not every possible
| * sample has an equal chance of being selected. For large relations
| * the number of different blocks represented by the sample tends to be
|! * too small. In that case, block sampling should be used.
Is the last sentence a fact or personal opinion?
| * block. The previous sampling method put too much credence in the row
| * density near the start of the table.
FYI, "previous" refers to the date mentioned above: previous == before May 2004 == before two stage sampling.
|+ /* Assume that we will have rows at least 64 bytes wide
|+  * Currently we're very unlikely to overflow availMem here
|+  */
|+ if ((allocrows * sizeof(HeapTuple)) > (allowedMem >> 4))
This is a funny way of saying if (allocrows * (sizeof(Heaptuple) + 60) > allowedMem). It doesn't match the comment above; and it is something different if the size of a pointer is not four bytes.
|- if (bs.m > 0)
|+ if (bs.m > 0 )
Oops.
|+ ereport(DEBUG2,
|+ (errmsg("ANALYZE attr %u sample: n=%u nmultiple=%u f1=%u d=%u",
|+ stats->tupattnum,samplerows, nmultiple, f1, d)));
^ missing space here and in some more places
I haven't been following the discussion too closely but didn't you say that a block sampling algorithm somehow compensates for the fact that the sample is not random? Servus Manfred ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Improving N-Distinct estimation by ANALYZE
Simon Riggs [EMAIL PROTECTED] writes: Tom has not spoken against checking for UNIQUE constraints: he is just pointing out that there never could be a constraint in the case I was identifying. More generally, arguing for or against any system-wide change on the basis of performance of compute_minimal_stats() is folly, because that function is not used for any common datatypes. (Ideally it'd never be used at all.) regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Anyone see a need for BTItem/HashItem?
I'm considering getting rid of the BTItem/BTItemData and HashItem/HashItemData struct definitions and just referencing IndexTuple(Data) directly in the btree and hash AMs. It appears that at one time in the forgotten past, there was some access-method-specific data in index entries in addition to the common IndexTuple struct, but that's been gone for a long time and I can't see a reason why either of these AMs would resurrect it. So this just seems like extra notational cruft to me, as well as an extra layer of palloc overhead (see eg _bt_formitem()). GIST already got rid of this concept, or never had it. Does anyone see a reason to keep this layer of struct definitions? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
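[Editor's note] For readers without the tree at hand, the wrapper in question is essentially an empty shell; roughly the following, paraphrased rather than quoted, so see src/include/access/nbtree.h and hash.h for the authoritative definitions:

/* roughly what access/nbtree.h declares at this point */
typedef struct BTItemData
{
    IndexTupleData bti_itup;    /* no btree-specific fields remain */
} BTItemData;

typedef BTItemData *BTItem;

/* access/hash.h carries the analogous HashItemData / HashItem pair */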
Re: [HACKERS] Anyone see a need for BTItem/HashItem?
On Mon, Jan 16, 2006 at 03:52:01PM -0500, Tom Lane wrote: I'm considering getting rid of the BTItem/BTItemData and HashItem/HashItemData struct definitions and just referencing IndexTuple(Data) directly in the btree and hash AMs. It appears that at one time in the forgotten past, there was some access-method-specific data in index entries in addition to the common IndexTuple struct, but that's been gone for a long time and I can't see a reason why either of these AMs would resurrect it. So this just seems like extra notational cruft to me, as well as an extra layer of palloc overhead (see eg _bt_formitem()). GIST already got rid of this concept, or never had it. Does anyone see a reason to keep this layer of struct definitions? If you cut it out, what will the heap and index access methods needed for SQL/MED use? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Anyone see a need for BTItem/HashItem?
David Fetter [EMAIL PROTECTED] writes: On Mon, Jan 16, 2006 at 03:52:01PM -0500, Tom Lane wrote: Does anyone see a reason to keep this layer of struct definitions? If you cut it out, what will the heap and index access methods needed for SQL/MED use? What's that have to do with this? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Anyone see a need for BTItem/HashItem?
From what I've seen, I don't think we need to keep them around. On 1/16/06, Tom Lane [EMAIL PROTECTED] wrote: I'm considering getting rid of the BTItem/BTItemData and HashItem/HashItemData struct definitions and just referencing IndexTuple(Data) directly in the btree and hash AMs. It appears that at one time in the forgotten past, there was some access-method-specific data in index entries in addition to the common IndexTuple struct, but that's been gone for a long time and I can't see a reason why either of these AMs would resurrect it. So this just seems like extra notational cruft to me, as well as an extra layer of palloc overhead (see eg _bt_formitem()). GIST already got rid of this concept, or never had it. Does anyone see a reason to keep this layer of struct definitions? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Anyone see a need for BTItem/HashItem?
On Mon, Jan 16, 2006 at 04:02:07PM -0500, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: On Mon, Jan 16, 2006 at 03:52:01PM -0500, Tom Lane wrote: Does anyone see a reason to keep this layer of struct definitions? If you cut it out, what will the heap and index access methods needed for SQL/MED use? What's that have to do with this? I'm sure you'll correct me if I'm mistaken, but this is a candidate for the spot where such interfaces--think of Informix's Virtual (Table|Index) Interface--would go. Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] equivalence class not working?
not sure if this is the right place to post... I am using postgres 8.1. In indxpath.c, it says Note: if Postgres tried to optimize queries by forming equivalence classes over equi-joined attributes (i.e., if it recognized that a qualification such as where a.b=c.d and a.b=5 could make use of an index on c.d), then we could use that equivalence class info here with joininfo_list to do more complete tests for the usability of a partial index. XXX as of 7.1, equivalence class info *is* available. Now I have the following two queries on TPC-H, where there is an index built on o_totalprice. explain select * from lineitem, orders where o_totalprice=l_extendedprice and l_extendedprice > 2000; explain select * from lineitem, orders where o_totalprice=l_extendedprice and l_extendedprice > 2000 and o_totalprice > 2000; The second query uses the index while the first does not. It seems to me that both queries are the same (the o_totalprice > 2000 in the second query can be inferred). Is there something that needs to be tuned or ...? thanks a lot!
Re: [HACKERS] Anyone see a need for BTItem/HashItem?
David Fetter [EMAIL PROTECTED] writes: On Mon, Jan 16, 2006 at 04:02:07PM -0500, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: If you cut it out, what will the heap and index access methods needed for SQL/MED use? What's that have to do with this? I'm sure you'll correct me if I'm mistaken, but this is a candidate for the spot where such interfaces--think of Informix's Virtual (Table|Index) Interface--would go. Can't imagine putting anything related to external-database access inside either the btree or hash AMs; it'd only make sense to handle it at higher levels. It's barely conceivable that external access would make sense as a specialized AM in its own right, but I don't see managing external links exclusively within the indexes. IOW, if we did need extra stuff in IndexTuples for external access, we'd want to put it inside IndexTuple, not in a place where it could only be seen by these AMs. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] equivalence class not working?
uwcssa [EMAIL PROTECTED] writes: I am using postgres 8.1. In indxpath.c, it says Note: if Postgres tried to optimize queries by forming equivalence classes over equi-joined attributes (i.e., if it recognized that a qualification such as where a.b=c.d and a.b=5 could make use of an index on c.d), then we could use that equivalence class info here with joininfo_list to do more complete tests for the usability of a partial index. XXX as of 7.1, equivalence class info *is* available. Are you deliberately ignoring the rest of the comment? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] ScanKey representation for RowCompare index conditions
On Mon, Jan 16, 2006 at 12:07:44PM -0500, Tom Lane wrote: Since you didn't understand what I was saying, I suspect that plan A is too confusing ... Umm, yeah. Now you've explained it I think it should be excluded on the basis that it'll be a source of bugs. For all the places that matter a row-condition needs to be treated as a whole and storing parts in a top level ScanKey array just seems like asking for trouble. Given no plan C, I guess plan B is the way to go...? -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. signature.asc Description: Digital signature
Re: [HACKERS] Anyone see a need for BTItem/HashItem?
On Mon, Jan 16, 2006 at 04:21:50PM -0500, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: On Mon, Jan 16, 2006 at 04:02:07PM -0500, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: If you cut it out, what will the heap and index access methods needed for SQL/MED use? What's that have to do with this? I'm sure you'll correct me if I'm mistaken, but this is a candidate for the spot where such interfaces--think of Informix's Virtual (Table|Index) Interface--would go. Can't imagine putting anything related to external-database access inside either the btree or hash AMs; it'd only make sense to handle it at higher levels. It's barely conceivable that external access would make sense as a specialized AM in its own right, but I don't see managing external links exclusively within the indexes. IOW, if we did need extra stuff in IndexTuples for external access, we'd want to put it inside IndexTuple, not in a place where it could only be seen by these AMs. Thanks for the explanation :) Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] source documentation tool doxygen
On Jan 17, 2006, at 3:51 , Tom Lane wrote: A quick look through the doxygen manual doesn't make it sound too invasive, but I am worried about how well it will coexist with pgindent. It seems both tools think they can dictate the meaning of the characters immediately after /* of a comment block. I haven't looked at it yet, but might there not be a way to have a preprocessing step where the current comment format is converted to something doxygen-friendly? pg_indent2doxygen or something? Then the current comment style could be maintained and doxygen could get what it likes. Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] ScanKey representation for RowCompare index conditions
Martijn van Oosterhout kleptog@svana.org writes: On Mon, Jan 16, 2006 at 12:07:44PM -0500, Tom Lane wrote: Since you didn't understand what I was saying, I suspect that plan A is too confusing ... Umm, yeah. Now you've explained it I think it should be excluded on the basis that it'll be a source of bugs. For all the places that matter a row-condition needs to be treated as a whole and storing parts in a top level ScanKey array just seems like asking for trouble. Actually, I'd just been coming back to plan A, after realizing that it's not possible to do this with zero change in _bt_checkkeys() after all. The amount of information passed down to an ordinary sk_func isn't sufficient to do what the row-compare subroutine would need to do: it needs access to the whole indextuple, not only the first column's value, plus access to the direction and continuescan variables. We could add extra parameters to what gets passed, but that's certainly no cheaper than adding a test on some sk_flags bits. And having the row comparison logic inside bt_checkkeys is probably more understandable than having a separate subroutine with strange calling conventions. I think given adequate documentation in skey.h, either representation is about the same as far as the source-of-bugs argument goes. The main risk you get from a Plan-B approach is that it's easier to fail to notice bugs of omission, where a piece of code looking at a ScanKey needs to do something special with a row-compare key and omits to do so. At least that's what I'm guessing at the moment --- I haven't actually tried to code up any of the changes yet. In any case, since we are only intending to support this for searches of btree indexes, there are not that many pieces of code that will be concerned with it. I think it's just _bt_first, _bt_preprocess_keys, _bt_checkkeys, and ExecIndexBuildScanKeys. Everyplace else will be dealing only in simple ScanKeys with no row comparisons. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] source documentation tool doxygen
On Tue, Jan 17, 2006 at 06:51:15AM +0900, Michael Glaesemann wrote: I haven't looked at it yet, but might there not be a way to have a preprocessing step where the current comment format is converted to something doxygen-friendly? pg_indent2doxygen or something? Then the current comment style could be maintained and doxygen could get what it likes. It's not about getting in all the comments, it's mostly about comments describing functions / macros / typedefs / structures / global variables and the like. Comments in function bodies will just stay, I guess. However we could try to find some de facto rules about comments to get most of the matches: for example, a comment starting on column 0 of the line is probably one you're looking for, whereas one that has some whitespace at the beginning is not. Of course this would still need manual review for every file. Joachim -- Joachim Wieland [EMAIL PROTECTED] C/ Usandizaga 12 1°B ICQ: 37225940 20002 Donostia / San Sebastian (Spain) GPG key available ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] source documentation tool doxygen
Wow, looks great. Is that URL stable? Can we link to it from the PostgreSQL developers page? http://www.postgresql.org/developer/coding --- Joachim Wieland wrote: I've created a browsable source tree documentation, it's done with the doxygen tool. http://www.mcknight.de/pgsql-doxygen/cvshead/html/ There was a discussion about this some time ago, Jonathan Gardner proposed it here: http://archives.postgresql.org/pgsql-hackers/2004-03/msg00748.php quite a few people found it useful but it somehow got stuck. The reason was apparently that you'd have to tweak your comments in order to profit from it as much as possible. However I still think it's a nice-to-have among the online documentation and it gives people quite new to the code the possibility to easily check what is where defined and how... What do you think? doxygen can also produce a pdf but I haven't succeeded in doing that so far, pdflatex keeps bailing out. Has anybody else succeeded building this yet? Joachim -- Joachim Wieland [EMAIL PROTECTED] C/ Usandizaga 12 1?B ICQ: 37225940 20002 Donostia / San Sebastian (Spain) GPG key available ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [pgsql-www] [HACKERS] source documentation tool doxygen
the only question I have ... is there any way of improving that right index? Love the 'detail pages', but somehow making the right index more 'tree like' might make navigation a bit easier ... On Mon, 16 Jan 2006, Bruce Momjian wrote: Wow, looks great. Is that URL stable? Can we link to it from the PostgreSQL developers page? http://www.postgresql.org/developer/coding --- Joachim Wieland wrote: I've created a browsable source tree documentation, it's done with the doxygen tool. http://www.mcknight.de/pgsql-doxygen/cvshead/html/ There was a discussion about this some time ago, Jonathan Gardner proposed it here: http://archives.postgresql.org/pgsql-hackers/2004-03/msg00748.php quite a few people found it useful but it somehow got stuck. The reason was apparently that you'd have to tweak your comments in order to profit from it as much as possible. However I still think it's a nice-to-have among the online documentation and it gives people quite new to the code the possibility to easily check what is where defined and how... What do you think? doxygen can also produce a pdf but I haven't succeeded in doing that so far, pdflatex keeps bailing out. Has anybody else succeeded building this yet? Joachim -- Joachim Wieland [EMAIL PROTECTED] C/ Usandizaga 12 1?B ICQ: 37225940 20002 Donostia / San Sebastian (Spain) GPG key available ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] equivalence class not working?
Fine. The rest of the comment says: For now, the test only uses restriction clauses (those in restrictinfo_list). --Nels, Dec '92. However, I understand it as being overridden by the follow-up, which is: XXX as of 7.1, equivalence class info *is* available. Consider improving this code as foreseen by Nels. Therefore, equivalence class info should be detected and used for index selection... or could anyone tell me whether after 7.1 PostgreSQL has decided not to use equi-joins for index selection? On 1/16/06, Tom Lane [EMAIL PROTECTED] wrote: uwcssa [EMAIL PROTECTED] writes: I am using postgres 8.1. In indxpath.c, it says: Note: if Postgres tried to optimize queries by forming equivalence classes over equi-joined attributes (i.e., if it recognized that a qualification such as where a.b=c.d and a.b=5 could make use of an index on c.d), then we could use that equivalence class info here with joininfo_list to do more complete tests for the usability of a partial index. ... XXX as of 7.1, equivalence class info *is* available. Are you deliberately ignoring the rest of the comment? regards, tom lane
Re: [pgsql-www] [HACKERS] source documentation tool doxygen
On Jan 17, 2006, at 8:40 , Marc G. Fournier wrote: the only question I have ... is there any way of improving that right index? Love the 'detail pages', but somehow making the right index more 'tree like' might make navigation a bit easier ... Along those lines, I wonder if a CSS couldn't be worked up to integrate the look with the rest of the site. Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Improving N-Distinct estimation by ANALYZE
On Mon, 2006-01-16 at 21:24 +0100, Manfred Koizar wrote: On Fri, 13 Jan 2006 19:18:29 +, Simon Riggs [EMAIL PROTECTED] wrote: I enclose a patch for checking out block sampling. Can't comment on the merits of block sampling and your implementation thereof. But your thoughts are welcome... |! * Row Sampling: As of May 2004, we use the Vitter algorithm to create Linking the use of the Vitter algorithm to May 2004 is not quite appropriate. We introduced two stage sampling at that time. I'll redo some of the comments as you suggest and review coding points. | * a random sample of targrows rows (or less, if there are less in the |! * sample of blocks). In this case, targblocks is always the same as |! * targrows, so we always read one row per block. This is just wrong, unless you add on average. Even then it is a bit misleading, because in most cases we *read* more tuples than we use. The current code reads a number of blocks == targrows, hence one per block, but I'll change the comment. | * Although every row has an equal chance of ending up in the final | * sample, this sampling method is not perfect: not every possible | * sample has an equal chance of being selected. For large relations | * the number of different blocks represented by the sample tends to be |! * too small. In that case, block sampling should be used. Is the last sentence a fact or personal opinion? That is my contention, based upon research. The existing code clearly identifies that case as requiring a solution. We use Chaudhuri et al's 1998 result for the sufficiency of sample size, yet overlook their estimators and their ideas for random block selection. My first point is that we need a proportional sample size, not a fixed size. [I show that for large tables we currently use a sample that is 2-5 times too small, even based on Chaudhuri's work.] ANALYZE is very I/O bound: random row sampling would need to scan the whole table to take a 2-3% sample in most cases. Block sampling is a realistic way of getting a larger sample size without loss of performance, and also a way of getting a reasonable sample when accessing very few blocks. We would keep random row sampling for smaller relations where the performance effects of sample collection are not noticeable. I haven't been following the discussion too closely but didn't you say that a block sampling algorithm somehow compensates for the fact that the sample is not random? I haven't said that block sampling compensates for the sample being non-random. There is a danger of bias in the sample and I made that clear. I've tried to mitigate that by the use of random blocks, reusing your code and taking note of the earlier problems you solved. However, block sampling allows us to spot cases where rows are naturally grouped together, which row sampling doesn't spot until very high sample sizes. I explored in detail a common case where this causes the current method to fall down very badly. Brutlag & Richardson [2000] give a good rundown on block sampling and I'm looking to implement some of their block estimators to complement the current patch. (Credit to Andrew Dunstan for locating the paper...) http://www.stat.washington.edu/www/research/reports/1999/tr355.ps [I did briefly explore the possibility of hybrid row/block sampling, using row sampling as well as some block sampling to identify grouping, with an attempt to avoid swamping the sample with biased rows; but I don't have a strong basis for that, so I've ignored that for now.]
This patch allows us to explore the possible bias that block sampling gives, allowing us to make a later decision to include it, or not. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
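To make the block-selection idea above concrete, here is a toy C sketch. It is not the ANALYZE code or the patch; the relation size, sample size, and the use of rand() are invented for illustration. The point is simply that the sampler picks a set of distinct random block numbers up front and then reads every row in just those blocks, instead of probing one random row per visited block.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Toy sketch only: choose "targblocks" distinct random block numbers out
     * of "nblocks" with a partial Fisher-Yates shuffle, then pretend to read
     * every row in each chosen block. */
    int
    main(void)
    {
        int     nblocks = 10000;    /* hypothetical relation size in blocks */
        int     targblocks = 300;   /* hypothetical sample size in blocks */
        int    *blocks = malloc(nblocks * sizeof(int));
        int     i;

        if (blocks == NULL)
            return 1;
        srand((unsigned) time(NULL));
        for (i = 0; i < nblocks; i++)
            blocks[i] = i;

        /* after this loop, blocks[0..targblocks-1] is a random subset */
        for (i = 0; i < targblocks; i++)
        {
            int     j = i + rand() % (nblocks - i);
            int     tmp = blocks[i];

            blocks[i] = blocks[j];
            blocks[j] = tmp;
        }

        for (i = 0; i < targblocks; i++)
            printf("would read every row in block %d\n", blocks[i]);

        free(blocks);
        return 0;
    }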
Re: [pgsql-www] [HACKERS] source documentation tool doxygen
This was my plan all along, just been waiting for someone to make it work with the postgresql code and then send instructions to the postgresql web team on how to set it up. Robert Treat On Monday 16 January 2006 18:32, Bruce Momjian wrote: Wow, looks great. Is that URL stable? Can we link to it from the PostgreSQL developers page? http://www.postgresql.org/developer/coding --- Joachim Wieland wrote: I've created a browsable source tree documentation, it's done with the doxygen tool. http://www.mcknight.de/pgsql-doxygen/cvshead/html/ There was a discussion about this some time ago, Jonathan Gardner proposed it here: http://archives.postgresql.org/pgsql-hackers/2004-03/msg00748.php quite a few people found it useful but it somehow got stuck. The reason was apparently that you'd have to tweak your comments in order to profit from it as much as possible. However I still think it's a nice-to-have among the online documentation and it gives people quite new to the code the possibility to easily check what is where defined and how... What do you think? doxygen can also produce a pdf but I haven't succeeded in doing that so far, pdflatex keeps bailing out. Has anybody else succeeded building this yet? Joachim -- Joachim Wieland [EMAIL PROTECTED] C/ Usandizaga 12 1?B ICQ: 37225940 20002 Donostia / San Sebastian (Spain) GPG key available ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
On Mon, 2006-01-16 at 14:43 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: For HJ we write each outer tuple to its own file-per-batch in the order they arrive. Reading them back in preserves the original ordering. So yes, caution required, but I see no difficulty, just reworking the HJ code (nodeHashjoin and nodeHash). What else do you see? With dynamic adjustment of the hash partitioning, some tuples will go through multiple temp files before they ultimately get eaten, and different tuples destined for the same aggregate may take different paths through the temp files depending on when they arrive. It's not immediately obvious that ordering is preserved when that happens. I think it can be made to work but it may take different management of the temp files than hashjoin uses. (Worst case, we could use just a single temp file for all unprocessed tuples, but this would result in extra I/O.) Sure hash table is dynamic, but we read all inner rows to create the hash table (nodeHash) before we get the outer rows (nodeHJ). Why would we continue to dynamically build the hash table after the start of the outer scan? (I see that we do this, as you say, but I am surprised). Best Regards, Simon Riggs ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
Simon Riggs [EMAIL PROTECTED] writes: Sure hash table is dynamic, but we read all inner rows to create the hash table (nodeHash) before we get the outer rows (nodeHJ). But our idea of the number of batches needed can change during that process, resulting in some inner tuples being initially assigned to the wrong temp file. This would also be true for hashagg. Why would we continue to dynamically build the hash table after the start of the outer scan? The number of tuples written to a temp file might exceed what we want to hold in memory; we won't detect this until the batch is read back in, and in that case we have to split the batch at that time. For hashagg this point would apply to the aggregate states not the input tuples, but it's still a live problem (especially if the aggregate states aren't fixed-size values ... consider a concat aggregate for instance). regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
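The re-splitting described here leans on how a tuple's batch number is derived from its hash value. Assuming (as hashjoin does) a power-of-two batch count taken from the hash bits, doubling the batch count only ever keeps a tuple in its batch or moves it into a newly created later batch, never an earlier one, which is why a batch that turns out too big can simply be re-split when it is read back in. A toy sketch, with made-up hash values:

    #include <stdio.h>
    #include <stdint.h>

    /* Toy sketch (not the hashjoin code): with a power-of-two number of
     * batches, batchno = hash & (nbatch - 1).  When nbatch doubles, a tuple
     * either keeps its batch number or moves to batchno + old_nbatch, never
     * to an earlier batch. */
    static int
    batchno(uint32_t hash, int nbatch)
    {
        return (int) (hash & (uint32_t) (nbatch - 1));
    }

    int
    main(void)
    {
        uint32_t hashes[] = {0x1234, 0xBEEF, 0xCAFE, 0xF00D, 0x0042};
        int     n = (int) (sizeof(hashes) / sizeof(hashes[0]));
        int     i;

        for (i = 0; i < n; i++)
            printf("hash %#x: nbatch=4 -> batch %d, nbatch=8 -> batch %d\n",
                   (unsigned) hashes[i],
                   batchno(hashes[i], 4), batchno(hashes[i], 8));
        return 0;
    }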
Re: [HACKERS] source documentation tool doxygen
On Monday 16 January 2006 10:51, Tom Lane wrote: Neil Conway [EMAIL PROTECTED] writes: I don't think it would be all that painful. There would be no need to convert the entire source tree to use proper Doxygen-style comments in one fell swoop: individual files and modules can be converted whenever anyone gets the inclination to do so. I don't think the maintenance burden would be very substantial, either. In the previous go-round on this topic, I seem to recall some concern about side-effects, to wit reducing the readability of the comments for ordinary non-doxygen code browsing. I'd be quite against taking any noticeable hit in that direction. A quick look through the doxygen manual doesn't make it sound too invasive, but I am worried about how well it will coexist with pgindent. It seems both tools think they can dictate the meaning of the characters immediately after /* of a comment block. In my experimentation, I found that there are really three paths for using Doxygen on code. (1) Do nothing to the code. Doxygen in this state works pretty well. You can browse the code and you can find references to things. Some of the comments are picked up correctly as-is. It does a really good job of formatting the code for HTML. (2) Do minimally invasive changes to make Doxygen recognize comments better. Example: Move comments from behind to in front, or vice versa. Put an extra '*' after '/*' to denote special documentation. This makes Doxygen especially useful, because most things already have proper documentation and bringing that out to the person who browses Doxygen's output is very good. All of a sudden, we have a manual that reads like a book describing how PostgreSQL is put together PLUS we still have the marked-up code. (3) Do maximally invasive changes to the comments. This includes things like using Doxygen-specific markup, putting in LaTeX for formulas, and putting in cross-references. This is a huge amount of work (initial + maintenance) and doesn't add much benefit beyond what (2) has. It makes the book read more like a book, but it was already pretty readable. It's not a question of WHETHER to use Doxygen. You can use it whether or not anyone else wants to. It's really a question of how far we bend backwards to accommodate Doxygen. I'd say we just stick to moving comments around and putting in the extra '*'s. Let's not encourage people to put in Doxygen markup that isn't obvious. If someone contributes code with Doxygen mark-up, we won't turn them down but we won't promise we'll keep their markup as pretty as it was when they submitted it. -- Jonathan Gardner [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: explain analyze is your friend
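For what option (2) looks like in practice, here is a small made-up C fragment. The only doxygen-specific changes are the extra '*' after '/*' and the trailing '/**< ... */' form for struct members; the struct and function names are invented purely for illustration.

    #include <stdio.h>

    /**
     * Per-scan bookkeeping for a toy sequential scan (invented example).
     */
    typedef struct ToyScanState
    {
        int     cur_block;      /**< block currently being scanned */
        int     rows_seen;      /**< number of rows returned so far */
    } ToyScanState;

    /**
     * toy_scan_next - return the next row number, or -1 when the scan is done.
     */
    static int
    toy_scan_next(ToyScanState *state, int total_rows)
    {
        if (state->rows_seen >= total_rows)
            return -1;
        return state->rows_seen++;
    }

    int
    main(void)
    {
        ToyScanState state = {0, 0};
        int     row;

        while ((row = toy_scan_next(&state, 3)) != -1)
            printf("row %d\n", row);
        return 0;
    }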
[HACKERS] FW: PGBuildfarm member firefly Branch HEAD Failed at Stage Check
PG Build Farm wrote: The PGBuildfarm member firefly had the following event on branch HEAD: Failed at Stage: Check The snapshot timestamp for the build that triggered this notification is: 2006-01-17 03:27:00 The specs of this machine are: OS: UnixWare / 7.1.4 Arch: i386 Comp: cc (SCO UDK) / 4.2 For more information, see http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=firefly&br=HEAD This is a Stats failure. I thought the latest change Tom put in would fix it. I guess not. How can I help? LER -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 512-248-2683 E-Mail: ler@lerctr.org US Mail: 430 Valona Loop, Round Rock, TX 78681-3683 US ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
Tom Lane [EMAIL PROTECTED] writes: Why would we continue to dynamically build the hash table after the start of the outer scan? The number of tuples written to a temp file might exceed what we want to hold in memory; we won't detect this until the batch is read back in, and in that case we have to split the batch at that time. For hashagg this point would apply to the aggregate states not the input tuples, but it's still a live problem (especially if the aggregate states aren't fixed-size values ... consider a concat aggregate for instance). For a hash aggregate would it be possible to rescan the original table instead of spilling to temporary files? Then when you run out of working memory you simply throw out half the hash table and ignore subsequent tuples that fall in those hash buckets. Then you rescan for the discarded hash bucket regions. This avoids having to do any disk writes at the expense possibly of additional reads. I think in terms of i/o it would be much faster in most cases. The downsides are: a) volatile aggregates or aggregates with side-effects would be confused by being executed twice. I'm not clear that volatile aggregate functions make any sense anyways though. b) I'm unclear whether rescanning the table could potentially find tuples in a different state than previous scans. If so then the idea doesn't work at all. But I don't think that's possible is it? The main problem is c) it may lose in terms of i/o for cases where the cardinality is low (ie, it's overflowing despite having low cardinality because the table is really really big too). But most cases will be spilling because the cardinality is high. So the table may be big but the spill files are nearly as big anyways and having to write and then read them means double the i/o. The upside of not having to write out temporary files is big. I find queries that require temporary sort files get hit with a *huge* performance penalty. Often an order of magnitude. Part of that could probably be mitigated by having the sort files on a separate spindle but I think it's always going to hurt especially if there are multiple operations spilling to disk simultaneously. -- greg ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
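A rough sketch of the rescan idea, purely illustrative: the input "table", the number of passes, and the key-range partitioning below are all invented. The shape of it is to aggregate in several passes over the input, each pass keeping in memory only the keys that fall in its share of the key space and simply skipping the rest instead of spilling them to disk.

    #include <stdio.h>

    #define NKEYS   8       /* pretend this is all that fits in work_mem */
    #define NPASSES 2       /* invented number of rescan passes */

    /* Toy sketch of "rescan instead of spill": each pass re-reads the input
     * and aggregates only the keys assigned to that pass. */
    int
    main(void)
    {
        int     keys[]   = {3, 7, 3, 2, 7, 3, 2, 5, 5, 3};
        int     values[] = {1, 2, 1, 5, 2, 1, 5, 4, 4, 1};
        int     ntuples = 10;
        int     pass;

        for (pass = 0; pass < NPASSES; pass++)
        {
            long    sums[NKEYS] = {0};
            int     seen[NKEYS] = {0};
            int     i;

            for (i = 0; i < ntuples; i++)
            {
                if (keys[i] % NPASSES != pass)
                    continue;       /* not this pass's key range: skip, no spill */
                sums[keys[i]] += values[i];
                seen[keys[i]] = 1;
            }

            for (i = 0; i < NKEYS; i++)
                if (seen[i])
                    printf("pass %d: key %d sum %ld\n", pass, i, sums[i]);
        }
        return 0;
    }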
Re: [HACKERS] source documentation tool doxygen
Bruce Momjian pgman@candle.pha.pa.us writes: Wow, looks great. Is that URL stable? Can we link to it from the PostgreSQL developers page? The thing seems to have only the very vaguest grasp on whether it is parsing C or C++ ... or should I say that it is convinced it is parsing C++ despite all evidence to the contrary? I'd be happier with the pretty pictures if they had some connection to reality. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
Greg Stark [EMAIL PROTECTED] writes: For a hash aggregate would it be possible to rescan the original table instead of spilling to temporary files? Sure, but the possible performance gain is finite and the possible performance loss is not. The original table could be an extremely expensive join. We'd like to think that the planner gets the input size estimate approximately right and so the amount of extra I/O caused by hash table resizing should normally be minimal. The cases where it is not right are *especially* not likely to be a trivial table scan as you are supposing. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)
On Mon, 2006-01-16 at 20:02 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Sure hash table is dynamic, but we read all inner rows to create the hash table (nodeHash) before we get the outer rows (nodeHJ). But our idea of the number of batches needed can change during that process, resulting in some inner tuples being initially assigned to the wrong temp file. This would also be true for hashagg. So we correct that before we start reading the outer table. Why would we continue to dynamically build the hash table after the start of the outer scan? The number of tuples written to a temp file might exceed what we want to hold in memory; we won't detect this until the batch is read back in, and in that case we have to split the batch at that time. For hashagg this point would apply to the aggregate states not the input tuples, but it's still a live problem (especially if the aggregate states aren't fixed-size values ... consider a concat aggregate for instance). OK, I see what you mean. Sounds like we should have a new definition for aggregates, "sort insensitive", that allows them to work when the input ordering does not affect the result, since that case can be optimised much better when using HashAgg. Since we know that applies to the common cases of SUM, AVG, etc. this will certainly help people. For sort-sensitive aggregates it sounds like we either: 1. Write to a single file, while we remember the start offset of the first row of each batch. 2. Write to multiple files, adding a globally incrementing sequenceid. Batches are then resorted on the sequenceid before processing. 3. We give up, delete the existing batches and restart the scan from the beginning of the outer table. Sounds like (1) is best, since the overflow just becomes a SortedAgg. But all of them sound ugly. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 6: explain analyze is your friend
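Option (1) amounts to remembering where each batch starts in a single spill file and seeking back to it later. A minimal sketch of that bookkeeping, with an invented fixed-size record format and batch sizes, using tmpfile/ftell/fseek:

    #include <stdio.h>

    /* Toy sketch of option (1): spill everything to one temp file, remembering
     * the offset at which each batch starts so a batch can be re-read later in
     * its original order. */
    int
    main(void)
    {
        FILE   *spill = tmpfile();
        long    batch_start[3];
        int     batch, i, value;

        if (spill == NULL)
            return 1;

        /* write three "batches" of fixed-size integer records */
        for (batch = 0; batch < 3; batch++)
        {
            batch_start[batch] = ftell(spill);
            for (i = 0; i < 4; i++)
            {
                value = batch * 100 + i;
                fwrite(&value, sizeof(value), 1, spill);
            }
        }

        /* later: seek back to batch 1 and re-read it in its original order */
        fseek(spill, batch_start[1], SEEK_SET);
        for (i = 0; i < 4; i++)
        {
            if (fread(&value, sizeof(value), 1, spill) == 1)
                printf("batch 1 record: %d\n", value);
        }

        fclose(spill);
        return 0;
    }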
Re: [HACKERS] [GENERAL] [PATCH] Better way to check for getaddrinfo function.
That was very much situation-specific. But the bottom line is that the default test does not include netdb.h in the test code, so pg uses getaddrinfo.c. And the getaddrinfo.c does not work for me: IPv6 client authentication fails. I have modified the patch. $ diff -r configure.in configure.in.new 918a919 AC_MSG_CHECKING([for getaddrinfo]) 920c921,926 AC_REPLACE_FUNCS([getaddrinfo]) --- AC_TRY_LINK([#include <netdb.h> #include <assert.h>], [char (*f)();f=getaddrinfo;], ac_cv_func_getaddrinfo=yes, ac_cv_func_getaddrinfo=no) if test x$ac_cv_func_getaddrinfo = xyes; then AC_DEFINE(HAVE_GETADDRINFO,1,[Define if you have the getaddrinfo function]) fi 923a930 AC_MSG_RESULT([$ac_cv_func_getaddrinfo]) Rajesh R -- This space intentionally left non-blank. -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED]] Sent: Monday, January 16, 2006 11:28 PM To: R, Rajesh (STSD) Cc: pgsql-hackers@postgresql.org; pgsql-general@postgresql.org Subject: Re: [GENERAL] [PATCH] Better way to check for getaddrinfo function. R, Rajesh (STSD) [EMAIL PROTECTED] writes: Just thought that the following patch might improve checking for getaddrinfo function (in configure.in) Since AC_TRY_RUN tests cannot work in cross-compilation scenarios, you need an *extremely* good reason to put one in. "I thought this might improve things" doesn't qualify. Exactly what problem are you trying to solve and why is a run-time test necessary? Why doesn't the existing coding work for you? regards, tom lane configure-in.patch Description: configure-in.patch ---(end of broadcast)--- TIP 6: explain analyze is your friend
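Separately from the configure-time test, one way to see whether the system getaddrinfo (rather than the fallback getaddrinfo.c) handles IPv6 at all is a tiny standalone program like the one below. This is not part of the patch, and the ::1 lookup is only an example.

    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>

    /* Minimal check of the system getaddrinfo(): ask for an IPv6 stream
     * address for ::1 and report what comes back. */
    int
    main(void)
    {
        struct addrinfo hints;
        struct addrinfo *res = NULL;
        int     rc;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_INET6;
        hints.ai_socktype = SOCK_STREAM;

        rc = getaddrinfo("::1", NULL, &hints, &res);
        if (rc != 0)
        {
            fprintf(stderr, "getaddrinfo failed: %s\n", gai_strerror(rc));
            return 1;
        }
        printf("getaddrinfo resolved ::1 (ai_family %d)\n", res->ai_family);
        freeaddrinfo(res);
        return 0;
    }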