Re: [HACKERS] Performance optimization of btree binary search
On Fri, Dec 6, 2013 at 4:53 PM, Peter Geoghegan p...@heroku.com wrote: I had considered that something like Intel Speedstep technology had a role here, but I'm pretty sure it steps up very aggressively when things are CPU bound - I tested that against a Core 2 Duo desktop a couple of years back, where it was easy to immediately provoke it by moving around desktop windows or something.

I decided to change the default CPU governor from ondemand to performance for each of the 8 logical cores on this system. I then re-ran the benchmark. I saw markedly better, much more *consistent* performance for master [1]. I Googled for clues, and found this:

https://communities.intel.com/community/datastack/blog/2013/08/05/how-to-maximise-cpu-performance-for-the-oracle-database-on-linux

(It happens to mention Oracle, but I think it would apply equally well to any database.) I strongly suspect this is down to kernel version. I should highlight this:

Another further CPU setting is the Energy/Performance Bias and Red Hat and Oracle users should note that the default setting has changed in the Linux kernel used between the releases of Red Hat/Oracle Linux 5 and Red Hat/Oracle Linux 6. (Some system BIOS options may include a setting to prevent the OS changing this value). In release 5 Linux did not set a value for this setting and therefore the value remained at 0 for a bias towards performance. In Red Hat 6 this behaviour has changed and the default sets a median range to move this bias more towards conserving energy (remember the same Linux kernel is present in both ultrabooks as well as servers and on my ultrabook I use powertop and the other Linux tools and configurations discussed here to maximise battery life) and reports the following in the dmesg output on boot. ... You can also use the tool to set a lower value to change the bias entirely towards performance (the default release 5 behaviour).
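The governor change described above can be made through the cpufreq sysfs interface; a minimal sketch (paths per the Linux cpufreq documentation - the exact layout can vary by kernel and driver, and writing requires root):

```shell
# Show the current governor for every logical core
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Pin every core to the performance governor
for g in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
    echo performance > "$g"
done
```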
If there is a regression in Postgres performance on more recent Linux kernels [2], perhaps this is it. I certainly don't recall hearing advice on this from the usual places. I'm surprised that turbo boost mode wasn't enabled very quickly on a workload like this. It makes a *huge* difference - at 4 clients (one per physical core), setting the CPU governor to performance increases TPS by a massive 40% compared to some earlier, comparable runs of master. These days Red Hat are even pointing out that CPU governor policy can be set via cron jobs [3].

I cannot account for why the original benchmark performed in a manner consistent with the patch having helped to such a large degree, given the large number of runs involved, and their relatively long duration for a CPU/memory bound workload. As I said, this machine is on dedicated hardware, and virtualization was not used. However, at this point I have no choice but to withdraw the patch from consideration, and not pursue this any further. Sorry for the noise.

[1] http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/turbo/
[2] http://www.postgresql.org/message-id/529f7d58.1060...@agliodbs.com
[3] https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/cpufreq_governors.html

-- Peter Geoghegan

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
On Thu, Dec 5, 2013 at 2:19 AM, Peter Geoghegan p...@heroku.com wrote: I did a relatively short variant 'insert' pgbench-tools benchmark, with a serial primary index created on the pgbench_history table. Results: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/insert/

I saw something today that made me lose confidence in the results I've presented. I was unsuccessful in re-creating similar 'select' numbers at scale 15 on the same server [1] (originally I wanted to see how eliding the fmgr trampoline worked out with binary searching only). Then, when re-testing master as of commit 8e18d04d4daf34b8a557e2dc553a7754b255cd9a (my feature branches branched off from that master commit), I noticed that even after the last pgbench run had finished, a single postgres process was stuck at 100% CPU usage for upwards of a minute (the run referred to was for 32 clients, and I only saw that one process in 'top' output). To me this points to either 1) some kind of contamination or 2) a bug in Postgres. My first guess is that it's the issue described here [2] and elsewhere (maybe whether or not that behavior constitutes a bug is controversial, but I think it does).

Consider the contrast between these two runs (against master, 2 clients) of the same, new benchmark:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/new-select/50/index.html
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/new-select/44/index.html

I had considered that something like Intel Speedstep technology had a role here, but I'm pretty sure it steps up very aggressively when things are CPU bound - I tested that against a Core 2 Duo desktop a couple of years back, where it was easy to immediately provoke it by moving around desktop windows or something. I know that Greg Smith controls for that by disabling it, but I have not done so here - I assumed that he only did so because he was being particularly cautious.
There are similar runs on my original 'results' benchmark (these particular should-be-comparable-but-are-not runs are with 1 client against my patch this time):

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/results/297/index.html
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/results/303/index.html

References:
[1] http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/new-select/
[2] http://www.postgresql.org/message-id/CAFj8pRDDa40eiP4UTrCm=+bdt0xbwf7qc8t_3y0dfqyuzk2...@mail.gmail.com

-- Peter Geoghegan
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 5:28 PM, Peter Geoghegan p...@heroku.com wrote: I'm also curious about the impact on insertion into primary key indexes. Presently, we hold an exclusive buffer lock for the duration of a couple of operations when checkUnique != UNIQUE_CHECK_NO. _bt_binsrch() is one such operation. The other one there, _bt_check_unique(), is likely to be a lot cheaper than _bt_binsrch() on average, I think, so I'm cautiously optimistic that it'll be noticeable. I better go and check it out.

I did a relatively short variant 'insert' pgbench-tools benchmark, with a serial primary index created on the pgbench_history table. Results: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/insert/

Note that in the interest of increasing the signal to noise ratio, I used unlogged tables. These are considerably shorter two minute runs, of inserts/writes, but even still it's interesting to compare it to my original 'select' benchmark. Checkpoint settings were very high.

This looks to be about a wash. I'm not that surprised, because this is all about memory bandwidth, not lock contention. Even with the lock contention on the rightmost btree page, we top out at about the same TPS level as the 'select' benchmark (at least compared to Postgres master). However, we top out at that level much sooner here (growth is much steeper), because the contention is mostly on the same one or two btree leaf pages, which are well cached by CPU L1 cache. Writes are actually quite a bit faster than uniformly-distributed reads, even with the exclusive buffer locking (and lock contention) necessitated by writing. You might even say that my patch makes the first benchmark look more like this one (a less awkward comparison could probably be made between what I've done for uniform distribution read workloads, and a read workload with far more uneven data access, which will naturally have fewer cache misses and less TLB contention).
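For anyone trying to reproduce a run of this shape, it looks roughly like the following (standard pgbench flags; 'insert.sql' is a stand-in name for the insert-only script, and the actual runs were driven by pgbench-tools rather than bare pgbench):

```shell
# Initialize with unlogged tables, to cut WAL noise
pgbench -i -s 100 --unlogged-tables bench

# Two-minute insert-only run, 4 clients
pgbench -n -c 4 -j 4 -T 120 -f insert.sql bench
```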
I strongly suspect I wouldn't have seen a significant number of cache misses when binary searching on btree pages if I were to instrument this workload. It might still be worth pursuing the SERIAL hinting mechanism, but it's not going to be nearly as big a win as what I've done for reads, I think. It might also be worth looking at a workload with uniformly distributed btree index tuple insertions, where I'd expect similar advantages to those originally shown for reads.

-- Peter Geoghegan
Re: [HACKERS] Performance optimization of btree binary search
On 2013-12-04 18:48:44 -0500, Robert Haas wrote:

 * When a type narrower than Datum is stored in a Datum, we place it in the
 * low-order bits and are careful that the DatumGetXXX macro for it discards
 * the unused high-order bits (as opposed to, say, assuming they are zero).
 * This is needed to support old-style user-defined functions, since depending
 * on architecture and compiler, the return value of a function returning char
 * or short may contain garbage when called as if it returned Datum.

And record_image_eq does a rather elaborate dance around here, calling the appropriate GET_x_BYTES macro depending on the type-width. If we can really count on the high-order bits to be zero, that's all completely unnecessary tomfoolery.

I don't think we can get rid of that dance in record_image_eq - it very well could be used on records originally generated when those bits weren't guaranteed to be zeroed. Other use cases where the appropriate DatumGetFoo() macros are used don't have a problem with that, since it's cleared again there, but in record_image_eq we can't rely on that. Or am I missing something?

Greetings,

Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Performance optimization of btree binary search
On 12/05/2013 07:30 AM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On Wed, 2013-12-04 at 20:27 -0500, Tom Lane wrote: Lazy people? I'm not in a hurry to drop it; it's not costing us much to just sit there, other than in this connection which we see how to fix. Actually, I think it probably costs a fair portion of extension authors when their initial code crashes because they forgot to declare all their functions V1. I think it might actually be more of a bother to lazy people than a benefit. Hm. We have heard one or two complaints like that, but not a huge number.

It happens to me about 75% of the time when I write a new C function. These days, I usually realize it pretty quickly. I wonder how easy it would be to make the compiler produce a warning about it. Or issue a warning in PostgreSQL when you do CREATE FUNCTION and the C function appears to be a V0 function.

I'm worried about breaking code that's been working since god-knows-when; but I will concede there's little evidence that there's very much of that out there either.

I wouldn't mind just dropping the V0 support. It's not difficult to modify your function for V1 convention, so if there's still anyone out there using V0, it wouldn't be unreasonable to require them to update.

- Heikki
Re: [HACKERS] Performance optimization of btree binary search
Andres Freund and...@2ndquadrant.com writes: On 2013-12-04 18:48:44 -0500, Robert Haas wrote: And record_image_eq does a rather elaborate dance around here, calling the appropriate GET_x_BYTES macro depending on the type-width. If we can really count on the high-order bits to be zero, that's all completely unnecessary tomfoolery. I don't think we can get rid of that dance in record_image_eq - it very well could be used on records originally generated when those bits weren't guaranteed to be zeroed.

No, you're failing to think about the context here. A Datum is not an on-disk concept, only an in-memory one. In the case of record_image_eq, simplifying the comparison of by-value Datums is unconditionally safe as long as heap_deform_tuple is consistent about using the proper GetDatum macros, which it is. (So actually, that elaborate dance is a 100% waste of effort today, regardless of any changes we're discussing here.)

The risk we take by simplifying comparisons in a more general context is that some function somewhere might've been sloppy about doing a native-type-to-Datum conversion on its result. In the case of V0 functions that risk is unavoidable except by adding some explicit cleanup code, but I'm a bit worried that somebody, particularly third-party code, might've sloppily written "return foo" in a V1 function when "return Int32GetDatum(foo)" would be correct. In that case, the resultant Datum might have not-per-spec high-order bits, and if it reaches the fast comparator without ever having been squeezed into a physical tuple, we've got a problem. I would certainly regard this as a bug in that function, but nonetheless it's a hazard that we need to set against any performance improvement that we can buy this way.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
On 2013-12-05 08:58:55 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I don't think we can get rid of that dance in record_image_eq - it very well could be used on records originally generated when those bits weren't guaranteed to be zeroed. No, you're failing to think about the context here.

Ah yes. I was completely forgetting that heap_form_tuple()/heap_fill_tuple() will properly take care to only use the meaningful parts of the (to-be-)stored data, not random padding. Thanks.

The risk we take by simplifying comparisons in a more general context is that some function somewhere might've been sloppy about doing a native-type-to-Datum conversion on its result. In the case of V0 functions that risk is unavoidable except by adding some explicit cleanup code, but I'm a bit worried that somebody, particularly third-party code, might've sloppily written "return foo" in a V1 function when "return Int32GetDatum(foo)" would be correct. In that case, the resultant Datum might have not-per-spec high-order bits, and if it reaches the fast comparator without ever having been squeezed into a physical tuple, we've got a problem.

Too bad V1 hasn't insisted on using the PG_RETURN_* macros. That would have allowed asserts checking against such cases, by setting fcinfo->has_returned = true or such...

Greetings,

Andres Freund

-- Andres Freund
Re: [HACKERS] Performance optimization of btree binary search
Andres Freund and...@2ndquadrant.com writes: On 2013-12-05 08:58:55 -0500, Tom Lane wrote: I'm a bit worried that somebody, particularly third-party code, might've sloppily written "return foo" in a V1 function when "return Int32GetDatum(foo)" would be correct. In that case, the resultant Datum might have not-per-spec high-order bits, and if it reaches the fast comparator without ever having been squeezed into a physical tuple, we've got a problem. Too bad V1 hasn't insisted on using the PG_RETURN_* macros. That would have allowed asserts checking against such cases, by setting fcinfo->has_returned = true or such...

[ shrug... ] PG_RETURN_DATUM has no practical way to verify that the given Datum was constructed safely, so I think we'd just be adding overhead with not much real safety gain. In practice, if we were to change Datum to be a signed type (intptr_t not uintptr_t), the most common cases would probably do the right thing anyway, ie an int or short return value would get promoted to Datum with sign-extension.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
On 2013-12-05 10:02:56 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2013-12-05 08:58:55 -0500, Tom Lane wrote: I'm a bit worried that somebody, particularly third-party code, might've sloppily written "return foo" in a V1 function when "return Int32GetDatum(foo)" would be correct. In that case, the resultant Datum might have not-per-spec high-order bits, and if it reaches the fast comparator without ever having been squeezed into a physical tuple, we've got a problem. Too bad V1 hasn't insisted on using the PG_RETURN_* macros. That would have allowed asserts checking against such cases, by setting fcinfo->has_returned = true or such... [ shrug... ] PG_RETURN_DATUM has no practical way to verify that the given Datum was constructed safely, so I think we'd just be adding overhead with not much real safety gain.

I was thinking of doing it for assert builds only anyway.

In practice, if we were to change Datum to be a signed type (intptr_t not uintptr_t), the most common cases would probably do the right thing anyway, ie an int or short return value would get promoted to Datum with sign-extension.

I was actually thinking about making Datum (and some other types we have) structs or unions. Currently it's far, far too easy to mix them up. We throw away pretty much all of the little typesafety C has by typedef'ing them to integral types with lots of autocasting behaviour. Making Datum a union with all the base types inside would also get rid of us violating the standard's aliasing rules...

Greetings,

Andres Freund

-- Andres Freund
Re: [HACKERS] Performance optimization of btree binary search
Andres Freund and...@2ndquadrant.com writes: I was actually thinking about making Datum (and some other types we have) structs or unions. Currently it's far, far too easy to mix them up. We throw away pretty much all of the little typesafety C has by typedef'ing them to integral types with lots of autocasting behaviour.

That's intentional; on many ABIs, making Datum a struct would be catastrophic performance-wise, because it would not be eligible for simple register pass or return conventions. In any case, the number of bugs I can remember that such a thing would've prevented is negligible.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
On 2013-12-05 10:34:16 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I was actually thinking about making Datum (and some other types we have) structs or unions. Currently it's far, far too easy to mix them up. We throw away pretty much all of the little typesafety C has by typedef'ing them to integral types with lots of autocasting behaviour. That's intentional; on many ABIs, making Datum a struct would be catastrophic performance-wise because it would not be eligible for simple register pass or return conventions.

Unions should behave more sanely in that regard, though? And it'd be fairly easy to make it an optional thing.

In any case, the number of bugs I can remember that such a thing would've prevented is negligible.

Cases talked about upthread, where a plain datatype is returned as a Datum instead of using FooGetDatum() and the reverse, would be impossible. I don't think those are that infrequent?

Greetings,

Andres Freund

-- Andres Freund
Re: [HACKERS] Performance optimization of btree binary search
Andres Freund and...@2ndquadrant.com writes: On 2013-12-05 10:34:16 -0500, Tom Lane wrote: In any case, the number of bugs I can remember that such a thing would've prevented is negligible. Cases talked about upthread, where a plain datatype is returned as a Datum instead of using FooGetDatum() and the reverse, would be impossible. I don't think those are that infrequent?

[ shrug... ] The performance changes we're talking about here would have the effect of making the compiler's implicit casts be the right thing anyway. In any case, I don't think you'd have accomplished much by forcing people to use FooGetDatum, unless you can force them to use the *right* FooGetDatum. The errors I can remember seeing in this area were more in the line of choosing the wrong macro.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
On Thu, 2013-12-05 at 15:29 +0200, Heikki Linnakangas wrote: It happens to me about 75% of the time when I write a new C function. These days, I usually realize it pretty quickly. I wonder how easy it would be to make the compiler produce a warning about it. Or issue a warning in PostgreSQL when you do CREATE FUNCTION and the C function appears to be a V0 function.

Here is a related idea I've had: Integrate the function declaration into the PG_FUNCTION_INFO_V1 macro, like this:

#define PG_FUNCTION_INFO_V1(funcname) \
    Datum funcname(PG_FUNCTION_ARGS); \
    ...

Because if you don't declare the function, you will get a compiler warning with -Wmissing-prototypes. Field experience shows that many extension authors neglect to explicitly declare their entry-point functions, presumably because they don't understand the cause of the warning, or they don't look at the compiler output, so many extensions unfortunately don't compile without warnings at the moment. By putting the function declaration into the PG_FUNCTION_INFO_V1 macro, everyone wins:

- fewer compiler warnings due to laziness
- no code bloat because of redundant function declarations
- accidental V0 functions are warned about
- prototypes of V1 function implementations are implicitly checked for correctness
[HACKERS] Performance optimization of btree binary search
Having nothing better to do over the holiday weekend, I decided to pursue a number of ideas for improving performance that I thought about a long time ago. These include:

* Pre-fetching list node pointers. This looks to be moderately promising, but I'm certainly not going to be the one to land it, given present obligations. Stephen Frost may wish to pick it up, given his previous interest in the matter. This is slightly controversial, because it uses a GCC intrinsic (__builtin_prefetch), but also because the Linux kernel removed this optimization from their generic list data structure [1]. However, that list was what we'd call an embedded list, so we probably shouldn't be totally deterred. The amount of effort that I put into this was, frankly, pretty low. A motivated person, willing to do the appropriate analysis, could probably bring it further. For one thing, every single foreach() has a call to this intrinsic, even where the list doesn't store pointers (which is not undefined behavior, just useless). At the very least that's going to bloat things up, frequently for no conceivable gain, and yet with the patch applied we're still able to see quite tangible benefits, even if it isn't exactly a stellar improvement. I have an idea that prefetching the last element at the start of the loop could be much better than what I did, because we know that those lists are mostly pretty small in practice, and that's likely to help pipelining - prefetching too late or even too early makes the optimization useless, because you may still get a cache miss.

* Optimizing index scans - I noticed that binary searching accounted for many cache misses during a pgbench select.sql benchmark, instrumented with "perf record -e cache-misses". This warranted further investigation.

I won't say anything further about the former optimization, except to note that it's included for comparative purposes in the set of benchmarks I've run (I haven't included a patch).
The benchmark results are here: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/results

I took two approaches to the latter. This was the more interesting piece of work. Test sets include:

* Master baseline (green)
* List optimization (as mentioned above, not really relevant to the main topic of this mail) (red)
* fib btree, earlier patch, please disregard (blue)
* Fixed fib patch, Fibonacci search, no specialization (purple)
* The interesting one, Fibonacci search + specialization - "fib + no jump" (turquoise, see below for details)

Initially, I had a little bit of success with Fibonacci search [2] in place of binary search, in the hope that it would better take advantage of CPU cache characteristics - Fibonacci search is said to have advantages where non-uniform memory access is an issue - it minimizes the final interval. I wasn't all that optimistic that it would work that well given the smallish size of BLCKSZ relative to modern CPU L1 cache sizes [3], but it did make an appreciable dent on its own.

I suppose old habits die hard, because next I hacked up _bt_compare and had it do an int4btcmp directly, in the event of encountering a scankey that had as its comparator the relevant pg_proc oid. This is very much in the style (and the spirit) of the grotty early draft patches for the inlining-comparators-for-sorting patch. Patch is attached. This is a draft, a POC, posted only to facilitate discussion and to allow my results to be independently duplicated/verified. Note that there is a bug (attributable to the new search code) that causes the regression tests to fail in exactly one place (i.e. one line of difference). I didn't think it was worth deferring discussion to deal with that, though, since I don't think it undermines anything. I'm not sure how valuable the comparator trick is if we stick with binary search - I didn't test that. I'm sure it's a question that must be considered, though.
I have a fairly huge amount of data here, having run plenty of benchmarks over several nights. The short version is that the 'select' benchmark has just over 18% greater throughput on this machine at some client counts (in particular, when there are 4 clients - there are 4 cores, but 8 logical cores) with the attached patch. There is a 3.5% regression with one client, which is certainly not accounted for by noise. Note, however, that latency appears consistently better with the patch applied.

This is a machine running on dedicated hardware: a 4-core i7-4770. The working set easily fits in its 32GiB of DDR3 RAM at all pgbench scales tested (1, 10 and 100). The kernel used is 3.8.0-31-generic #46~precise1-Ubuntu SMP. Postgres settings are typical for this kind of thing (6GiB shared_buffers), but you can refer to my pgbench-tools results for full details (drill down to an individual pgbench run for that - they're all the same).

I'm kind of curious as to what this benchmark would look like on a server with many more cores. I guess I could write a proper patch to have code setting up a scankey also set a flag indicating that it is acceptable to assume that the special built-in comparator will do fine.
Re: [HACKERS] Performance optimization of btree binary search
Peter Geoghegan p...@heroku.com writes: I guess I could write a proper patch to have code setting up a scankey also set a flag that indicated that it was acceptable to assume that the special built-in comparator would do fine. ... I'd be happy with a scheme with only one built-in comparator, and allowed a few types to be cataloged such that it was indicated that just using the built-in comparator was acceptable, knowledge that could be passed to _bt_compare via the scankey. I'm thinking of just int4, and maybe date and a few other such int4 covariant types.

If what you're proposing is that we have a fast path that compares Datums as Datums, I should think that that would work fine for int2 as well, *and* for int8 on machines where int8 is pass-by-value. (Does anyone still care much about PG's performance on 32-bit hardware?) We might have to fool a bit with the fooGetDatum macros in some cases, eg I think Int16GetDatum isn't careful about sign extension. Admittedly, that might introduce an offsetting cost on some hardware, but I think on most machines sign-extension isn't noticeably more expensive than zero-extension.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 4:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Geoghegan p...@heroku.com writes: I guess I could write a proper patch to have code setting up a scankey also set a flag that indicated that it was acceptable to assume that the special built-in comparator would do fine. ... I'd be happy with a scheme with only one built-in comparator, and allowed a few types to be cataloged such that it was indicated that just using the built-in comparator was acceptable, knowledge that could be passed to _bt_compare via the scankey. I'm thinking of just int4, and maybe date and a few other such int4 covariant types. If what you're proposing is that we have a fast path that compares Datums as Datums, I should think that that would work fine for int2 as well, *and* for int8 on machines where int8 is pass-by-value. (Does anyone still care much about PG's performance on 32-bit hardware?) We might have to fool a bit with the fooGetDatum macros in some cases, eg I think Int16GetDatum isn't careful about sign extension. Admittedly, that might introduce an offsetting cost on some hardware, but I think on most machines sign-extension isn't noticeably more expensive than zero-extension.

Yeah, I think if we can make something like this work, it would be neat-o. Getting this working for int4 would be a good win, as Peter says, but getting it working for both int4 and int8 with the same code would be a significantly better one.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 1:59 PM, Robert Haas robertmh...@gmail.com wrote: Yeah, I think if we can make something like this work, it would be neat-o. Getting this working for int4 would be a good win, as Peter says, but getting it working for both int4 and int8 with the same code would be a significantly better one.

No arguments here. I think I didn't initially suggest it myself out of passing concern about the guarantees around how unused Datum bits are initialized in all relevant contexts, but having looked at it for a second I see that we are of course disciplined there.

-- Peter Geoghegan
Re: [HACKERS] Performance optimization of btree binary search
Peter Geoghegan p...@heroku.com writes:
> On Wed, Dec 4, 2013 at 1:59 PM, Robert Haas robertmh...@gmail.com wrote:
>> Yeah, I think if we can make something like this work, it would be
>> neat-o. Getting this working for int4 would be a good win, as Peter
>> says, but getting it working for both int4 and int8 with the same code
>> would be a significantly better one.
>
> No arguments here. I think I didn't initially suggest it myself out of
> passing concern about the guarantees around how unused Datum bits are
> initialized in all relevant contexts, but having looked at it for a
> second I see that we are of course disciplined there.

Hm ... actually, the comment at lines 335ff of postgres.h points out that a Datum returned from a version 0 user-defined function might contain garbage in the high-order bits. We could fix that, probably, with some cleanup code added to fmgr_oldstyle. It'd waste a few cycles ... but if there's anybody out there who still cares about the performance of such functions, it's high time they fixed them to be V1 anyway.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 6:33 PM, Peter Geoghegan p...@heroku.com wrote:
> On Wed, Dec 4, 2013 at 1:59 PM, Robert Haas robertmh...@gmail.com wrote:
>> Yeah, I think if we can make something like this work, it would be
>> neat-o. [...]
>
> No arguments here. I think I didn't initially suggest it myself out of
> passing concern about the guarantees around how unused Datum bits are
> initialized in all relevant contexts, but having looked at it for a
> second I see that we are of course disciplined there.

Hmm. And yet, there's this:

 * When a type narrower than Datum is stored in a Datum, we place it in the
 * low-order bits and are careful that the DatumGetXXX macro for it discards
 * the unused high-order bits (as opposed to, say, assuming they are zero).
 * This is needed to support old-style user-defined functions, since depending
 * on architecture and compiler, the return value of a function returning char
 * or short may contain garbage when called as if it returned Datum.

And record_image_eq does a rather elaborate dance around here, calling the appropriate GET_x_BYTES macro depending on the type's width. If we can really count on the high-order bits to be zero, that's all completely unnecessary tomfoolery.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
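[Editor's note: the "dance" Robert refers to amounts to switching on the attribute's byte width and masking before comparing, so that any garbage above the value's width cannot cause a spurious mismatch. A simplified sketch, assuming a 64-bit Datum; the GET_x_BYTES names match the postgres.h macros, but the surrounding function is illustrative rather than the real record_image_eq:]

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Datum;

/* Masking accessors, as in postgres.h: keep only the low-order bytes
 * rather than trusting the high-order bits to be zero. */
#define GET_1_BYTE(d)   ((d) & 0xFF)
#define GET_2_BYTES(d)  ((d) & 0xFFFF)
#define GET_4_BYTES(d)  ((d) & 0xFFFFFFFF)

/* Compare two by-value Datums of the given byte width, ignoring any
 * high-order garbage. If the high bits were guaranteed zero, the
 * default full-width compare would suffice for every case. */
static bool
byval_image_eq(Datum a, Datum b, int typlen)
{
    switch (typlen)
    {
        case 1:  return GET_1_BYTE(a)  == GET_1_BYTE(b);
        case 2:  return GET_2_BYTES(a) == GET_2_BYTES(b);
        case 4:  return GET_4_BYTES(a) == GET_4_BYTES(b);
        default: return a == b;     /* full-width compare */
    }
}
```

The point of the thread: if fmgr_oldstyle were fixed (or dropped) so that narrow values always arrive zero- or sign-extended, the whole switch collapses to `a == b`.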
Re: [HACKERS] Performance optimization of btree binary search
Robert Haas robertmh...@gmail.com writes:
> Hmm. And yet, there's this:
>
>  * When a type narrower than Datum is stored in a Datum, we place it in the
>  * low-order bits and are careful that the DatumGetXXX macro for it discards
>  * the unused high-order bits (as opposed to, say, assuming they are zero).
>  * This is needed to support old-style user-defined functions, since depending
>  * on architecture and compiler, the return value of a function returning char
>  * or short may contain garbage when called as if it returned Datum.
>
> And record_image_eq does a rather elaborate dance around here, calling
> the appropriate GET_x_BYTES macro depending on the type-width. If we can
> really count on the high-order bits to be zero, that's all completely
> unnecessary tomfoolery.

Yeah, that's another thing we could simplify if we fixed this problem at the source. I think these decisions date from a time when we still cared about the speed of fmgr_oldstyle.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
I wrote:
> Yeah, that's another thing we could simplify if we fixed this problem at
> the source. I think these decisions date from a time when we still cared
> about the speed of fmgr_oldstyle.

BTW, the text you're quoting is from 2007, but it's just documenting behavior that's mostly a lot older. It's worth reading commit 23a41573 in toto in this connection. I'm not sure whether we'd want to revert that DatumGetBool change if we were to clean up fmgr_oldstyle. We'd be able to do whatever was cheapest, but I'm not sure what that is.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 6:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> And record_image_eq does a rather elaborate dance around here, calling
>> the appropriate GET_x_BYTES macro depending on the type-width. If we
>> can really count on the high-order bits to be zero, that's all
>> completely unnecessary tomfoolery.
>
> Yeah, that's another thing we could simplify if we fixed this problem at
> the source. I think these decisions date from a time when we still cared
> about the speed of fmgr_oldstyle.

Sure, let's whack that thing with a crowbar.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Performance optimization of btree binary search
On Wed, 2013-12-04 at 19:45 -0500, Robert Haas wrote:
> On Wed, Dec 4, 2013 at 6:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Yeah, that's another thing we could simplify if we fixed this problem
>> at the source. I think these decisions date from a time when we still
>> cared about the speed of fmgr_oldstyle.
>
> Sure, let's whack that thing with a crowbar.

Or just remove it. Who still needs it?
Re: [HACKERS] Performance optimization of btree binary search
Peter Eisentraut pete...@gmx.net writes:
> On Wed, 2013-12-04 at 19:45 -0500, Robert Haas wrote:
>> Sure, let's whack that thing with a crowbar.
>
> Or just remove it. Who still needs it?

Lazy people? I'm not in a hurry to drop it; it's not costing us much to just sit there, other than in this connection, which we see how to fix.

regards, tom lane
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 12:58 PM, Peter Geoghegan p...@heroku.com wrote:
> I'm kind of curious as to what this benchmark would look like on a
> server with many more cores.

I'm also curious about the impact on insertion into primary key indexes. Presently, we hold an exclusive buffer lock for the duration of a couple of operations when checkUnique != UNIQUE_CHECK_NO. _bt_binsrch() is one such operation. The other one there, _bt_check_unique(), is likely to be a lot cheaper than _bt_binsrch() on average, I think, so I'm cautiously optimistic that it'll be noticeable. I'd better go and check it out.

-- 
Peter Geoghegan
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 5:28 PM, Peter Geoghegan p...@heroku.com wrote:
> I'm also curious about the impact on insertion into primary key indexes.
> [...] I better go and check it out.

Depending on how well this goes, I might also teach _bt_doinsert() to hint to _bt_binsrch() (or, as I'm calling it, _bt_page_search()) that it should look to the end of the page when searching, using a similar mechanism to the mechanism for hinting that the main Datum-compare optimization is applicable (this strategy would be abandoned if it didn't work immediately - as soon as the last item on the page turned out to be greater than or equal to the scankey value). This is something that I think would help with SERIAL columns, where it's possible in principle to pass that kind of insight around - if you can live with making SERIAL more than mere syntactic sugar.

-- 
Peter Geoghegan
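[Editor's note: the hint Peter sketches - check the high end of the page before doing a full binary search, and abandon the strategy as soon as the last item compares greater than or equal to the scankey - is easy to illustrate outside the btree code. A toy sketch over a sorted int array standing in for the items on a page; the function name and shape are hypothetical, not the actual _bt_page_search():]

```c
#include <stddef.h>

/* Return the insertion offset for key in sorted items[0..n-1].
 * Rightmost hint: under append-heavy load (e.g. a SERIAL column), the new
 * key usually sorts after everything already on the page, so test the
 * last item first and fall back to an ordinary binary search only when
 * the hint fails (i.e. the last item is >= key would not hold). */
static size_t
search_with_rightmost_hint(const int *items, size_t n, int key)
{
    size_t lo = 0, hi = n;

    if (n == 0 || key >= items[n - 1])
        return n;               /* hint applies: insert at the end */

    while (lo < hi)             /* ordinary binary search */
    {
        size_t mid = lo + (hi - lo) / 2;

        if (items[mid] <= key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}
```

For monotonically increasing keys the hint turns an O(log n) search under an exclusive buffer lock into a single comparison, which is why it would pay off specifically for SERIAL-style insertion patterns.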
Re: [HACKERS] Performance optimization of btree binary search
On Wed, 2013-12-04 at 20:27 -0500, Tom Lane wrote:
> Lazy people? I'm not in a hurry to drop it; it's not costing us much to
> just sit there, other than in this connection which we see how to fix.

Actually, I think it probably costs a fair number of extension authors, whose initial code crashes because they forgot to declare all their functions V1. I think it might actually be more of a bother to lazy people than a benefit.
Re: [HACKERS] Performance optimization of btree binary search
Peter Eisentraut pete...@gmx.net writes:
> On Wed, 2013-12-04 at 20:27 -0500, Tom Lane wrote:
>> Lazy people? I'm not in a hurry to drop it; it's not costing us much to
>> just sit there, other than in this connection which we see how to fix.
>
> Actually, I think it probably costs a fair portion of extension authors
> when their initial code crashes because they forgot to declare all their
> functions V1. I think it might actually be more of a bother to lazy
> people than a benefit.

Hm. We have heard one or two complaints like that, but not a huge number. I'm worried about breaking code that's been working since god-knows-when; but I will concede there's little evidence that there's very much of that out there either.

regards, tom lane