Re: [HACKERS] Patch for fast gin cache performance improvement
I think it is desirable that this patch should be resubmitted for the next CommitFest for further review and testing mentioned above. So I'd like to mark this patch as Returned with Feedback. Is it OK? Sounds like a good idea. Thanks for the review! Ian Link Etsuro Fujita Thursday, October 10, 2013 1:01 AM Ian Link wrote: Although I asked this question, I've reconsidered about these parameters, and it seems that these parameters not only make code rather complex but are a little confusing to users. So I'd like to propose to introduce only one parameter: fast_cache_size. While users that give weight to update performance for the fast update technique set this parameter to a large value, users that give weight not only to update performance but to search performance set this parameter to a small value. What do you think about this? I think it makes sense to maintain this separation. If the user doesn't need a per-index setting, they don't have to use the parameter. Since the parameter is off by default, they don't even need to worry about it. There might as well be one parameter for users that don't need fine-grained control. We can document this and I don't think it will be confusing to the user. OK, though I'd like to hear the opinion of others. 4. In my understanding, the small value of gin_fast_limit/fast_cache_size leads to the increase in GIN search performance, which, however, leads to the decrease in GIN update performance. Am I right? If so, I think the tradeoff should be noted in the documentation. I believe this is correct. 5. The following documents in Chapter 57. GIN Indexes need to be updated: * 57.3.1. GIN Fast Update Technique * 57.4. GIN Tips and Tricks Sure, I can add something. 6. I would like to see the results for the additional test cases (tsvectors). I don't really have any good test cases for this available, and have very limited time for postgres at the moment. Feel free to create a test case, but I don't believe I can at the moment. Sorry! Unfortunately, I don't have much time to do such a thing, though I think we should do that. (In addition, we should do another performance test to make clear an influence of fast update performance from introducing these parameters, which would be required to determine the default values of these parameters.) 7. The commented-out elog() code should be removed. Sorry about that, I shouldn't have submitted the patch with those still there. I should have a new patch soonish, hopefully. Thanks for your feedback! I think it is desirable that this patch should be resubmitted for the next CommitFest for further review and testing mentioned above. So I'd like to mark this patch as Returned with Feedback. Is it OK? Thanks, Best regards, Etsuro Fujita Ian Link Monday, September 30, 2013 3:09 PM Hi Etsuro, Sorry for the delay but I have been very busy with work. I have been away from postgres for a while, so I will need a little time to review the code and make sure I give you an informed response. I'll get back to you as soon as I am able. Thanks for understanding. Ian Link Etsuro Fujita Friday, September 27, 2013 2:24 AM I wrote: I had a look over this patch. I think this patch is interesting and very useful. Here are my review points: 8. I think there are no issues in this patch. However, I have one question: how this patch works in the case where gin_fast_limit/fast_cache_size = 0? In this case, in my understanding, this patch inserts new entries into the pending list temporarily and immediately moves them to the main GIN data structure using ginInsertCleanup(). Am I right? If so, that is obviously inefficient. Sorry, There are incorrect expressions. I mean gin_fast_limit 0 and fast_cache_size = 0. Although I asked this question, I've reconsidered about these parameters, and it seems that these parameters not only make code rather complex but are a little confusing to users. So I'd like to propose to introduce only one parameter: fast_cache_size. While users that give weight to update performance for the fast update technique set this parameter to a large value, users that give weight not only to update performance but to search performance set this parameter to a small value. What do you think about this? Thanks, Best regards, Etsuro Fujita Etsuro Fujita Thursday, September 26, 2013 6:02 AM Hi Ian, This patch contains a performance improvement for the fast gin cache. As you may know, the performance of the fast gin cache decreases with its size. Currently, the size of the fast gin cache is tied to work_mem. The size of work_mem can often be quite high. The large size of work_mem is inappropriate for the fast gin cache size. Therefore, we created a separate cache size called gin_fast_limit. This global variable controls the size of the fast gin cache, independently of work_mem. Currently, the default
Re: [HACKERS] Patch for fast gin cache performance improvement
Although I asked this question, I've reconsidered about these parameters, and it seems that these parameters not only make code rather complex but are a little confusing to users. So I'd like to propose to introduce only one parameter: fast_cache_size. While users that give weight to update performance for the fast update technique set this parameter to a large value, users that give weight not only to update performance but to search performance set this parameter to a small value. What do you think about this? I think it makes sense to maintain this separation. If the user doesn't need a per-index setting, they don't have to use the parameter. Since the parameter is off by default, they don't even need to worry about it. There might as well be one parameter for users that don't need fine-grained control. We can document this and I don't think it will be confusing to the user. 4. In my understanding, the small value of gin_fast_limit/fast_cache_size leads to the increase in GIN search performance, which, however, leads to the decrease in GIN update performance. Am I right? If so, I think the tradeoff should be noted in the documentation. I believe this is correct. 5. The following documents in Chapter 57. GIN Indexes need to be updated: * 57.3.1. GIN Fast Update Technique * 57.4. GIN Tips and Tricks Sure, I can add something. 6. I would like to see the results for the additional test cases (tsvectors). I don't really have any good test cases for this available, and have very limited time for postgres at the moment. Feel free to create a test case, but I don't believe I can at the moment. Sorry! 7. The commented-out elog() code should be removed. Sorry about that, I shouldn't have submitted the patch with those still there. I should have a new patch soonish, hopefully. Thanks for your feedback! Ian Ian Link wrote: 8. I think there are no issues in this patch. However, I have one question: how this patch works in the case where gin_fast_limit/fast_cache_size = 0? In this case, in my understanding, this patch inserts new entries into the pending list temporarily and immediately moves them to the main GIN data structure using ginInsertCleanup(). Am I right? If so, that is obviously inefficient. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for fast gin cache performance improvement
Hi Etsuro, Sorry for the delay but I have been very busy with work. I have been away from postgres for a while, so I will need a little time to review the code and make sure I give you an informed response. I'll get back to you as soon as I am able. Thanks for understanding. Ian Link Etsuro Fujita Friday, September 27, 2013 2:24 AM I wrote: I had a look over this patch. I think this patch is interesting and very useful. Here are my review points: 8. I think there are no issues in this patch. However, I have one question: how this patch works in the case where gin_fast_limit/fast_cache_size = 0? In this case, in my understanding, this patch inserts new entries into the pending list temporarily and immediately moves them to the main GIN data structure using ginInsertCleanup(). Am I right? If so, that is obviously inefficient. Sorry, There are incorrect expressions. I mean gin_fast_limit 0 and fast_cache_size = 0. Although I asked this question, I've reconsidered about these parameters, and it seems that these parameters not only make code rather complex but are a little confusing to users. So I'd like to propose to introduce only one parameter: fast_cache_size. While users that give weight to update performance for the fast update technique set this parameter to a large value, users that give weight not only to update performance but to search performance set this parameter to a small value. What do you think about this? Thanks, Best regards, Etsuro Fujita Etsuro Fujita Thursday, September 26, 2013 6:02 AM Hi Ian, This patch contains a performance improvement for the fast gin cache. As you may know, the performance of the fast gin cache decreases with its size. Currently, the size of the fast gin cache is tied to work_mem. The size of work_mem can often be quite high. The large size of work_mem is inappropriate for the fast gin cache size. Therefore, we created a separate cache size called gin_fast_limit. This global variable controls the size of the fast gin cache, independently of work_mem. Currently, the default gin_fast_limit is set to 128kB. However, that value could need tweaking. 64kB may work better, but it's hard to say with only my single machine to test on. On my machine, this patch results in a nice speed up. Our test queries improve from about 0.9 ms to 0.030 ms. Please feel free to use the test case yourself: it should be attached. I can look into additional test cases (tsvectors) if anyone is interested. In addition to the global limit, we have provided a per-index limit: fast_cache_size. This per-index limit begins at -1, which means that it is disabled. If the user does not specify a per-index limit, the index will simply use the global limit. I had a look over this patch. I think this patch is interesting and very useful. Here are my review points: 1. Patch applies cleanly. 2. make, make install and make check is good. 3. I did performance evaluation using your test queries with 64kB and 128kB of gin_fast_limit (or fast_cache_size), and saw that both values achieved the performance gains over gin_fast_limit = '256MB'. 64kB worked better than 128kB. 64kB improved from 1.057 ms to 0.075 ms. Great! 4. In my understanding, the small value of gin_fast_limit/fast_cache_size leads to the increase in GIN search performance, which, however, leads to the decrease in GIN update performance. Am I right? If so, I think the tradeoff should be noted in the documentation. 5. The following documents in Chapter 57. GIN Indexes need to be updated: * 57.3.1. GIN Fast Update Technique * 57.4. GIN Tips and Tricks 6. I would like to see the results for the additional test cases (tsvectors). 7. The commented-out elog() code should be removed. 8. I think there are no issues in this patch. However, I have one question: how this patch works in the case where gin_fast_limit/fast_cache_size = 0? In this case, in my understanding, this patch inserts new entries into the pending list temporarily and immediately moves them to the main GIN data structure using ginInsertCleanup(). Am I right? If so, that is obviously inefficient. Sorry for the delay. Best regards, Etsuro Fujita Ian Link Monday, June 17, 2013 9:42 PM This patch contains a performance improvement for the fast gin cache. As you may know, the performance of the fast gin cache decreases with its size. Currently, the size of the fast gin cache is tied to work_mem. The size of work_mem can often be quite high. The large size of work_mem is inappropriate for the fast gin cache size. Therefore, we created a separate cache size called gin_fast_limit. This global variable controls the size of the fast gin cache, independently of work_mem. Currently, the default gin_fast_limit is set to 128kB. However, that value could need tweaking. 64kB may work better, but it's hard to say with only my single machine to test on.On my machine
Re: [HACKERS] Review: query result history in psql
Maciej - I can see your resistance to some kind of interactive mode. It would definitely be more code and create a less simple user interface. I would be perfectly happy if we left that part as it is now. However, I think it would be important to have a way of displaying the query history. Yes, you can search through your console backwards. You can still search through your old queries, but this would be a good alternative. What if the user reconnects after working for a while? Their old bash history might be gone. This would leave them with a big query history and no point of reference. Personally, I would find this feature very worthwhile. The query history wouldn't be crippled without it, but it would be a lot less flexible. Pavel Stehule Monday, July 01, 2013 4:05 AM a idea is good, but I don't think, it can be useful with currentimplementation. How I can identify, what is correct answer for myquery? Have I remember twenty numbers and twenty queries?RegardsPavel Maciej Gajewski Monday, July 01, 2013 4:01 AM When I tested this feature, I had 30 caches per 5 minutes, and only a few from these queries had a sense. Switch between off and on is not user friendly. I believe so there can be other solution than mine, but a possibility to friendly clean unwanted caches is necessary. If you know that you'll need the result of a query beforehand, you can always use SELECT ... INTO ... . No client-side features required. This feature is intended for people running plenty of ad-hoc queries, when every result could potentially be useful. Pavel Stehule Monday, July 01, 2013 1:31 AM 2013/7/1 Maciej Gajewski maciej.gajews...@gmail.com: I'm not really bought into some of the ideas. but maybe some interactive mode should be usefull - so after execution, and showing result, will be prompt if result should be saved or not. I like the idea, in addition to the ordinary mode. Personally, I would use the ordinary mode, but I can see how 'interactive' would be useful. This would require a complex change to the client code. And the result would eventually become annoying: an interactive question after each and every query. Currently, when turned on, every result is stored and simple notification is printed. . When I tested this feature, I had 30 caches per 5 minutes, and only a few from these queries had a sense. Switch between off and on is not user friendly. I believe so there can be other solution than mine, but a possibility to friendly clean unwanted caches is necessary. yes, the names :ans01, :ans02, ... miss semantics - How I can join this name (and content) with some SQL query? That makes sense. I think having part of / the whole query string would be very helpful. Great suggestion! The naming is obscure and non-informative, I agree. If you have a nice idea how to make it better, I'd love to discuss it. But please remember that it has one huge advantage: simplicity. The client is a classical command-line tool, and as such it delegates some of the functionality to external programs, like pager or terminal. Personally, I don't see a strong price for all users without friendly interface. Regards Pavel I'm pretty sure that your terminal emulator has a 'find' function that would allow you to quickly locate the variable and associated query in the scrollback. M Maciej Gajewski Monday, July 01, 2013 1:23 AM I'm not really bought into some of the ideas. but maybe some interactive mode should be usefull - so after execution, and showing result, will be prompt if result should besaved or not. I like the idea, in addition to the ordinary mode. Personally, I would use the ordinary mode, but I can see how 'interactive' would be useful. This would require a complex change to the client code. And the result would eventually become annoying: an interactive question after each and every query. Currently, when turned on, every result is stored and simple notification is printed. yes, the names :ans01, :ans02, ... miss semantics - How I can join this name (and content) with some SQL query?That makes sense. I think having part of / the whole query string would be very helpful. Great suggestion! The naming is obscure and non-informative, I agree. If you have a nice idea how to make it better, I'd love to discuss it. But please remember that it has one huge advantage: simplicity. The client is a classical command-line tool, and as such it delegates some of the functionality to external programs, like pager or terminal. I'm pretty sure that your terminal emulator has a 'find' function that would allow you to quickly locate the variable and associated query in the scrollback. M ian link Monday, July 01, 2013 12:19 AM but maybe some interactive mode should be usefull - so after execution, and showing result, will be prompt if result should besaved or not. I
Re: [HACKERS] Review: query result history in psql
I haven't been able to find any real documentation on this feature, other than the additions to the psql help. You're right, I missed that in my review. I agree that it needs some more documentation. What is ans? We covered that earlier in the email thread, but it's the current name for the query history. I think we are going to change it to something a little more descriptive. I was thinking qh for short or query-history. I'm not sure if I'll be able to implement this feature any time soon, as I'm very busy at the moment and going for a business trip in few days. I can try implementing the feature, I might have some time. On Tue, Jul 2, 2013 at 5:36 AM, Peter Eisentraut pete...@gmx.net wrote: On 7/2/13 5:53 AM, Maciej Gajewski wrote: In the meantime, I've applied your suggestions and moved the sting-manipulating functions to stringutils. Also fixed a tiny bug. Patch attached. I haven't been able to find any real documentation on this feature, other than the additions to the psql help. Could someone write some mock documentation at least, so we know what the proposed interfaces and intended ways of use are? What is ans?
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
I'm fine with moving the operators over to functions. I just don't want to implement anything that is against best practice. If we are OK with that direction, I'll go ahead and start on the new patch. Ian On Mon, Jul 1, 2013 at 9:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: Craig Ringer cr...@2ndquadrant.com writes: On 07/02/2013 02:39 AM, Robert Haas wrote: I'm actually not clear that it would be all that bad to assume fixed operator names, as we apparently do in a few places despite the existence of operator classes. But if that is bad, then I don't know how using @+ and @- instead helps anything. Personally I'm not clear why it's bad to reserve certain fundamental operators like '+' and '-', requiring that they have particular semantics. It is bad. It's against project policy, not least because we have assorted *existing* datatypes for which obvious operator names like = do not have all the properties you might expect. If you need a more concrete example of why that sort of thinking is bad, you might consider the difference between and ~~ for type text. If we hard-wired knowledge about operator behavior to operator names, it would be impossible for the system to understand that both of those operators represent sorting-related behaviors. Or to be even more concrete: if we allow RANGE to suppose that there's only one possible definition of + for a datatype, we're effectively supposing that there's only one possible sort ordering for that type. Which is already a wrong assumption, and has been since Postgres was still at Berkeley. If you go this way, you won't be able to support both WINDOW ... ORDER BY foo USING RANGE ... and WINDOW ... ORDER BY foo USING ~~ RANGE ... because you won't know which addition operator to apply. (And yeah, I'm aware that the SQL standard only expects RANGE to support sort keys that are of numeric, datetime, or interval type. I would hope that we have higher expectations than that. Even if we don't, it's not exactly hard to credit that people might have multiple ideas about how to sort interval values.) There are indeed still some places where we rely on operator names to mean something, but we need to get away from that idea not add more. Ideally, any property the system understands about an operator or function should be explicitly declared through opclass membership or some similar representation. We've made substantial progress in that direction in the last fifteen years. I don't want to reverse that progress in the name of minor expediencies, especially not ones that fail to support flexibility that has been in the system for a couple or three decades already. regards, tom lane
Re: [HACKERS] Review: query result history in psql
I'm kinda not all that convinced that this feature does anything that's actually useful. If you want to save your query results, you can just stick CREATE TEMP TABLE ans AS in front of your query. What does that not give you that this gives you? Convenience. It auto-increments with each new query, giving you a different temporary table for each query. Seems pretty helpful to me. On Tue, Jul 2, 2013 at 6:16 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 2, 2013 at 8:16 PM, ian link i...@ilink.io wrote: We covered that earlier in the email thread, but it's the current name for the query history. I think we are going to change it to something a little more descriptive. I was thinking qh for short or query-history. I'm not sure if I'll be able to implement this feature any time soon, as I'm very busy at the moment and going for a business trip in few days. I can try implementing the feature, I might have some time. I'm kinda not all that convinced that this feature does anything that's actually useful. If you want to save your query results, you can just stick CREATE TEMP TABLE ans AS in front of your query. What does that not give you that this gives you? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
Definitely not this week. Hopefully for next commit fest. On Sun, Jun 30, 2013 at 9:56 PM, Josh Berkus j...@agliodbs.com wrote: On 06/30/2013 08:54 PM, ian link wrote: I found some time and I think I am up to speed now. I finally figured out how to add new operator strategies and made a little test operator for myself. It seems pretty clear that assuming '+' and '-' are addition and subtraction is a bad idea. I don't think it would be too tricky to add support for new operator strategies. Andrew Gierth suggested calling these new strategies offset - and offset +, which I think describes it pretty well. I assigned the operator itself to be @+ and @- but that can obviously be changed. If this sounds like a good path to you guys, I will go ahead and implement the operators for the appropriate types. Please let me know if I am misunderstanding something - I am still figuring stuff out :) Aside from the opclass stuff, there were some other important issues mentioned with the original RANGE support. I think I will address those after the opclass stuff is done. Are these things you plan to get done this week, or for next CommitFest? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Review: query result history in psql
but maybe some interactive mode should be usefull - so after execution, and showing result, will be prompt if result should be saved or not. I like the idea, in addition to the ordinary mode. Personally, I would use the ordinary mode, but I can see how 'interactive' would be useful. yes, the names :ans01, :ans02, ... miss semantics - How I can join this name (and content) with some SQL query? That makes sense. I think having part of / the whole query string would be very helpful. Great suggestion! Maciej, would you be able/have time to implement these? Or do you need any help getting them done? On Sun, Jun 30, 2013 at 11:35 PM, Pavel Stehule pavel.steh...@gmail.comwrote: Hello 2013/7/1 ian link i...@ilink.io: Not sure about all of your suggestions. Let me see if I can clarify what you're looking for. * simply decision if content should be stored in history or not, Do you mean that the user should use a flag to place the result of a query into the history? like: --ans SELECT * FROM cities... Not sure if that's what you mean, but it seems kind of unnecesary. They can just hit the \ans flag beforehand. switching off on is not user friendly but maybe some interactive mode should be usefull - so after execution, and showing result, will be prompt if result should be saved or not. some like: \ans interactive SELECT * FROM pg_proc; result should be saved last result [y, n]? y result is saved in :ans22 * simply remove last entry (table) of history That could be useful. What do you think Maciej? yes, lot of queries is just +/- experiment and you don't would store result * queries should be joined to content, only name is not enough Don't know what you mean. Could you try re-wording that? yes, the names :ans01, :ans02, ... miss semantics - How I can join this name (and content) with some SQL query? I needs to reverese search in SQL of stored caches, and I need a information ans01 SELECT * FROM pg_proc ans02 SELECT * FROM ans02 WHERE ... ans03 ... Regards Pavel Ian On Fri, Jun 28, 2013 at 8:49 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello I am not sure, this interface is really user friendly there is not possible searching in history, and not every query push to history some interesting content. It require: * simply decision if content should be stored in history or not, * simply remove last entry (table) of history * queries should be joined to content, only name is not enough Regards Pavel 2013/6/28 Maciej Gajewski maciej.gajews...@gmail.com: Thanks for checking the patch! So what's left to fix? * Moving the escaping-related functions to separate module, * applying your corrections. Did I missed anything? I'll submit corrected patch after the weekend. M
Re: [HACKERS] Review: query result history in psql
Not sure about all of your suggestions. Let me see if I can clarify what you're looking for. * simply decision if content should be stored in history or not, Do you mean that the user should use a flag to place the result of a query into the history? like: --ans SELECT * FROM cities... Not sure if that's what you mean, but it seems kind of unnecesary. They can just hit the \ans flag beforehand. * simply remove last entry (table) of history That could be useful. What do you think Maciej? * queries should be joined to content, only name is not enough Don't know what you mean. Could you try re-wording that? Ian On Fri, Jun 28, 2013 at 8:49 AM, Pavel Stehule pavel.steh...@gmail.comwrote: Hello I am not sure, this interface is really user friendly there is not possible searching in history, and not every query push to history some interesting content. It require: * simply decision if content should be stored in history or not, * simply remove last entry (table) of history * queries should be joined to content, only name is not enough Regards Pavel 2013/6/28 Maciej Gajewski maciej.gajews...@gmail.com: Thanks for checking the patch! So what's left to fix? * Moving the escaping-related functions to separate module, * applying your corrections. Did I missed anything? I'll submit corrected patch after the weekend. M
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
I found some time and I think I am up to speed now. I finally figured out how to add new operator strategies and made a little test operator for myself. It seems pretty clear that assuming '+' and '-' are addition and subtraction is a bad idea. I don't think it would be too tricky to add support for new operator strategies. Andrew Gierth suggested calling these new strategies offset - and offset +, which I think describes it pretty well. I assigned the operator itself to be @+ and @- but that can obviously be changed. If this sounds like a good path to you guys, I will go ahead and implement the operators for the appropriate types. Please let me know if I am misunderstanding something - I am still figuring stuff out :) Aside from the opclass stuff, there were some other important issues mentioned with the original RANGE support. I think I will address those after the opclass stuff is done. Thanks! Ian On Sat, Jun 22, 2013 at 4:38 PM, ian link i...@ilink.io wrote: Thanks Craig! That definitely does help. I probably still have some questions but I think I will read through the rest of the code before asking. Thanks again! Ian Craig Ringer Friday, June 21, 2013 8:41 PM On 06/22/2013 03:30 AM, ian link wrote: Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Consider RANGE 4.5 PRECEDING'. You need to be able to test whether, for the current row 'b', any given row 'a' is within the range (b - 4.5) a = b . Not 100% sure about the vs = boundaries, but that's irrelevant for the example. To test that, you have to be able to do two things: you have to be able to test whether one value is greater than another, and you have to be able to add or subtract a constant from one of the values. Right now, the b-tree access method provides information on the ordering operators = = = , which provides half the answer. But these don't give any concept of *distance* - you can test ordinality but not cardinality. To implement the different by 4.5 part, you have to be able to add 4.5 to one value or subtract it from the other. The obvious way to do that is to look up the function that implements the '+' or '-' operator, and do: ((OPERATOR(+))(a, 4.5)) b AND (a = b) or ((OPERATOR(-))(b, 4.5)) a AND (a = b); The problem outlined by Tom in prior discussion about this is that PostgreSQL tries really hard not to assume that particular operator names mean particular things. Rather than knowing that + is always an operator that adds two values together; is transitive, symmetric and reflexive, PostgreSQL requires that you define an *operator class* that names the operator that has those properties. Or at least, it does for less-than, less-than-or-equals, equals, greater-than-or-equals, greater-than, and not-equals as part of the b-tree operator class, which *usually* defines these operators as = = = , but you could use any operator names you wanted if you really liked. Right now (as far as I know) there's no operator class that lets you identify operators for addition and subtraction in a similar way. So it's necessary to either add such an operator class (in which case support has to be added for it for every type), extend the existing b-tree operator class to provide the info, or blindly assume that + and - are always addition and subtraction. For an example of why such assumptions are a bad idea, consider matrix multiplication. Normally, a * b = b * a, but this isn't true for multiplication of matrices. Similarly, if someone defined a + operator as an alias for string concatenation (||), we'd be totally wrong to assume we could use that for doing range-offset windowing. So. Yeah. Operator classes required, unless we're going to change the rules and make certain operator names special in PostgreSQL, so that if you implement them they *must* have certain properties. This seems like a pretty poor reason to add such a big change. I hope this explanation (a) is actually correct and (b) is helpful. ian link Friday, June 21, 2013 12:30 PM Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Thanks! Hitoshi Harada Friday, June 21, 2013 4:35 AM On 06/22/2013 03:30 AM, ian link wrote: Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Consider RANGE 4.5 PRECEDING'. You need to be able to test whether, for the current row 'b', any given row 'a' is within the range (b - 4.5) a = b . Not 100% sure about the vs = boundaries, but that's irrelevant for the example. To test that, you have to be able to do two things: you have to be able to test whether one value is greater than another, and you have to be able to add or subtract a constant from one of the values. Right now, the b-tree
Re: [HACKERS] Review: query result history in psql
It's better to post a review as a reply to the message which contains the patch. Sorry about that, I did not have the email in my inbox and couldn't figure out how to use the old message ID to send a reply. Here is the thread: http://www.postgresql.org/message-id/flat/caecsyxjri++t3pevdyzawh2ygx7kg9zrhx8kawtp1fxv3h0...@mail.gmail.com#caecsyxjri++t3pevdyzawh2ygx7kg9zrhx8kawtp1fxv3h0...@mail.gmail.com The 'EscapeForCopy' was meant to mean 'Escape string in a format require by the COPY TEXT format', so 'copy' in the name refers to the escaping format, not the action performed by the function. I see, that makes sense now. Keep it as you see fit, it's not a big deal in my opinion. Some mathematical toolkits, like Matlab or Mathematica, automatically set a variable called 'ans' (short for answer) containing the result of the last operation. I was trying to emulate exactly this behaviour. I've actually been using Matlab lately, which must be why the name made sense to me intuitively. I don't know if this is the best name, however. It kind of assumes that our users use Matlab/Octave/Mathematica. Maybe 'qhist' or 'hist' or something? The history is not erased. The history is always stored in the client's memory. Ah, I did not pick up on that. Thank you for explaining it! That's actually a very neat way of doing it. Sorry I did not realize that at first. I was considering such a behaviour. But since the feature is turned off by default, I decided that whoever is using it, is aware of cost. Instead of truncating the history automatically (which could lead to a nasty surprise), I decided to equip the user with \ansclean , a command erasing the history. I believe that it is better to let the user decide when history should be erased, instead of doing it automatically. I think you are correct. However, if we turn on the feature by default (at some point in the future) the discussion should probably be re-visited. This is my first submitted patch, so I can't really comment on the process. But if you could add the author's email to CC, the message would be much easier to spot. I replied after two days only because I missed the message in the flood of other pgsql-hacker messages. I think I need to scan the list more carefully... My fault, I definitely should have CC'd you. As for the patch, I made a new version of the latest one you provided in the original thread. Let me know if anything breaks, but it compiles fine on my box. Thanks for the feedback! Ian query-history-review1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: query result history in psql
I just applied the patch to a clean branch from the latest master. I couldn't get a segfault from using the new feature. Could you provide a little more info to reproduce the segfault? Thanks On Thu, Jun 27, 2013 at 11:28 PM, Pavel Stehule pavel.steh...@gmail.comwrote: Hello after patching I god segfault Program terminated with signal 11, Segmentation fault. #0 0x0805aab4 in get_prompt (status=PROMPT_READY) at prompt.c:98 98 for (p = prompt_string; Missing separate debuginfos, use: debuginfo-install glibc-2.13-2.i686 ncurses-libs-5.7-9.20100703.fc14.i686 readline-6.1-2.fc14.i386 (gdb) bt #0 0x0805aab4 in get_prompt (status=PROMPT_READY) at prompt.c:98 #1 0x0805786a in MainLoop (source=0xc45440) at mainloop.c:134 #2 0x0805a68d in main (argc=2, argv=0xbfcf2894) at startup.c:336 Regards Pavel Stehule 2013/6/28 ian link i...@ilink.io: It's better to post a review as a reply to the message which contains the patch. Sorry about that, I did not have the email in my inbox and couldn't figure out how to use the old message ID to send a reply. Here is the thread: http://www.postgresql.org/message-id/flat/caecsyxjri++t3pevdyzawh2ygx7kg9zrhx8kawtp1fxv3h0...@mail.gmail.com#caecsyxjri++t3pevdyzawh2ygx7kg9zrhx8kawtp1fxv3h0...@mail.gmail.com The 'EscapeForCopy' was meant to mean 'Escape string in a format require by the COPY TEXT format', so 'copy' in the name refers to the escaping format, not the action performed by the function. I see, that makes sense now. Keep it as you see fit, it's not a big deal in my opinion. Some mathematical toolkits, like Matlab or Mathematica, automatically set a variable called 'ans' (short for answer) containing the result of the last operation. I was trying to emulate exactly this behaviour. I've actually been using Matlab lately, which must be why the name made sense to me intuitively. I don't know if this is the best name, however. It kind of assumes that our users use Matlab/Octave/Mathematica. Maybe 'qhist' or 'hist' or something? The history is not erased. The history is always stored in the client's memory. Ah, I did not pick up on that. Thank you for explaining it! That's actually a very neat way of doing it. Sorry I did not realize that at first. I was considering such a behaviour. But since the feature is turned off by default, I decided that whoever is using it, is aware of cost. Instead of truncating the history automatically (which could lead to a nasty surprise), I decided to equip the user with \ansclean , a command erasing the history. I believe that it is better to let the user decide when history should be erased, instead of doing it automatically. I think you are correct. However, if we turn on the feature by default (at some point in the future) the discussion should probably be re-visited. This is my first submitted patch, so I can't really comment on the process. But if you could add the author's email to CC, the message would be much easier to spot. I replied after two days only because I missed the message in the flood of other pgsql-hacker messages. I think I need to scan the list more carefully... My fault, I definitely should have CC'd you. As for the patch, I made a new version of the latest one you provided in the original thread. Let me know if anything breaks, but it compiles fine on my box. Thanks for the feedback! Ian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: query result history in psql
No worries! :) On Fri, Jun 28, 2013 at 12:20 AM, Pavel Stehule pavel.steh...@gmail.comwrote: Hello It's look like my bug - wrong compilation I am sorry Pavel 2013/6/28 ian link i...@ilink.io: I just applied the patch to a clean branch from the latest master. I couldn't get a segfault from using the new feature. Could you provide a little more info to reproduce the segfault? Thanks On Thu, Jun 27, 2013 at 11:28 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello after patching I god segfault Program terminated with signal 11, Segmentation fault. #0 0x0805aab4 in get_prompt (status=PROMPT_READY) at prompt.c:98 98 for (p = prompt_string; Missing separate debuginfos, use: debuginfo-install glibc-2.13-2.i686 ncurses-libs-5.7-9.20100703.fc14.i686 readline-6.1-2.fc14.i386 (gdb) bt #0 0x0805aab4 in get_prompt (status=PROMPT_READY) at prompt.c:98 #1 0x0805786a in MainLoop (source=0xc45440) at mainloop.c:134 #2 0x0805a68d in main (argc=2, argv=0xbfcf2894) at startup.c:336 Regards Pavel Stehule 2013/6/28 ian link i...@ilink.io: It's better to post a review as a reply to the message which contains the patch. Sorry about that, I did not have the email in my inbox and couldn't figure out how to use the old message ID to send a reply. Here is the thread: http://www.postgresql.org/message-id/flat/caecsyxjri++t3pevdyzawh2ygx7kg9zrhx8kawtp1fxv3h0...@mail.gmail.com#caecsyxjri++t3pevdyzawh2ygx7kg9zrhx8kawtp1fxv3h0...@mail.gmail.com The 'EscapeForCopy' was meant to mean 'Escape string in a format require by the COPY TEXT format', so 'copy' in the name refers to the escaping format, not the action performed by the function. I see, that makes sense now. Keep it as you see fit, it's not a big deal in my opinion. Some mathematical toolkits, like Matlab or Mathematica, automatically set a variable called 'ans' (short for answer) containing the result of the last operation. I was trying to emulate exactly this behaviour. I've actually been using Matlab lately, which must be why the name made sense to me intuitively. I don't know if this is the best name, however. It kind of assumes that our users use Matlab/Octave/Mathematica. Maybe 'qhist' or 'hist' or something? The history is not erased. The history is always stored in the client's memory. Ah, I did not pick up on that. Thank you for explaining it! That's actually a very neat way of doing it. Sorry I did not realize that at first. I was considering such a behaviour. But since the feature is turned off by default, I decided that whoever is using it, is aware of cost. Instead of truncating the history automatically (which could lead to a nasty surprise), I decided to equip the user with \ansclean , a command erasing the history. I believe that it is better to let the user decide when history should be erased, instead of doing it automatically. I think you are correct. However, if we turn on the feature by default (at some point in the future) the discussion should probably be re-visited. This is my first submitted patch, so I can't really comment on the process. But if you could add the author's email to CC, the message would be much easier to spot. I replied after two days only because I missed the message in the flood of other pgsql-hacker messages. I think I need to scan the list more carefully... My fault, I definitely should have CC'd you. As for the patch, I made a new version of the latest one you provided in the original thread. Let me know if anything breaks, but it compiles fine on my box. Thanks for the feedback! Ian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Review: query result history in psql
Hi Maciej, I've been reviewing your patch for the ongoing commitfest. First let me say that I think it's a great idea and provides some very useful functionality. However, there are a few minor problems. There were a few english/grammatical mistakes that I went ahead and fixed. Additionally, I think some of the string manipulation might be placed outside of the main ans.c file. I don't know if there's a better place for 'EscapeForCopy' and 'GetEscapedLen'. Not really a big deal, just an organizational idea. I also changed 'EscapeForCopy' to 'EscapeAndCopy'. I think that better describes the functionality. 'EscapeForCopy' kind of implies that another function is needed to copy the string. What does 'ans' stand for? I am not sure how it relates to the concept of a query history. It didn't stop my understanding of the code, but I don't know if a user will immediately know the meaning. Probably the biggest problem is that the query history list is missing a maximum size variable. I think this could be valuable for preventing users from shooting themselves in the foot. If the user is running large queries, they might accidentally store too much data. This probably somewhat of an edge-case but I believe it is worth considering. We could provide a sensible default limit (10 queries?) and also allow the user to change it. Finally, is it worth resetting the query history every time a user reconnects to the database? I can see how this might interrupt a user's workflow. If the user suddenly disconnects (network connection interrupted, etc) then they would lose their history. I think this is definitely up for debate. It would add more management overhead (psql options etc) and might just be unnecessary. However, with a sane limit to the size of the query history, I don't know if there would be too many drawbacks from a storage perspective. Those issues aside - I think it's a great feature! I can add the grammatical fixes I made whenever the final patch is ready. Or earlier, whatever works for you. Also, this is my first time reviewing a patch, so please let me know if I can improve on anything. Thanks! Ian Link
Re: [HACKERS] Patch for fast gin cache performance improvement
I just tried it and my account works now. Thanks! Ian Stefan Kaltenbrunner Sunday, June 23, 2013 2:05 AM have you tried to log in once to the main website per:http://www.postgresql.org/message-id/CABUevEyt9tQfcF7T2Uzcr8WeF9M=s8qSACuCmN5L2Et26=r...@mail.gmail.com?Stefan ian link Saturday, June 22, 2013 7:03 PM Looks like my community login is still not working. No rush, just wanted to let you know. Thanks!Ian Ian Link Tuesday, June 18, 2013 11:41 AM No worries! I'll just wait until it's up again. Thanks Ian Kevin Grittner Tuesday, June 18, 2013 9:15 AM Oops -- we seem to have a problem with new community logins at themoment, which will hopefully be straightened out soon. You mightwant to wait a few days if you don't already have a login.--Kevin GrittnerEnterpriseDB: http://www.enterprisedb.comThe Enterprise PostgreSQL Company Kevin Grittner Tuesday, June 18, 2013 9:09 AM Ian Link i...@ilink.io wrote: This patch contains a performance improvement for the fast gin cache. Our test queries improve from about 0.9 ms to 0.030 ms. Impressive! Thanks for reading and considering this patch! Congratulations on your first PostgreSQL patch! To get it scheduled for review, please add it to this page: https://commitfest.postgresql.org/action/commitfest_view/open You will need to get a community login (if you don't already have one), but that is a quick and painless process. Choose an appropriate topic (like "Performance") and reference the message ID of the email to which you attached the patch. Don't worry about the fields for reviewers, committer, or date closed. Sorry for the administrative overhead, but without it things can fall through the cracks. You can find an overview of the review process with links to more detail here: http://wiki.postgresql.org/wiki/CommitFest Thanks for contributing!
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
Thanks Craig! That definitely does help. I probably still have some questions but I think I will read through the rest of the code before asking. Thanks again! Ian Craig Ringer Friday, June 21, 2013 8:41 PM On 06/22/2013 03:30 AM, ian link wrote: Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Consider RANGE 4.5 PRECEDING'. You need to be able to test whether, for the current row 'b', any given row 'a' is within the range (b - 4.5) a = b . Not 100% sure about the vs = boundaries, but that's irrelevant for the example. To test that, you have to be able to do two things: you have to be able to test whether one value is greater than another, and you have to be able to add or subtract a constant from one of the values. Right now, the b-tree access method provides information on the ordering operators = = = , which provides half the answer. But these don't give any concept of *distance* - you can test ordinality but not cardinality. To implement the different by 4.5 part, you have to be able to add 4.5 to one value or subtract it from the other. The obvious way to do that is to look up the function that implements the '+' or '-' operator, and do: ((OPERATOR(+))(a, 4.5)) b AND (a = b) or ((OPERATOR(-))(b, 4.5)) a AND (a = b); The problem outlined by Tom in prior discussion about this is that PostgreSQL tries really hard not to assume that particular operator names mean particular things. Rather than knowing that + is always an operator that adds two values together; is transitive, symmetric and reflexive, PostgreSQL requires that you define an *operator class* that names the operator that has those properties. Or at least, it does for less-than, less-than-or-equals, equals, greater-than-or-equals, greater-than, and not-equals as part of the b-tree operator class, which *usually* defines these operators as = = = , but you could use any operator names you wanted if you really liked. Right now (as far as I know) there's no operator class that lets you identify operators for addition and subtraction in a similar way. So it's necessary to either add such an operator class (in which case support has to be added for it for every type), extend the existing b-tree operator class to provide the info, or blindly assume that + and - are always addition and subtraction. For an example of why such assumptions are a bad idea, consider matrix multiplication. Normally, a * b = b * a, but this isn't true for multiplication of matrices. Similarly, if someone defined a + operator as an alias for string concatenation (||), we'd be totally wrong to assume we could use that for doing range-offset windowing. So. Yeah. Operator classes required, unless we're going to change the rules and make certain operator names special in PostgreSQL, so that if you implement them they *must* have certain properties. This seems like a pretty poor reason to add such a big change. I hope this explanation (a) is actually correct and (b) is helpful. ian link Friday, June 21, 2013 12:30 PM Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Thanks! Hitoshi Harada Friday, June 21, 2013 4:35 AM On 06/22/2013 03:30 AM, ian link wrote: Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Consider RANGE 4.5 PRECEDING'. You need to be able to test whether, for the current row 'b', any given row 'a' is within the range (b - 4.5) a = b . Not 100% sure about the vs = boundaries, but that's irrelevant for the example. To test that, you have to be able to do two things: you have to be able to test whether one value is greater than another, and you have to be able to add or subtract a constant from one of the values. Right now, the b-tree access method provides information on the ordering operators = = = , which provides half the answer. But these don't give any concept of *distance* - you can test ordinality but not cardinality. To implement the different by 4.5 part, you have to be able to add 4.5 to one value or subtract it from the other. The obvious way to do that is to look up the function that implements the '+' or '-' operator, and do: ((OPERATOR(+))(a, 4.5)) b AND (a = b) or ((OPERATOR(-))(b, 4.5)) a AND (a = b); The problem outlined by Tom in prior discussion about this is that PostgreSQL tries really hard not to assume that particular operator names mean particular things. Rather than knowing that + is always an operator that adds two values together; is transitive, symmetric and reflexive, PostgreSQL requires that you define an *operator class* that names the operator that has those properties. Or at least, it does for less-than, less-than-or-equals, equals, greater-than-or-equals, greater-than, and not-equals as part of the b-tree operator class, which *usually* defines
Re: [HACKERS] Patch for fast gin cache performance improvement
Looks like my community login is still not working. No rush, just wanted to let you know. Thanks! Ian On Tue, Jun 18, 2013 at 11:41 AM, Ian Link i...@ilink.io wrote: No worries! I'll just wait until it's up again. Thanks Ian Kevin Grittner kgri...@ymail.com Tuesday, June 18, 2013 9:15 AM Oops -- we seem to have a problem with new community logins at the moment, which will hopefully be straightened out soon. You might want to wait a few days if you don't already have a login. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company Kevin Grittner kgri...@ymail.com Tuesday, June 18, 2013 9:09 AM Ian Link i...@ilink.io i...@ilink.io wrote: This patch contains a performance improvement for the fast gin cache. Our test queries improve from about 0.9 ms to 0.030 ms. Impressive! Thanks for reading and considering this patch! Congratulations on your first PostgreSQL patch! To get it scheduled for review, please add it to this page: https://commitfest.postgresql.org/action/commitfest_view/open You will need to get a community login (if you don't already have one), but that is a quick and painless process. Choose an appropriate topic (like Performance) and reference the message ID of the email to which you attached the patch. Don't worry about the fields for reviewers, committer, or date closed. Sorry for the administrative overhead, but without it things can fall through the cracks. You can find an overview of the review process with links to more detail here: http://wiki.postgresql.org/wiki/CommitFest Thanks for contributing! Ian Link i...@ilink.io Monday, June 17, 2013 9:42 PM * This patch contains a performance improvement for the fast gin cache. As you may know, the performance of the fast gin cache decreases with its size. Currently, the size of the fast gin cache is tied to work_mem. The size of work_mem can often be quite high. The large size of work_mem is inappropriate for the fast gin cache size. Therefore, we created a separate cache size called gin_fast_limit. This global variable controls the size of the fast gin cache, independently of work_mem. Currently, the default gin_fast_limit is set to 128kB. However, that value could need tweaking. 64kB may work better, but it's hard to say with only my single machine to test on. On my machine, this patch results in a nice speed up. Our test queries improve from about 0.9 ms to 0.030 ms. Please feel free to use the test case yourself: it should be attached. I can look into additional test cases (tsvectors) if anyone is interested. In addition to the global limit, we have provided a per-index limit: fast_cache_size. This per-index limit begins at -1, which means that it is disabled. If the user does not specify a per-index limit, the index will simply use the global limit. I would like to thank Andrew Gierth for all his help on this patch. As this is my first patch he was extremely helpful. The idea for this performance improvement was entirely his. I just did the implementation. Thanks for reading and considering this patch!* Ian Link postbox-contact.jpgcompose-unknown-contact.jpg
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
Forgive my ignorance, but I don't entirely understand the problem. What does '+' and '-' refer to exactly? Thanks! On Fri, Jun 21, 2013 at 4:35 AM, Hitoshi Harada umi.tan...@gmail.comwrote: On Fri, Jun 21, 2013 at 3:20 AM, Craig Ringer cr...@2ndquadrant.comwrote: On 06/21/2013 05:32 PM, Hitoshi Harada wrote: I also later found that we are missing not only notion of '+' or '-', but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN needs to detect ERROR if the offset value is negative, but it is not always easy if you think about interval, numeric types as opposed to int64 used in ROWS BETWEEN. Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That should make sense for any type in which the concept of zero makes sense. Yeah, I mean, it needs to know if offset is negative or not by testing with zero. So we need zero value or is_negative function for each type. Thanks, -- Hitoshi Harada
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
I am currently looking into this feature. However, as I am quite new to Postgres, I think it might take me a while to get up to speed. Anyways, I would also appreciate another round of discussion on the future of the windowing functions. Ian Link Craig Ringer Thursday, June 20, 2013 7:24 PM Hi allSince 8.4, PostgreSQL has had extremely useful window function support -but support for "RANGE PRECEDING / FOLLOWING" windows was dropped latein 8.4's development in order to get the rest of the feature in, perhttp://archives.postgresql.org/pgsql-hackers/2010-02/msg00540.php.It looks like there was discussion of requiring a new opclass to bedeclared for types or otherwise extending opclasses to provide theinformation required for RANGE ... PRECEDING / FOLLOWING (http://www.postgresql.org/message-id/20100211201444.ga28...@svana.org ). I can't find any sign that it went anywhere beyond some broaddiscussion:http://www.postgresql.org/message-id/13993.1265920...@sss.pgh.pa.us atthe time.I've missed this feature more than once, and am curious about whetherany more recent changes may have made it cleaner to tackle this, orwhether consensus can be formed on adding the new entries to btree'sopclass to avoid the undesirable explicit lookups of the '+' and '-'oprators.Some question seems to remain open about how ranges overtimestamps/intervals should work, but this wasn't elaborated on.There's been interest in this, eg:http://pgsql.hackers.free-usenet.eu/[HACKERS]-range-intervals-in-window-function-frames_T66085695_S1http://grokbase.com/t/postgresql/pgsql-general/105a89gm2n/postgresql-9-0-support-for-range-value-preceding-window-functions
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
Thanks! The discussions have been useful, although I am currently just reviewing the code. I think a good starting point will be to refactor/imrpove the WinGetFuncArgInPartition and WinGetFuncArgInFrame functions. Tom Lane wrote this about them before comitting the patch: I'm not terribly happy with the changes you made in WinGetFuncArgInPartitionand WinGetFuncArgInFrame to force the window function mark to not gopast frame start in some modes. Not only is that pretty ugly, but Ithink it can mask bugs in window functions: it's an error for a windowfunction to fetch a row before what it has set its mark to be, but insome cases that wouldn't be detected because of this change. I thinkit would be better to revert those changes and find another method ofprotecting fetches needed to determine the frame head. One idea isto create a separate read pointer that tracks the frame head wheneveractual fetches of the frame head might be needed by update_frameheadpos.I committed it without changing that, but I think this should berevisited before trying to add the RANGE value PRECEDING/FOLLOWINGoptions, because those will substantially expand the number of caseswhere that hack affects the behavior. I am honestly not 100% certain why these functions have issues, but this seems a good place to start investigating. Ian Link Craig Ringer Thursday, June 20, 2013 7:37 PM Good to know, and welcome.I hope the links to the archived discussions on the matter were usefulto you. Craig Ringer Thursday, June 20, 2013 7:24 PM Hi allSince 8.4, PostgreSQL has had extremely useful window function support -but support for "RANGE PRECEDING / FOLLOWING" windows was dropped latein 8.4's development in order to get the rest of the feature in, perhttp://archives.postgresql.org/pgsql-hackers/2010-02/msg00540.php.It looks like there was discussion of requiring a new opclass to bedeclared for types or otherwise extending opclasses to provide theinformation required for RANGE ... PRECEDING / FOLLOWING (http://www.postgresql.org/message-id/20100211201444.ga28...@svana.org ). I can't find any sign that it went anywhere beyond some broaddiscussion:http://www.postgresql.org/message-id/13993.1265920...@sss.pgh.pa.us atthe time.I've missed this feature more than once, and am curious about whetherany more recent changes may have made it cleaner to tackle this, orwhether consensus can be formed on adding the new entries to btree'sopclass to avoid the undesirable explicit lookups of the '+' and '-'oprators.Some question seems to remain open about how ranges overtimestamps/intervals should work, but this wasn't elaborated on.There's been interest in this, eg:http://pgsql.hackers.free-usenet.eu/[HACKERS]-range-intervals-in-window-function-frames_T66085695_S1http://grokbase.com/t/postgresql/pgsql-general/105a89gm2n/postgresql-9-0-support-for-range-value-preceding-window-functions
Re: [HACKERS] Patch for fast gin cache performance improvement
No worries! I'll just wait until it's up again. Thanks Ian Kevin Grittner Tuesday, June 18, 2013 9:15 AM Oops -- we seem to have a problem with new community logins at themoment, which will hopefully be straightened out soon. You mightwant to wait a few days if you don't already have a login.--Kevin GrittnerEnterpriseDB: http://www.enterprisedb.comThe Enterprise PostgreSQL Company Kevin Grittner Tuesday, June 18, 2013 9:09 AM Ian Link i...@ilink.io wrote: This patch contains a performance improvement for the fast gin cache. Our test queries improve from about 0.9 ms to 0.030 ms. Impressive! Thanks for reading and considering this patch! Congratulations on your first PostgreSQL patch! To get it scheduled for review, please add it to this page: https://commitfest.postgresql.org/action/commitfest_view/open You will need to get a community login (if you don't already have one), but that is a quick and painless process. Choose an appropriate topic (like "Performance") and reference the message ID of the email to which you attached the patch. Don't worry about the fields for reviewers, committer, or date closed. Sorry for the administrative overhead, but without it things can fall through the cracks. You can find an overview of the review process with links to more detail here: http://wiki.postgresql.org/wiki/CommitFest Thanks for contributing! Ian Link Monday, June 17, 2013 9:42 PM This patch contains a performance improvement for the fast gin cache. As you may know, the performance of the fast gin cache decreases with its size. Currently, the size of the fast gin cache is tied to work_mem. The size of work_mem can often be quite high. The large size of work_mem is inappropriate for the fast gin cache size. Therefore, we created a separate cache size called gin_fast_limit. This global variable controls the size of the fast gin cache, independently of work_mem. Currently, the default gin_fast_limit is set to 128kB. However, that value could need tweaking. 64kB may work better, but it's hard to say with only my single machine to test on.On my machine, this patch results in a nice speed up. Our test queries improve from about 0.9 ms to 0.030 ms. Please feel free to use the test case yourself: it should be attached. I can look into additional test cases (tsvectors) if anyone is interested. In addition to the global limit, we have provided a per-index limit: fast_cache_size. This per-index limit begins at -1, which means that it is disabled. If the user does not specify a per-index limit, the index will simply use the global limit. I would like to thank Andrew Gierth for all his help on this patch. As this is my first patch he was extremely helpful. The idea for this performance improvement was entirely his. I just did the implementation. Thanks for reading and considering this patch! Ian Link