Hi Doug, On 2017-08-03 18:01:06 +0000, Douglas Doole wrote: > Back when you were getting ready to commit your faster expressions, I > volunteered to look at switching from an array of fixed sized steps to > tightly packed structures. ( > https://www.postgresql.org/message-id/20170314221648.jrcgh5n7ld4ej...@alap3.anarazel.de). > I've finally found time to make it happen.
Thanks for working on it! > Using tightly packed structures makes the expressions a lot smaller. I > instrumented some before and after builds and ran "make check" just to see > how much memory expressions were using. What I found was: > > There were ~104K expressions. > > Memory - bytes needed to hold the steps of the expressions > Allocated Memory - memory is allocated in blocks, this is the bytes > allocated to hold the expressions > Wasted Memory - the difference between the allocated and used memory > > Original code: > Memory: min=64 max=9984 total=28232512 average=265 > Allocated Memory: min=1024 max=16384 total=111171584 average=1045 > Wasted Memory: min=0 max=6400 total=82939072 average=780 > > New code: > Memory: min=32 (50%) max=8128 (82%) total=18266712 (65%) average=175 (66%) > Allocated Memory: min=192 (19%) max=9408 (57%) total=29386176 (26%) > average=282 (27%) > Wasted Memory: min=0 (0%) max=1280 (20%) total=11119464 (13%) average=106 > (14%) That's actually not *that* big of a saving... > Another benefit of the way I did this is that the expression memory is > never reallocated. This means that it is safe for one step to point > directly to a field in another step without needing to separately palloc > storage. That should allow us to simplify the code and reduce pointer > traversals. (I haven't exploited this in the patch. I figured it would be a > task for round 2 assuming you like what I've done here.) Yes, that's a neat benefit. Although I think it'd even better if we could work to the point where the mutable and the unchanging data is separately allocated, so we can at some point avoid redundant expression "compilations". > The work was mostly mechanical. The only tricky bit was dealing with the > places where you jump to step n+1 while building step n. Since we can't > tell the address of step n+1 until it is actually built, I had to defer > resolution of the jumps. So all the interesting bits of this patch are in > ExprEvalPushStep(). I was wondering about a more general "symbol resolution" stage anyway. Then we'd allocate individual steps during ExecInitExprRec, and allocate the linear array after we know the exact size. I think it'd be important to see some performance measurements, especially for larger queries. It'd not be too surprising if the different method of dereffing the next expression actually has a negative performance effect, but I'm absolutely not sure of that. Regards, Andres -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers