Re: [HACKERS] Faster array_length()

2014-05-02 Thread Tom Lane
Hadi Moshayedi h...@moshayedi.net writes:
 The attached patch improves the performance of array_length() by detoasting
 only the overhead part of the datum.

It looks to me like this would actually make things worse for cases where
the input array wasn't toasted-out-of-line (because it would uselessly
make a copy of the header part, costing a palloc cycle).  I'm not averse
to improving the case you're worried about, but you have to pay attention
to not having bad side-effects on other cases.

Another thought is that this can only win for arrays that are external
without being compressed; when they are compressed, then
heap_tuple_untoast_attr_slice will fetch and decompress the entire array
anyway (and then, just to add insult to injury, make another copy :-().
With that in mind, I was surprised that your test case showed any
improvement at all --- it looks like the arrays aren't getting compressed
for some reason.  There are going to be a lot of other cases where this
patch doesn't help unless the user turns off compression, which will hurt
his performance in other ways.

Now, the slice detoast support was only designed to work with data stored
in external mode (that is, with compression manually disabled via the
appropriate ALTER TABLE option), and that's not unreasonable for its
originally-intended application of being able to fetch any part of an
external text string.  But it strikes me that for what you want here,
namely fetching just a few bytes from the start, it ought to be possible
to do better.  Could we teach the toast code to fetch and decompress just
an initial subset of the data?  (This might be useful even for the
original use-case of slice fetches, as long as the desired slice isn't
too close to the end of the datum.)

Bottom line for me is that you've shown that there can be a win from
improving this code, but we need to do some basic work on the
slice-fetching logic to get full value out of the idea.

 Of course this technique is applicable to some other functions too, but
 they have never become a bottleneck for me, so I decided to keep the
 changes only to this function.

I would expect a committable version of this patch to cover all the
array-dimension-related functions.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster array_length()

2014-05-02 Thread Hadi Moshayedi
Thanks for looking into this.


 With that in mind, I was surprised that your test case showed any
 improvement at all --- it looks like the arrays aren't getting compressed
 for some reason.


You are right, it seems that they were not getting compressed, probably
because the arrays were seq 1 which seems to not get compressed by
pglz. When I changed the test data to an array containing 1 ones, there
were no speed improvement anymore.

I'll look into how to improve the compressed case and other issues you
raised.

Thanks,
   -- Hadi