On Fri, Aug 4, 2017 at 8:21 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Wed, Aug 2, 2017 at 9:04 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>>>> Yes, I also think the same idea can be used, in fact, I have mentioned
>>>>> it [1] as soon as you have committed that patch.  Do we want to do
>>>>> anything at this stage for PG-10?  I don't think we should attempt
>>>>> something this late unless people feel this is a show-stopper issue
>>>>> for usage of hash indexes.  If required, I think a separate function
>>>>> can be provided to allow users to perform squeeze operation.
>>>> Sorry, I have no idea how critical this squeeze thing is for the
>>>> newfangled hash indexes, so I cannot comment on that.  Does this make
>>>> the indexes unusable in some way under some circumstances?
>>> It seems so.  Basically, in the case of a large number of duplicates,
>>> we hit the maximum number of overflow pages.  There is a theoretical
>>> possibility of hitting it, but it could also happen that we are not
>>> freeing the existing unused overflow pages, due to which the count
>>> keeps growing and hits the limit.  I have requested upthread to verify
>>> whether that is happening in this case and I am still waiting for the
>>> same.  The squeeze operation does free such unused overflow pages after
>>> cleaning them.  As this is a costly operation and needs a cleanup lock,
>>> we currently perform it only during vacuum and at the next split of a
>>> bucket that can have redundant overflow pages.
>> Oops.  It was rather short-sighted of us not to increase
>> HASH_MAX_BITMAPS when we bumped HASH_VERSION.  Actually removing that
>> limit is hard, but we could have easily bumped it from 128 to, say, 1024
>> without (I think) causing any problem, which would have given us quite
>> a bit of headroom here.
> Yes, that sounds sensible, but I think it will just delay the
> problem.  I think the actual problem here is that we are not able to
> perform the squeeze operation often enough to free the overflow
> pages.  Currently, we try to perform the squeeze only at the start of
> the next split of the bucket or during vacuum.  The reason for doing it
> that way was that the squeeze operation needs a cleanup lock and we
> already have that during the start of split and vacuum.  Now, to solve
> it, I have already speculated on a few ways above [1] and among those, it is
> feasible to either do this at end of split which can have performance
> implications in some work loads, but will work fine for the case
> reported in this thread

I have implemented the patch using this approach, as the other
approaches require quite extensive changes, which I am not sure is the
right thing to do at this stage.

> I think we can fix it in one of above ways and increase the value of

I have increased the number of hash bitmap pages in a separate patch.
I am not completely sure it is a good idea to increase it directly
to 1024, as that will increase the size of HashMetaPageData from 960
bytes to 4544 bytes.  Shall we increase it to 512 instead?
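
For reference, the size figures above can be reproduced with a small
back-of-the-envelope calculation.  This is only a sketch: it assumes
that each bitmap slot in the metapage is a 4-byte block number and that
nothing else in the struct changes; the helper name meta_size is made
up for illustration and is not part of either patch.

```python
# Rough arithmetic for the hash metapage size at different bitmap limits.
# Assumption: each bitmap slot holds one 4-byte block number, and the
# rest of the metapage struct is unchanged.
BLOCK_NUMBER_SIZE = 4        # bytes per bitmap slot (assumed)
CURRENT_MAX_BITMAPS = 128    # current limit, from the thread
CURRENT_META_SIZE = 960      # bytes, figure quoted in this message


def meta_size(max_bitmaps):
    """Estimated metapage struct size in bytes (hypothetical helper)."""
    extra_slots = max_bitmaps - CURRENT_MAX_BITMAPS
    return CURRENT_META_SIZE + extra_slots * BLOCK_NUMBER_SIZE


for limit in (128, 512, 1024):
    print(limit, meta_size(limit))
```

Under these assumptions, a limit of 1024 gives the 4544 bytes mentioned
above, and a limit of 512 comes to 2496 bytes.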

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: fix_cleanup_bucket_after_split_v1.patch
Description: Binary data

Attachment: increase_hash_max_bitmaps_v1.patch
Description: Binary data
