Re: [HACKERS] TABLESAMPLE patch

Amit Kapila Sun, 08 Mar 2015 20:53:28 -0700

On Sat, Mar 7, 2015 at 10:37 PM, Petr Jelinek <[email protected]> wrote:
>
> On 05/03/15 09:21, Amit Kapila wrote:
>>
>> On Tue, Feb 17, 2015 at 3:29 AM, Petr Jelinek <[email protected]
>> <mailto:[email protected]>> wrote:
>>  >
>>  >
>>  > I didn't add the whole page visibility caching as the tuple ids we
>> get from sampling methods don't map well to the visibility info we get
>> from heapgetpage (it maps to the values in the rs_vistuples array not to
>> to its indexes). Commented about it in code also.
>>  >
>>
>> I think we should set pagemode for system sampling as it can
>> have dual benefit, one is it will allow us caching tuples and other
>> is it can allow us pruning of page which is done in heapgetpage().
>> Do you see any downside to it?
>
>
> Double checking for tuple visibility is the only downside I can think of.


That will happen if we use heapgetpage and the way currently
code is written in patch, however we can easily avoid double
checking if we don't call heapgetpage and rather do the required
work at caller's place.

> Plus some added code complexity of course. I guess we could use binary
search on rs_vistuples (it's already sorted) so that info won't be
completely useless. Not sure if worth it compared to normal visibility
check, but not hard to do.
>

It seems to me that it is better to avoid locking/unlocking buffer
wherever possible.

> I personally don't see the page pruning in TABLESAMPLE as all that
important though, considering that in practice it will only scan tiny
portion of a relation and usually one that does not get many updates (in
practice the main use-case is BI).
>

In that case, I think it should be acceptable either way, because
if there are less updates then anyway it won't incur any cost of
doing the pruning.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] TABLESAMPLE patch

Reply via email to