On 12 August 2018 at 20:07, David Rowley <david.row...@2ndquadrant.com> wrote:
>> Since range scan execution is rather different from the existing
>> TidScan execution, I ended up making a new plan type, TidRangeScan.
>> There is still only one TidPath, but it has an additional member that
>> describes which method to use.
>
> I always thought that this would be implemented by overloading
> TidScan. I thought that TidListEval() could be modified to remove
> duplicates accounting for range scans. For example:
>
> SELECT * FROM t WHERE ctid BETWEEN '(0,1)' AND '(0,10)' OR ctid
> IN('(0,5)','(0,30)');
>
> would first sort all the tids along with their operator and then make
> a pass over the sorted array to remove any equality ctids that are
> redundant because they're covered in a range.
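
If I've understood the suggestion, then for that example the sorted
array (each tid together with its operator) would collapse to
something like:

    >= (0,1) ... <= (0,10)    -- one range
    =  (0,30)                 -- one lone tid

with '(0,5)' dropped because the range already covers it.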
Initially, I figured that 99% of the time, the user either wants to
filter by a specific set of ctids (such as those returned by a
subquery), or wants to process a table in blocks.  Picking up an
OR-set of ctid conditions and determining which parts should be picked
up row-wise (as in the existing code) versus which parts are blocks
that should be scanned -- and ensuring that any overlaps were removed
-- seemed more complicated than it was worth.

Having thought about it, I think what you propose might be worth it;
at least it limits us to a single TidScan plan to maintain.

The existing code:

- Looks for a qual that's an OR-list of (ctid = ?) or (ctid IN (?)).
- Costs it by assuming each matching tuple is a separate page.
- When beginning the scan, evaluates all the ?s and builds an array of
  tids to fetch.
- Sorts them and removes duplicates.
- Iterates over the array, fetching tuples.

So we'd extend that to:

- Include in the OR-list "range" subquals of the form
  (ctid > ? AND ctid < ?) (either side could be optional, and we have
  to deal with >= and <=, having ctid on the rhs, etc.).
- Cost the range subquals by assuming they don't overlap, and
  estimating how many blocks and tuples they span.
- When beginning the scan, evaluate all the ?s and build an array of
  "tid ranges" to fetch.  A tid range is a struct with a starting tid
  and an ending tid, and might be just a single tid item.
- Sort the array and remove duplicates and overlaps.
- Iterate over the array, using a single fetch for single-item tid
  ranges, and starting/ending a heap scan for multi-item tid ranges.

I think I'll try implementing this.

>> As part of the work I also taught TidScan that its results are ordered
>> by ctid, i.e. to set a pathkey on a TidPath. The benefit of this is
>> that queries such as
>>
>> SELECT MAX(ctid) FROM t;
>> SELECT * FROM t WHERE ctid IN (...) ORDER BY ctid;
>
> I think that can be done as I see you're passing allow_sync as false
> in heap_beginscan_strat(), so the scan will start at the beginning of
> the heap.

I found that the heap scan code caters to parallel scans, synchronised
scans, and block range indexing; but it didn't quite work for my case
of specifying a subset of a table and scanning backward or forward
over it.  Hence my changes.  I'm not overly familiar with the heap
scan code, though.

>> - Is there a less brittle way to create tables of a specific number
>> of blocks/tuples in the regression tests?
>
> Perhaps you could just populate a table with some number of records
> then DELETE the ones above ctid (x,100) on each page, where 100 is
> whatever you can be certain will fit on a page on any platform. I'm
> not quite sure if our regress test would pass with a very small block
> size anyway, but probably worth verifying that before you write the
> first test that will break it.

I don't think I've tested with extreme block sizes.  (I've put a
sketch of the kind of test setup you describe in a P.S. below.)

> I'll try to look in a bit more detail during the commitfest.
>
> It's perhaps a minor detail at this stage, but generally, we don't
> have code lines over 80 chars in length. There are some exceptions,
> e.g. not breaking error message strings so that they're easily
> greppable. src/tools/pgindent has a tool that you can run to fix the
> whitespace so it's in line with project standard.

I'll try to get pgindent running before my next patch.

Thanks for the comments!
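
P.S. For the test tables, something like the following is roughly what
I have in mind, going by your suggestion.  It's only a sketch: the
table name and row counts are placeholders, and it leans on casting
ctid through text and point to get at the offset number, which I
believe works but is admittedly a bit of a hack.

    CREATE TABLE tidrangescan_test (id int);
    INSERT INTO tidrangescan_test SELECT generate_series(1, 1000);
    -- Keep at most 100 tuples on each page, so the table's layout
    -- doesn't depend on the platform's block size (assuming at least
    -- 100 tuples always fit on a page).
    DELETE FROM tidrangescan_test WHERE (ctid::text::point)[1] > 100;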