Hi,

On 02/09/2015 18:06, Tomas Vondra wrote:
> Hi
> 
> On 09/02/2015 03:53 PM, Andres Freund wrote:
>> 
>> Hi,
>> 
>> On 2015-07-18 12:17:39 +0200, Julien Rouhaud wrote:
>>> I didn't know that the thread must exist on -hackers to be
>>> able to add a commitfest entry, so I transferred the thread here.
>> 
>> Please, in the future, also update the title of the thread to
>> something fitting.
>> 

Sorry for that.

>>> @@ -539,6 +541,9 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
>>>  {
>>>      BitmapHeapScanState *scanstate;
>>>      Relation    currentRelation;
>>> +#ifdef USE_PREFETCH
>>> +    int         new_io_concurrency;
>>> +#endif
>>>
>>>      /* check for unsupported flags */
>>>      Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
>>> @@ -598,6 +603,25 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
>>>       */
>>>      currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
>>>
>>> +#ifdef USE_PREFETCH
>>> +    /* check if the effective_io_concurrency has been overloaded for the
>>> +     * tablespace storing the relation and compute the target_prefetch_pages,
>>> +     * or just get the current target_prefetch_pages
>>> +     */
>>> +    new_io_concurrency = get_tablespace_io_concurrency(
>>> +                            currentRelation->rd_rel->reltablespace);
>>> +
>>> +
>>> +    scanstate->target_prefetch_pages = target_prefetch_pages;
>>> +
>>> +    if (new_io_concurrency != effective_io_concurrency)
>>> +    {
>>> +        double      prefetch_pages;
>>> +        if (compute_io_concurrency(new_io_concurrency, &prefetch_pages))
>>> +            scanstate->target_prefetch_pages = rint(prefetch_pages);
>>> +    }
>>> +#endif
>> 
>> Maybe it's just me - but imo there should be as few USE_PREFETCH
>> dependent places in the code as possible. It'll just be 0 when
>> not supported, that's fine?
> 
> It's not just you. Dealing with code with plenty of ifdefs is
> annoying - it compiles just fine most of the time, until you
> compile it with some rare configuration. Then it either starts
> producing strange warnings or the compilation fails entirely.
> 
> It might make a tiny difference on builds without prefetching
> support because of code size, but realistically I think it's
> noise.
> 
>> Especially changing the size of externally visible structs
>> depending on a configure-detected ifdef seems wrong to me.
> 
> +100 to that
> 

I totally agree. I'll remove the ifdefs.
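
Something like this is what I have in mind for the ifdef-free version (just
a sketch reusing the names from the submitted patch, nothing final); the
relevant part of ExecInitBitmapHeapScan would become:

    int         new_io_concurrency;

    /* use the per-tablespace setting if it differs from the GUC */
    new_io_concurrency = get_tablespace_io_concurrency(
                            currentRelation->rd_rel->reltablespace);

    scanstate->target_prefetch_pages = target_prefetch_pages;

    if (new_io_concurrency != effective_io_concurrency)
    {
        double      prefetch_pages;

        if (compute_io_concurrency(new_io_concurrency, &prefetch_pages))
            scanstate->target_prefetch_pages = rint(prefetch_pages);
    }

As you say, target_prefetch_pages will simply be 0 on builds without
prefetch support, so the extra assignment is harmless and the struct layout
no longer depends on USE_PREFETCH.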

>> Nitpick: Wrong comment style (/* stands on its own line).

I did run pgindent before submitting the patch, but apparently I picked
the wrong one. Already fixed in my local branch.

>>> +    /*----------
>>> +     * The user-visible GUC parameter is the number of drives (spindles),
>>> +     * which we need to translate to a number-of-pages-to-prefetch target.
>>> +     * The target value is stashed in *extra and then assigned to the actual
>>> +     * variable by assign_effective_io_concurrency.
>>> +     *
>>> +     * The expected number of prefetch pages needed to keep N drives busy is:
>>> +     *
>>> +     * drives |   I/O requests
>>> +     * -------+----------------
>>> +     *      1 |   1
>>> +     *      2 |   2/1 + 2/2 = 3
>>> +     *      3 |   3/1 + 3/2 + 3/3 = 5 1/2
>>> +     *      4 |   4/1 + 4/2 + 4/3 + 4/4 = 8 1/3
>>> +     *      n |   n * H(n)
>> 
>> I know you just moved this code. But: I don't buy this formula.
>> Like at all. Doesn't queuing and reordering entirely invalidate
>> the logic here?
> 
> Well, even the comment right next after the formula says that:
> 
> * Experimental results show that both of these formulas aren't
> * aggressive enough, but we don't really have any better proposals.
> 
> That's the reason why users generally either use 0 or some rather
> high value (16 or 32 are the most common values seen). The problem
> is that we don't really care about the number of spindles (and not
> just because SSDs don't have them at all), but about the target
> queue length per device. Spinning rust uses TCQ/NCQ to optimize the
> head movement, SSDs are parallel by nature (stacking multiple chips
> with separate channels).
> 
> I doubt we can really improve the formula, except maybe for saying
> "we want 16 requests per device" and multiplying the number by
> that. We don't really have the necessary introspection to determine
> better values (and it's not really possible, because the devices
> may be hidden behind a RAID controller or a SAN). So we can't
> really do much.
> 
> Maybe the best thing we can do is just completely abandon the
> "number of spindles" idea, and just say "number of I/O requests to
> prefetch". Possibly with an explanation of how to estimate it
> (devices * queue length).
> 
>> I think that'd be a lot better.

+1 for that too.
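
To make that concrete: if I read the comment right, the conversion is just
the sum of that harmonic series. A simplified sketch of what
compute_io_concurrency boils down to (not the actual function from the
patch, and the name here is made up):

    /*
     * Expected number of requests needed to keep io_concurrency drives
     * busy: n/1 + n/2 + ... + n/n = n * H(n).
     */
    static bool
    compute_io_concurrency_sketch(int io_concurrency, double *target)
    {
        double      pages = 0.0;
        int         i;

        if (io_concurrency < 0)
            return false;

        for (i = 1; i <= io_concurrency; i++)
            pages += (double) io_concurrency / (double) i;

        *target = pages;
        return true;
    }

If we redefine the GUC as "number of I/O requests to prefetch", all of that
collapses to using the value directly, and the docs could suggest estimating
it as devices * queue length (for instance, 4 drives with a queue depth of 8
would suggest something around 32).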

If everyone's ok with this change, I can submit a patch for that too.
Should I split that into two patches, and/or start a new thread?



-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org