Re: [HACKERS] Slow I/O can break throttling of base backup
On Fri, Dec 16, 2016 at 11:24 AM, Antonin Houska wrote: > Antonin Houska wrote: > > > It seems to be my bug. I'll check tomorrow. > > I could reproduce the problem by adding sufficient sleep time to the > loop. > > > Magnus Hagander wrote: > >> I wonder if the else if (sleep > 0) at the bottom of throttle() should > just > >> be a simple else and always run, resetting last_throttle? > > I agree. In fact, I could simplify the code even more. > > Since (elapsed + sleep) almost equals to GetCurrentIntegerTimestamp(), we > can > use the following statement unconditionally (I think I tried too hard to > avoid > calling GetCurrentIntegerTimestamp too often in the original patch): > > throttled_last = GetCurrentIntegerTimestamp(); > > Thus we can also get rid of the "else" branch that clears both "sleep" and > "wait_result". > > (The patch contains minor comment refinement that I found useful when > seeing > the code after years.) > > Thanks! Applied and backpatched. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Slow I/O can break throttling of base backup
Antonin Houska wrote: > It seems to be my bug. I'll check tomorrow. I could reproduce the problem by adding sufficient sleep time to the loop. > Magnus Hagander wrote: >> I wonder if the else if (sleep > 0) at the bottom of throttle() should just >> be a simple else and always run, resetting last_throttle? I agree. In fact, I could simplify the code even more. Since (elapsed + sleep) almost equals to GetCurrentIntegerTimestamp(), we can use the following statement unconditionally (I think I tried too hard to avoid calling GetCurrentIntegerTimestamp too often in the original patch): throttled_last = GetCurrentIntegerTimestamp(); Thus we can also get rid of the "else" branch that clears both "sleep" and "wait_result". (The patch contains minor comment refinement that I found useful when seeing the code after years.) -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c new file mode 100644 index ffc7e58..40b3c11 *** a/src/backend/replication/basebackup.c --- b/src/backend/replication/basebackup.c *** throttle(size_t increment) *** 1370,1395 if (wait_result & WL_LATCH_SET) CHECK_FOR_INTERRUPTS(); } - else - { - /* - * The actual transfer rate is below the limit. A negative value - * would distort the adjustment of throttled_last. - */ - wait_result = 0; - sleep = 0; - } /* ! * Only a whole multiple of throttling_sample was processed. The rest will ! * be done during the next call of this function. */ throttling_counter %= throttling_sample; ! /* Once the (possible) sleep has ended, new period starts. */ ! if (wait_result & WL_TIMEOUT) ! throttled_last += elapsed + sleep; ! else if (sleep > 0) ! /* Sleep was necessary but might have been interrupted. */ ! throttled_last = GetCurrentIntegerTimestamp(); } --- 1370,1385 if (wait_result & WL_LATCH_SET) CHECK_FOR_INTERRUPTS(); } /* ! * As we work with integers, only whole multiple of throttling_sample was ! * processed. The rest will be done during the next call of this function. */ throttling_counter %= throttling_sample; ! /* ! * Time interval for the remaining amount and possible next increments ! * starts now. ! */ ! throttled_last = GetCurrentIntegerTimestamp(); } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slow I/O can break throttling of base backup
It seems to be my bug. I'll check tomorrow. Magnus Hagander wrote: > Running pg_basebackup with a throttling of say 10M runs it into the risk of > the I/O on the server actually being slower than pg_basebackup (I have > preproduced similar issues on fake-slow disks with lower rate limits). > > What happens in this case in basebackup.c is that the value for "sleep" > comes out negative. This means we don't sleep, which is fine. > > However, that also means we don't set throttle_last. > > That means that the next time we come around to throttle(), the value for > "elapsed" is even bigger, which results in an even bigger negative number, > and we're "stuck". > > AFAICT this means that a temporary I/O spike that makes reading of the disk > too slow can leave us in a situation where we never recover, and thus never > end up sleeping ever again, effectively turning off rate limiting. > > I wonder if the else if (sleep > 0) at the bottom of throttle() should just > be a simple else and always run, resetting last_throttle? > > -- > Magnus Hagander > Me: http://www.hagander.net/ > Work: http://www.redpill-linpro.com/ > -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Slow I/O can break throttling of base backup
Running pg_basebackup with a throttling of say 10M runs it into the risk of the I/O on the server actually being slower than pg_basebackup (I have preproduced similar issues on fake-slow disks with lower rate limits). What happens in this case in basebackup.c is that the value for "sleep" comes out negative. This means we don't sleep, which is fine. However, that also means we don't set throttle_last. That means that the next time we come around to throttle(), the value for "elapsed" is even bigger, which results in an even bigger negative number, and we're "stuck". AFAICT this means that a temporary I/O spike that makes reading of the disk too slow can leave us in a situation where we never recover, and thus never end up sleeping ever again, effectively turning off rate limiting. I wonder if the else if (sleep > 0) at the bottom of throttle() should just be a simple else and always run, resetting last_throttle? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/