Hi,
Since the last patch version I've done a number of experiments with this
throttling idea, so let me share some of the ideas and results, and see
where that gets us.
The patch versions so far tied everything to syncrep - commit latency
with a sync replica was the original motivation, so this
Hi,
On 2023-11-08 19:29:38 +0100, Tomas Vondra wrote:
> >>> I haven't checked, but I'd assume that 100 bytes back and forth should
> >>> easily fit a new message to update LSNs and the existing feedback
> >>> response. Even just the difference between sending 100 bytes and sending
On 11/8/23 18:11, Andres Freund wrote:
> Hi,
>
> On 2023-11-08 13:59:55 +0100, Tomas Vondra wrote:
>>> I used netperf's tcp_rr between my workstation and my laptop on a local
>>> 10Gbit network (albeit with a crappy external card for my laptop), to
>>> put some numbers to this. I used -r
Hi,
On 2023-11-08 13:59:55 +0100, Tomas Vondra wrote:
> > I used netperf's tcp_rr between my workstation and my laptop on a local
> > 10Gbit network (albeit with a crappy external card for my laptop), to
> > put some numbers to this. I used -r $s,100 to test sending
> > variable-sized data to
On 11/8/23 07:40, Andres Freund wrote:
> Hi,
>
> On 2023-11-04 20:00:46 +0100, Tomas Vondra wrote:
>> scope
>> -
>> Now, let's talk about scope - what the patch does not aim to do. The
>> patch is explicitly intended for syncrep clusters, not async. There have
>> been proposals to also
Hi,
On 2023-11-04 20:00:46 +0100, Tomas Vondra wrote:
> scope
> -
> Now, let's talk about scope - what the patch does not aim to do. The
> patch is explicitly intended for syncrep clusters, not async. There have
> been proposals to also support throttling for async replicas, logical
>
Hi,
I keep getting occasional complaints about the impact of large/bulk
transactions on latency of small OLTP transactions, so I'd like to
revive this thread a bit and move it forward.
Attached is a rebased v3, followed by 0002 patch with some review
comments, missing comments and minor tweaks.
On Thu, Feb 2, 2023 at 11:03 AM Tomas Vondra
wrote:
> > I agree that some other concurrent backend's
> > COMMIT could fsync it, but I was wondering if that's sensible
> > optimization to perform (so that issue_fsync() would be called for
> > only commit/rollback records). I can imagine a
On 2/1/23 14:40, Jakub Wartak wrote:
> On Wed, Feb 1, 2023 at 2:14 PM Tomas Vondra
> wrote:
>
>>> Maybe we should avoid calling fsyncs for WAL throttling? (by teaching
>>> HandleXLogDelayPending()->XLogFlush()->XLogWrite() NOT to sync when
>>> we are flushing just because of WAL throttling?)
On Wed, Feb 1, 2023 at 2:14 PM Tomas Vondra
wrote:
> > Maybe we should avoid calling fsyncs for WAL throttling? (by teaching
> > HandleXLogDelayPending()->XLogFlush()->XLogWrite() NOT to sync when
> > we are flushing just because of WAL throttling?) Would that still be
> > safe?
>
> It's not
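To make the idea being debated here concrete, here is a tiny model of the
write/sync split, in Python rather than the actual XLogWrite() code; the
flush-reason flag and all names are hypothetical, not PostgreSQL's API:

```python
# Sketch of the idea discussed above: when a backend flushes WAL only
# because of throttling, write the buffers out but skip the fsync; a
# later commit still has to fsync to make those records durable.
# All names here are hypothetical, not the actual PostgreSQL API.

FLUSH_FOR_COMMIT = "commit"
FLUSH_FOR_THROTTLE = "throttle"

class WalSketch:
    def __init__(self):
        self.written_upto = 0   # bytes handed to the OS (pwrite)
        self.synced_upto = 0    # bytes durably fsynced

    def flush(self, upto, reason):
        if upto > self.written_upto:
            self.written_upto = upto          # write WAL buffers
        if reason == FLUSH_FOR_COMMIT and upto > self.synced_upto:
            self.synced_upto = upto           # issue_xlog_fsync()

wal = WalSketch()
wal.flush(1000, FLUSH_FOR_THROTTLE)          # throttled backend: write, no sync
print(wal.written_upto, wal.synced_upto)     # 1000 0
wal.flush(1000, FLUSH_FOR_COMMIT)            # committing backend: must sync
print(wal.written_upto, wal.synced_upto)     # 1000 1000
```

The open question in the thread is exactly whether the throttle-only path
(write without sync) is safe; the sketch only shows the proposed split.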
On 2/1/23 11:04, Jakub Wartak wrote:
> On Mon, Jan 30, 2023 at 9:16 AM Bharath Rupireddy
> wrote:
>
> Hi Bharath, thanks for reviewing.
>
>> I think measuring the number of WAL flushes with and without this
>> feature that the postgres generates is great to know this feature
>> effects on
On Mon, Jan 30, 2023 at 9:16 AM Bharath Rupireddy
wrote:
Hi Bharath, thanks for reviewing.
> I think measuring the number of WAL flushes postgres generates with and
> without this feature is a great way to see its effect on
> IOPS. Probably it's even better with variations in
>
On Sat, Jan 28, 2023 at 6:06 AM Tomas Vondra
wrote:
>
> >
> > That's not the sole goal, from my end: I'd like to avoid writing out +
> > flushing the WAL in too small chunks. Imagine a few concurrent vacuums or
> > COPYs or such - if we're unlucky they'd each end up exceeding their
> >
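The concern above - several concurrent vacuums or COPYs each flushing tiny
chunks - can be modeled very simply: coalescing the pending requests into a
single flush up to the highest requested LSN. This is only an illustration
of the coalescing idea, with made-up values:

```python
# Sketch of the concern above: several concurrent writers each wanting a
# small flush can be coalesced into one write/fsync up to the highest
# requested LSN, instead of many tiny flushes. Purely illustrative.

def coalesce_flush_requests(requests):
    # One flush to max(requests) satisfies every waiting backend at once.
    return max(requests) if requests else 0

pending = [8192, 24576, 16384]           # LSNs three backends want flushed
print(coalesce_flush_requests(pending))  # 24576: a single flush suffices
```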
On 1/27/23 22:19, Andres Freund wrote:
> Hi,
>
> On 2023-01-27 12:06:49 +0100, Jakub Wartak wrote:
>> On Thu, Jan 26, 2023 at 4:49 PM Andres Freund wrote:
>>
>>> Huh? Why did you remove the GUC?
>>
>> After reading previous threads, my optimism level of getting it ever
>> in shape of being
On 1/27/23 22:33, Andres Freund wrote:
> Hi,
>
> On 2023-01-27 21:45:16 +0100, Tomas Vondra wrote:
>> On 1/27/23 08:18, Bharath Rupireddy wrote:
>>>> I think my idea of only forcing to flush/wait an LSN some distance in
>>>> the past would automatically achieve that?
>>>
>>> I'm sorry, I
Hi,
On 2023-01-27 21:45:16 +0100, Tomas Vondra wrote:
> On 1/27/23 08:18, Bharath Rupireddy wrote:
> >> I think my idea of only forcing to flush/wait an LSN some distance in
> >> the past would automatically achieve that?
> >
> > I'm sorry, I couldn't get your point, can you please explain
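A rough model of the "flush/wait an LSN some distance in the past" idea may
help here; the distance value and function names below are made up, not from
the patch:

```python
# Rough model of waiting for an LSN some distance in the past: instead of
# waiting for its own just-inserted LSN, a throttled backend waits only
# for (current insert LSN - keep_distance), so it never blocks on WAL
# generated very recently. Names and numbers are hypothetical.

KEEP_DISTANCE = 1024 * 1024   # made-up 1MB of slack

def throttle_wait_target(insert_lsn, keep_distance=KEEP_DISTANCE):
    # Nothing to wait for while still within the slack window.
    if insert_lsn <= keep_distance:
        return 0
    return insert_lsn - keep_distance

print(throttle_wait_target(512 * 1024))        # 0: within slack, no wait
print(throttle_wait_target(10 * 1024 * 1024))  # wait for insert_lsn - 1MB
```

The point being debated is that this naturally batches flushes: many small
inserts inside the slack window never trigger a wait at all.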
Hi,
On 2023-01-27 12:48:43 +0530, Bharath Rupireddy wrote:
> Looking at the patch, the feature, in its current shape, focuses on
> improving replication lag (by throttling WAL on the primary) only when
> synchronous replication is enabled. Why is that? Why can't we design
> it for replication in
Hi,
On 2023-01-27 12:06:49 +0100, Jakub Wartak wrote:
> On Thu, Jan 26, 2023 at 4:49 PM Andres Freund wrote:
>
> > Huh? Why did you remove the GUC?
>
> After reading previous threads, my optimism level of getting it ever
> in shape of being widely accepted degraded significantly (mainly due
>
On 1/27/23 08:18, Bharath Rupireddy wrote:
> On Thu, Jan 26, 2023 at 9:21 PM Andres Freund wrote:
>>
>>> 7. I think we need to not let backends throttle too frequently even
>>> though they have crossed wal_throttle_threshold bytes. The best way is
>>> to rely on replication lag, after all the
Hi Bharath,
On Fri, Jan 27, 2023 at 12:04 PM Bharath Rupireddy
wrote:
>
> On Fri, Jan 27, 2023 at 2:03 PM Alvaro Herrera
> wrote:
> >
> > On 2023-Jan-27, Bharath Rupireddy wrote:
> >
> > > Looking at the patch, the feature, in its current shape, focuses on
> > > improving replication lag (by
Hi,
v2 is attached.
On Thu, Jan 26, 2023 at 4:49 PM Andres Freund wrote:
> Huh? Why did you remove the GUC?
After reading previous threads, my optimism about ever getting it into
a widely accepted shape dropped significantly (mainly due to the
discussion of the wider category of 'WAL I/O
On Fri, Jan 27, 2023 at 2:03 PM Alvaro Herrera wrote:
>
> On 2023-Jan-27, Bharath Rupireddy wrote:
>
> > Looking at the patch, the feature, in its current shape, focuses on
> > improving replication lag (by throttling WAL on the primary) only when
> > synchronous replication is enabled. Why is
On 2023-Jan-27, Bharath Rupireddy wrote:
> Looking at the patch, the feature, in its current shape, focuses on
> improving replication lag (by throttling WAL on the primary) only when
> synchronous replication is enabled. Why is that? Why can't we design
> it for replication in general (async,
On Thu, Jan 26, 2023 at 9:21 PM Andres Freund wrote:
>
> > 7. I think we need to not let backends throttle too frequently even
> > though they have crossed wal_throttle_threshold bytes. The best way is
> > to rely on replication lag, after all the goal of this feature is to
> > keep replication
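Point 7 above - don't throttle on every threshold crossing, but only when
the standby is actually behind - can be sketched as a simple predicate.
The GUC default and the lag bound below are invented for illustration:

```python
# Sketch of point 7 above: a backend counts the WAL bytes it produced
# and, once past the threshold, throttles only if replication is
# actually lagging; otherwise it just keeps going. The threshold and
# lag-bound values are made up, as is the name of the lag input.

WAL_THROTTLE_THRESHOLD = 256 * 1024   # bytes, hypothetical default
MAX_ACCEPTABLE_LAG = 1024 * 1024      # bytes of replay lag, hypothetical

def should_throttle(bytes_since_last_throttle, replication_lag):
    return (bytes_since_last_throttle >= WAL_THROTTLE_THRESHOLD
            and replication_lag > MAX_ACCEPTABLE_LAG)

print(should_throttle(300 * 1024, 0))             # False: standby caught up
print(should_throttle(300 * 1024, 2 * 1024**2))   # True: over threshold and lagging
```

In a real implementation the lag input would come from the walsender
feedback (e.g. the flush/replay positions), not a passed-in number.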
On 1/26/23 16:40, Andres Freund wrote:
> Hi,
>
> On 2023-01-26 12:08:16 +0100, Tomas Vondra wrote:
>> It's not clear to me how it could cause deadlocks, as we're not waiting
>> for a lock/resource locked by someone else, but it's certainly an issue
>> for uninterruptible hangs.
>
> Maybe not.
Hi,
On 2023-01-26 13:33:27 +0530, Bharath Rupireddy wrote:
> 6. Backends can ignore throttling for WAL records marked as unimportant, no?
Why would that be a good idea? Not that it matters today, but those records
still need to be flushed in case of a commit by another transaction.
> 7. I
Hi,
On 2023-01-26 14:40:56 +0100, Jakub Wartak wrote:
> In summary: Attached is a slightly reworked version of this patch.
> 1. Moved logic outside XLogInsertRecord() under ProcessInterrupts()
> 2. Flushes up to the last page boundary, still uses SyncRepWaitForLSN()
> 3. Removed GUC for now
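Item 2 above ("flushes up to the last page boundary") boils down to rounding
the target LSN down to the start of its WAL block, so a throttled backend
never forces a partial-page write. A minimal sketch of that rounding, using
PostgreSQL's default WAL block size:

```python
# Flushing "up to the last page boundary": round the target LSN down to
# the start of its WAL block. XLOG_BLCKSZ is PostgreSQL's WAL block size
# (8192 bytes by default); the function name is ours, not the patch's.

XLOG_BLCKSZ = 8192

def last_page_boundary(lsn):
    return lsn - (lsn % XLOG_BLCKSZ)

print(last_page_boundary(20000))   # 16384, i.e. 2 * 8192
print(last_page_boundary(16384))   # 16384, already on a boundary
```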
Hi,
On 2023-01-26 12:08:16 +0100, Tomas Vondra wrote:
> It's not clear to me how it could cause deadlocks, as we're not waiting
> for a lock/resource locked by someone else, but it's certainly an issue
> for uninterruptible hangs.
Maybe not. But I wouldn't want to bet on it. It's a violation of
> On 1/25/23 20:05, Andres Freund wrote:
> > Hi,
> >
> > Such a feature could be useful - but I don't think the current place of
> > throttling has any hope of working reliably:
[..]
> > You're blocking in the middle of an XLOG insertion.
[..]
> Yeah, I agree the sleep would have to happen
On 1/25/23 20:05, Andres Freund wrote:
> Hi,
>
> On 2023-01-25 14:32:51 +0100, Jakub Wartak wrote:
>> In other words it allows slow down of any backend activity. Any feedback on
>> such a feature is welcome, including better GUC name proposals ;) and
>> conditions in which such feature should
On Thu, Jan 26, 2023 at 12:35 AM Andres Freund wrote:
>
> Hi,
>
> On 2023-01-25 14:32:51 +0100, Jakub Wartak wrote:
> > In other words it allows slow down of any backend activity. Any feedback on
> > such a feature is welcome, including better GUC name proposals ;) and
> > conditions in which
Hi,
On 2023-01-25 14:32:51 +0100, Jakub Wartak wrote:
> In other words, it allows slowing down any backend activity. Any feedback
> on such a feature is welcome, including better GUC name proposals ;) and
> conditions in which such a feature should be disabled even if it would be
> enabled globally
Hi,
attached is a proposal idea by Tomas (in CC) for protecting and
prioritizing OLTP latency on syncrep over sessions that generate heavy
WAL traffic. This is the result of internal testing and research into
syncrep behavior by Tomas, Alvaro and me. The main objective
of this