On 09/01/2013 10:37 AM, Amit Kapila wrote:
> On Sat, Aug 24, 2013 at 2:38 AM, Heikki Linnakangas <[email protected]> wrote:
>
> a. In XLogFileInit(),
>
>   /*
> !  * XXX: What should we use as max_segno? We used to use XLOGfileslop when
> !  * that was a constant, but that was always a bit dubious: normally, at a
> !  * checkpoint, XLOGfileslop was the offset from the checkpoint record,
> !  * but here, it was the offset from the insert location. We can't do the
> !  * normal XLOGfileslop calculation here because we don't have access to
> !  * the prior checkpoint's redo location. So somewhat arbitrarily, just
> !  * use CheckPointSegments.
> !  */
> ! max_segno = logsegno + CheckPointSegments;
>   if (!InstallXLogFileSegment(&installed_segno, tmppath,
> !                             *use_existent, max_segno, use_lock))
>
> Earlier, max_advance was the same whether InstallXLogFileSegment() was called from RemoveOldXlogFiles() or from XLogFileInit(), but now the two values will differ (and there seems to be no direct relation between them). Will that be okay in the scenario where someone else has created the file while this function was filling its own, given that the new file then has to be installed as a future segment, and which one is decided based on max_segno?
I haven't really thought hard about the above. As the comment says, passing the same max_advance value here and in RemoveOldXlogFiles() was a bit dubious too, because the reference point was different.
I believe it's quite rare that two processes create a new WAL segment concurrently, so it isn't terribly important what we do here.
> b. Does CreateRestartPoint() need to update CheckPointDistanceEstimate? When it tries to remove old xlog files, it needs recycleSegNo, which is calculated using CheckPointDistanceEstimate.
Yeah, you're right, it should. I haven't tested this with archive recovery or replication at all yet.
> As a developer, I would love to have a configuration knob such as min_recycle_wal_size, but I am not sure how many users will be comfortable setting this value. The few users I had talked to about this earlier are interested in setting a max WAL size, which would allow them to put an upper limit on the space required by WAL. Can't we calculate the number of files to recycle based only on CheckPointDistanceEstimate?
You can always just leave min_recycle_wal_size to the default. It sets a minimum for the number of preallocated segments, which can help if you have spikes that consume a lot of WAL, like nightly batch jobs. But if you don't have such spikes, or the overhead of creating new segments when such a spike happens isn't too large, you don't need to set it.
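Roughly, the setting acts as a floor under whatever the estimate suggests. A minimal sketch, assuming both values are expressed in bytes and the segment size is passed in explicitly; bytes_to_segs and segs_to_keep are hypothetical names, and the actual calculation in the patch may well differ:

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte count up to whole WAL segments. */
uint64_t
bytes_to_segs(uint64_t nbytes, uint64_t seg_size)
{
	assert(seg_size > 0);
	return (nbytes + seg_size - 1) / seg_size;
}

/*
 * Hypothetical: how many segments to keep preallocated/recycled.  Follows
 * the checkpoint distance estimate, but never drops below the configured
 * minimum, so a quiet period does not shrink the pool before a spike.
 */
uint64_t
segs_to_keep(uint64_t distance_estimate, uint64_t min_recycle_bytes,
			 uint64_t seg_size)
{
	uint64_t	wanted = bytes_to_segs(distance_estimate, seg_size);
	uint64_t	floor_segs = bytes_to_segs(min_recycle_bytes, seg_size);

	return (wanted > floor_segs) ? wanted : floor_segs;
}
```

With 16 MB segments and an 80 MB minimum, a quiet system whose estimate has dropped to 32 MB would still keep 5 segments around; once the estimate exceeds the minimum, the minimum has no effect.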
One idea is to try to make the creation of new WAL segments faster. Then it wouldn't hurt so much if you run out of preallocated/recycled segments and need to suddenly create a lot of new ones. Then we might not need a minimum setting at all.
- Heikki
