Re: "Leaking" disk space
On Wed, 20 Mar 2013 16:55:34 + Dan Thomas wrote:

> > a) Where do you have the wal files?
>
> pg_xlog is symlinked to /usr/local/pglog/pg_xlog (ie, out of the
> partition mounted as /usr/local/pgsql which is exhibiting this
> behaviour).

As Matthew Seaman says in another answer, this is the problem. Check http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016702.html and the following messages in that thread. It seems that writing a file follows the symlink but also makes a shadow/ghost file entry in the original directory/disk.

I see that you don't have trim enabled on the postgres fs: tunefs -p /usr/local/pgsql/ shows option t disabled. Is trim enabled on the fs where the symlink points? (Show the output of tunefs -p /dev/_don't_know_the_dev_entry_name)

> > b) Are you sure that unused/old wal files are erased?
>
> As above, but yes they seem to be being deleted properly
>
> > c) Do you have any postgres log level activated (like the ones used
> > for long queries)?
>
> Yes we have slow query logging enabled. pg_log is symlinked out of
> that partition to /usr/local/pglog/pg_log as well.
>
> > d) Do your queries use GROUP BY on very big data sets? Those create
> > big temporary data files.
>
> Yes we do a lot of that! However there are definitely no unlinked
> files, and the problem doesn't go away when pg is shut down. However a
> reboot does fix it.

Those questions were only to make sure this is not a "normal" temp files problem.

What does dmesg show about the filesystem check? Does it report the filesystem as dirty?

# WARNING!! Make a backup first!!!
If you stop postgres and run
# fsck_ffs -E /dev/mfid1s1d
is the problem solved?
# END WARNING!!

Please post the output of fsck_ffs. If fsck_ffs doesn't solve the problem, check whether a lost+found directory exists on /usr/local/pgsql/ and what it contains.

> > e) With question a) and b), do you use streaming replication?
>
> Yes we do. This problem is not present on the warm standby servers
> that are being streamed to. We have failed over to the warm standbys
> previously (we're currently doing this regularly to work around the
> problem without too much downtime). Once we switch the warm standby to
> primary, it begins leaking space.

It may be keeping all the old wal files, but it looks like a bug at the filesystem level. Trim support in ufs was added in 9.0 and backported to 8, and may be a candidate to watch.

---   ---
Eduardo Morras

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
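The trim check asked for above can be scripted. This is only a minimal sketch: the helper name trim_state is made up for illustration and is not part of tunefs, and it only recognises the "tunefs: trim: (-t) ..." line format quoted in this thread. The 2>&1 in the usage comment assumes the report is written to standard error, which should be verified on the release in question.

```sh
#!/bin/sh
# trim_state: read `tunefs -p` output on stdin and print the trim setting
# ("enabled" or "disabled"). Hypothetical helper; it matches only the line
# format quoted in this thread and prints nothing if that line is absent.
trim_state() {
    sed -n 's/^tunefs: trim: (-t) *//p'
}

# Intended use on FreeBSD (not executed here):
#   tunefs -p /usr/local/pgsql/ 2>&1 | trim_state
```

Running the same pipeline against the filesystem that holds /usr/local/pglog would answer the question about the symlink target.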
Re: "Leaking" disk space
> A stab in the dark, but does # sync change anything

Alas, no.

On 21 March 2013 13:21, Bernt Hansson wrote:
> On 2013-03-21 11:40, Dan Thomas wrote:
>>> Have you used fstat to identify the big growing file which is taking up
>>> the space, and which process has the file open?
>>
>> It's not an unlinked file. I've tried using fstat and lsof to identify
>> it, and there's no inodes with zero links or that don't have a
>> matching file on disk.
>
> A stab in the dark, but does # sync change anything.
>
>> Dan
Re: "Leaking" disk space
> Have you used fstat to identify the big growing file which is taking up the
> space, and which process has the file open?

It's not an unlinked file. I've tried using fstat and lsof to identify it, and there's no inodes with zero links or that don't have a matching file on disk.

Dan

On 20 March 2013 23:08, Daniel O'Callaghan wrote:
> On 21/03/2013 3:55 AM, Dan Thomas wrote:
>> Stopping Postgres doesn't fix it, but rebooting does which points at
>
> Have you used fstat to identify the big growing file which is taking up the
> space, and which process has the file open?
> A file which has been unlinked from all directories won't be seen by du, but
> it does not free disk space until no process has it open.
>
> USER CMD PID FD MOUNT INUM MODE SZ|DV R/W
> root syslogd476488 /4317027 -rw-r--r-- 19776 w
> root syslogd476489 /4317041 -rw--- 63 w
>
> That might help to track it down.
>
> Danny
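As a cross-check on the lsof result, here is a stricter version of the NLINK filter, offered as a sketch only. The function name count_unlinked is made up for illustration. It assumes `lsof +L` prints NLINK as the eighth whitespace-separated column, as the pipeline in the original post does; rows with an empty column would shift the fields, so treat the count as a hint. The original `grep 0` also matches link counts such as 10, which can only overcount, so the reported result of 0 still stands.

```sh
#!/bin/sh
# count_unlinked: read `lsof +L` output on stdin and count the rows whose
# NLINK column (assumed to be field 8) is exactly "0".
# Hypothetical helper, not part of lsof.
count_unlinked() {
    awk 'NR > 1 && NF >= 8 && $8 == "0" { n++ } END { print n + 0 }'
}

# Intended use (not executed here):
#   lsof +L /usr/local/pgsql/ | count_unlinked
```

The comparison is done on the string "0" so that header text, short rows, and values like 10 or 20 are not counted.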
Re: "Leaking" disk space
On 21/03/2013 3:55 AM, Dan Thomas wrote:
> Stopping Postgres doesn't fix it, but rebooting does which points at

Have you used fstat to identify the big growing file which is taking up the space, and which process has the file open?

A file which has been unlinked from all directories won't be seen by du, but it does not free disk space until no process has it open.

USER CMD PID FD MOUNT INUM MODE SZ|DV R/W
root syslogd476488 /4317027 -rw-r--r-- 19776 w
root syslogd476489 /4317041 -rw--- 63 w

That might help to track it down.

Danny
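The behaviour described above (an unlinked file that du cannot see but that still holds its blocks) can be reproduced in a scratch directory. This is a minimal sketch, with the function name unlinked_demo made up for illustration; it touches only a temporary directory created by mktemp.

```sh
#!/bin/sh
# unlinked_demo: show that an unlinked file stays readable, and therefore
# allocated, for as long as some process holds it open.
unlinked_demo() {
    dir=$(mktemp -d) || return 1
    # Create a 1 MiB file, then hold it open on descriptor 3.
    dd if=/dev/zero of="$dir/big" bs=1024 count=1024 2>/dev/null
    exec 3<"$dir/big"
    rm "$dir/big"
    # The directory entry is gone, so ls (and du) no longer see the file...
    remaining=$(ls -A "$dir" | wc -l | tr -d ' ')
    # ...but its data is still readable through the open descriptor.
    bytes=$(wc -c <&3 | tr -d ' ')
    exec 3<&-   # closing the last descriptor is what releases the blocks
    rmdir "$dir"
    echo "entries=$remaining bytes=$bytes"
}

unlinked_demo
```

This prints entries=0 bytes=1048576: the directory is empty, yet the full megabyte is still there until the descriptor is closed. It is the case fstat is meant to catch, and the one Dan has ruled out here.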
Re: "Leaking" disk space
On 20 March 2013 16:02, Eduardo Morras wrote:
> a) Where do you have the wal files?

pg_xlog is symlinked to /usr/local/pglog/pg_xlog (ie, out of the partition mounted as /usr/local/pgsql which is exhibiting this behaviour).

> b) Are you sure that unused/old wal files are erased?

As above, but yes they seem to be being deleted properly

> c) Do you have any postgres log level activated (like the ones used for long
> queries)?

Yes we have slow query logging enabled. pg_log is symlinked out of that partition to /usr/local/pglog/pg_log as well.

> d) Do your queries use GROUP BY on very big data sets? Those create big
> temporary data files.

Yes we do a lot of that! However there are definitely no unlinked files, and the problem doesn't go away when pg is shut down. However a reboot does fix it.

> e) With question a) and b), do you use streaming replication?

Yes we do. This problem is not present on the warm standby servers that are being streamed to. We have failed over to the warm standbys previously (we're currently doing this regularly to work around the problem without too much downtime). Once we switch the warm standby to primary, it begins leaking space.
Re: "Leaking" disk space
On 20/03/2013 15:23, Dan Thomas wrote:
> Hi Guys,
>
> We're seeing a problem with some of our FreeBSD/PostgreSQL servers
> "leaking" quite significant amounts of disk space:
>
> > df -h /usr/local/pgsql/
> Filesystem       Size    Used   Avail Capacity  Mounted on
> /dev/mfid1s1d    1.1T    772G    222G    78%    /usr/local/pgsql
>
> > du -sh /usr/local/pgsql/
> 741G    /usr/local/pgsql/
>
> Stopping Postgres doesn't fix it, but rebooting does which points at
> the OS rather than PG to me. However, the leak is only apparent in the
> dedicated pgsql partition, and only on our database servers, so
> PostgreSQL seems to at least be involved. The partition itself is a
> relatively standard UFS partition:

Hi, Dan

You're not the first person to report that. Please see the thread:

"leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume"

on the freebsd...@freebsd.org mailing list. Kirk McKusick was investigating the original report: CC'd.

Cheers,

Matthew
Re: "Leaking" disk space
On Wed, 20 Mar 2013 15:23:18 + Dan Thomas wrote:

> Hi Guys,
>
> We're seeing a problem with some of our FreeBSD/PostgreSQL servers
> "leaking" quite significant amounts of disk space:
>
> > df -h /usr/local/pgsql/
> Filesystem       Size    Used   Avail Capacity  Mounted on
> /dev/mfid1s1d    1.1T    772G    222G    78%    /usr/local/pgsql
>
> > du -sh /usr/local/pgsql/
> 741G    /usr/local/pgsql/
>
> Stopping Postgres doesn't fix it, but rebooting does which points at
> the OS rather than PG to me. However, the leak is only apparent in the
> dedicated pgsql partition, and only on our database servers, so
> PostgreSQL seems to at least be involved. The partition itself is a
> relatively standard UFS partition:
>
> > grep /usr/local/pgsql /etc/fstab
> /dev/mfid1s1d    /usr/local/pgsql    ufs    rw    2    2
>
> > tunefs -p /usr/local/pgsql/
> tunefs: POSIX.1e ACLs: (-a) disabled
> tunefs: NFSv4 ACLs: (-N) disabled
> tunefs: MAC multilabel: (-l) disabled
> tunefs: soft updates: (-n) enabled
> tunefs: gjournal: (-J) disabled
> tunefs: trim: (-t) disabled
> tunefs: maximum blocks per file in a cylinder group: (-e) 2048
> tunefs: average file size: (-f) 16384
> tunefs: average number of files in a directory: (-s) 64
> tunefs: minimum percentage of free space: (-m) 8%
> tunefs: optimization preference: (-o) time
> tunefs: volume label: (-L)
>
> LSOF isn't showing any open files:
>
> > lsof +L /usr/local/pgsql/ | awk '{ print $8 }' | grep 0 | wc -l
> 0
>
> We're not creating filesystem snapshots:
>
> > find /usr/local/pgsql/ -flags snapshot
> >
>
> Not all of our servers are leaking space, it's only the more
> recently-installed systems. Here's a quick breakdown of versions:
>
> FreeBSD   PostgreSQL   Leaking?
> 8.0       8.4.4        no
> 8.2       9.0.4        no
> 8.3       9.1.4        yes
> 8.3       9.2.3        yes
> 9.1       9.2.3        yes
>
> Any ideas what's going on here, or where we could start debugging?

Some things to check:

a) Where do you have the wal files?
b) Are you sure that unused/old wal files are erased?
c) Do you have any postgres log level activated (like the ones used for long queries)?
d) Do your queries use GROUP BY on very big data sets? Those create big temporary data files.
e) With question a) and b), do you use streaming replication?

> Thanks,
>
> Dan

---   ---
Eduardo Morras
"Leaking" disk space
Hi Guys,

We're seeing a problem with some of our FreeBSD/PostgreSQL servers "leaking" quite significant amounts of disk space:

> df -h /usr/local/pgsql/
Filesystem       Size    Used   Avail Capacity  Mounted on
/dev/mfid1s1d    1.1T    772G    222G    78%    /usr/local/pgsql

> du -sh /usr/local/pgsql/
741G    /usr/local/pgsql/

Stopping Postgres doesn't fix it, but rebooting does which points at the OS rather than PG to me. However, the leak is only apparent in the dedicated pgsql partition, and only on our database servers, so PostgreSQL seems to at least be involved. The partition itself is a relatively standard UFS partition:

> grep /usr/local/pgsql /etc/fstab
/dev/mfid1s1d    /usr/local/pgsql    ufs    rw    2    2

> tunefs -p /usr/local/pgsql/
tunefs: POSIX.1e ACLs: (-a) disabled
tunefs: NFSv4 ACLs: (-N) disabled
tunefs: MAC multilabel: (-l) disabled
tunefs: soft updates: (-n) enabled
tunefs: gjournal: (-J) disabled
tunefs: trim: (-t) disabled
tunefs: maximum blocks per file in a cylinder group: (-e) 2048
tunefs: average file size: (-f) 16384
tunefs: average number of files in a directory: (-s) 64
tunefs: minimum percentage of free space: (-m) 8%
tunefs: optimization preference: (-o) time
tunefs: volume label: (-L)

LSOF isn't showing any open files:

> lsof +L /usr/local/pgsql/ | awk '{ print $8 }' | grep 0 | wc -l
0

We're not creating filesystem snapshots:

> find /usr/local/pgsql/ -flags snapshot
>

Not all of our servers are leaking space, it's only the more recently-installed systems. Here's a quick breakdown of versions:

FreeBSD   PostgreSQL   Leaking?
8.0       8.4.4        no
8.2       9.0.4        no
8.3       9.1.4        yes
8.3       9.2.3        yes
9.1       9.2.3        yes

Any ideas what's going on here, or where we could start debugging?

Thanks,

Dan
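To track the size of the leak over time, the df and du figures above can be compared in a script. This is a rough sketch under stated assumptions: the function names gap and fs_gap_kb are made up for illustration, the "Used" figure is taken from the third column of `df -kP`, and `du -x` is assumed to be available to keep du on one filesystem. Some difference is normal (filesystem metadata, files du cannot read), so it is the growth of the number that matters, not its absolute value.

```sh
#!/bin/sh
# gap USED SEEN: space that df counts as used but du cannot account for.
# Both arguments must be in the same unit.
gap() {
    echo $(( $1 - $2 ))
}

# fs_gap_kb MOUNTPOINT: measure that gap, in kilobytes, on a live filesystem.
fs_gap_kb() {
    used=$(df -kP "$1" | awk 'NR == 2 { print $3 }')
    seen=$(du -skx "$1" 2>/dev/null | awk '{ print $1 }')
    gap "$used" "$seen"
}

# With the figures from the post (772G used according to df, 741G found by du):
gap 772 741
```

For the figures in the post this prints 31, i.e. about 31G unaccounted for. On an affected server, `fs_gap_kb /usr/local/pgsql` run from cron would show how quickly the gap grows after a reboot or failover.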