Re: "Leaking" disk space

2013-03-22 Thread Eduardo Morras
On Wed, 20 Mar 2013 16:55:34 +
Dan Thomas  wrote:

> > a) Where do you have the wal files?
> 
> pg_xlog is symlinked to /usr/local/pglog/pg_xlog (ie, out of the
> partition mounted as /usr/local/pgsql which is exhibiting this
> behaviour).
> 

As Matthew Seaman says in his answer, this is the problem. See 
http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016702.html and the 
following messages in that thread.

It seems that writes follow the symlink but leave a shadow/ghost file entry 
in the original directory/disk. I see that you don't have trim enabled on the 
postgres fs: tunefs -p /usr/local/pgsql/ shows option -t disabled. Is trim 
enabled on the fs the symlink points to? (Show the output of tunefs -p 
/dev/_don't_know_the_dev_entry_name)
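
A minimal sketch for comparing the trim flag on both filesystems, assuming 
tunefs -p prints a line of the form "trim: (-t) <state>" as in the output 
quoted in this thread. trim_state is a hypothetical helper name, not part of 
any tool.

```sh
# trim_state: read `tunefs -p` output on stdin and print the trim setting.
# Hypothetical helper; it only looks for the "trim: (-t)" line.
trim_state() {
    awk '/trim: \(-t\)/ { print $NF; found = 1 }
         END { if (!found) print "unknown" }'
}
```

df /usr/local/pglog names the device behind the symlink target. Something 
like tunefs -p <that device> 2>&1 | trim_state then reports its flag. The 
2>&1 is there because the "tunefs:" prefix on every line suggests the report 
goes to stderr; it is harmless if it does not.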

> > b) Are you sure that unused/old wal files are erased?
> 
> As above, but yes they seem to be being deleted properly
> 
> > c) Do you have any postgres log level activated (like the ones used
> > for long queries)?
> 
> Yes we have slow query logging enabled. pg_log is symlinked out of
> that partition to /usr/local/pglog/pg_log as well.
> 
> > d) Do your queries use GROUP BY on very big data sets? Those create
> > big temporary data files.
> 
> Yes we do a lot of that! However there are definitely no unlinked
> files, and the problem doesn't go away when pg is shut down. However a
> reboot does fix it.

Those questions were only to make sure this is not a "normal" temp file 
problem.

What does dmesg show about filesystem checks? Is the filesystem marked dirty? 

# WARNING!! Make a backup first!!!
If you stop postgres and run # fsck_ffs -E /dev/mfid1s1d , is the problem 
solved?
# END WARNING!!

Please post the output of fsck_ffs.

If fsck_ffs doesn't solve the problem, check whether a lost+found directory 
exists on /usr/local/pgsql/ and what it contains.


> 
> > e) With question a) and b), do you use streaming replication?
> 
> Yes we do. This problem is not present on the warm standby servers
> that are being streamed to. We have failed over to the warm standbys
> previously (we're currently doing this regularly to work around the
> problem without too much downtime). Once we switch the warm standby to
> primary, it begins leaking space.

It may be keeping old (or all) wal files, but this looks like a bug at the 
filesystem level. Trim support in ufs was added in 9.0 and backported to 8, 
so it may be a candidate to watch.


---   ---
Eduardo Morras 
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"


Re: "Leaking" disk space

2013-03-22 Thread Dan Thomas
> A stab in the dark, but does # sync change anything

Alas, no.

On 21 March 2013 13:21, Bernt Hansson  wrote:
> On 2013-03-21 11:40, Dan Thomas wrote:
>>>
>>> Have you used fstat to identify the big growing file which is taking up
>>> the space, and which process has the file open?
>>
>>
>> It's not an unlinked file. I've tried using fstat and lsof to identify
>> it, and there's no inodes with zero links or that don't have a
>> matching file on disk.
>
>
> A stab in the dark, but does # sync change anything.
>
>
>
>> Dan
>>
>> On 20 March 2013 23:08, Daniel O'Callaghan  wrote:
>>>
>>> On 21/03/2013 3:55 AM, Dan Thomas wrote:


>>>> Stopping Postgres doesn't fix it, but rebooting does which points at
>>>
>>>
>>> Have you used fstat to identify the big growing file which is taking up
>>> the
>>> space, and which process has the file open?
>>> A file which has been unlinked from all directories won't be seen by du,
>>> but
>>> it does not free disk space until no process has it open.
>>>
>>> USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
>>> root     syslogd    47648    8 /       4317027 -rw-r--r--   19776  w
>>> root     syslogd    47648    9 /       4317041 -rw-------      63  w
>>>
>>> That might help to track it down.
>>>
>>> Danny
>>>
>>>


Re: "Leaking" disk space

2013-03-21 Thread Dan Thomas
> Have you used fstat to identify the big growing file which is taking up the 
> space, and which process has the file open?

It's not an unlinked file. I've tried using fstat and lsof to identify
it, and there's no inodes with zero links or that don't have a
matching file on disk.
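
One more cross-check, sketched under assumptions: count the distinct inodes 
actually reachable under the mount point, then compare that number by hand 
with the iused figure from df -i. count_inodes is a made-up helper name, and 
file names containing newlines would throw the count off.

```sh
# count_inodes DIR: print the number of distinct inodes reachable under DIR
# without crossing into other mounted filesystems (hypothetical helper).
count_inodes() {
    find "$1" -xdev -exec ls -di {} + 2>/dev/null |
        awk '{ seen[$1] = 1 }
             END { n = 0; for (i in seen) n++; print n }'
}
```

Usage would be count_inodes /usr/local/pgsql. Hard links share an inode, so 
they are counted once, which is what the df -i figure counts as well.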

Dan

On 20 March 2013 23:08, Daniel O'Callaghan  wrote:
> On 21/03/2013 3:55 AM, Dan Thomas wrote:
>>
>> Stopping Postgres doesn't fix it, but rebooting does which points at
>
> Have you used fstat to identify the big growing file which is taking up the
> space, and which process has the file open?
> A file which has been unlinked from all directories won't be seen by du, but
> it does not free disk space until no process has it open.
>
> USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
> root     syslogd    47648    8 /       4317027 -rw-r--r--   19776  w
> root     syslogd    47648    9 /       4317041 -rw-------      63  w
>
> That might help to track it down.
>
> Danny
>
>


Re: "Leaking" disk space

2013-03-20 Thread Daniel O'Callaghan

On 21/03/2013 3:55 AM, Dan Thomas wrote:

> Stopping Postgres doesn't fix it, but rebooting does which points at

Have you used fstat to identify the big growing file which is taking up 
the space, and which process has the file open?
A file which has been unlinked from all directories won't be seen by du, 
but it does not free disk space until no process has it open.


USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     syslogd    47648    8 /       4317027 -rw-r--r--   19776  w
root     syslogd    47648    9 /       4317041 -rw-------      63  w

That might help to track it down.
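
The effect described above can be reproduced with a small sketch. It holds a 
temporary file open on descriptor 3, removes its only directory entry, and 
shows that the name is gone while writes through the descriptor still 
succeed. unlinked_demo is an illustrative name only.

```sh
# unlinked_demo: show that an unlinked file stays usable (and allocated)
# for as long as some process keeps a descriptor open on it.
unlinked_demo() {
    tmp=$(mktemp) || return 1
    exec 3>"$tmp"                 # hold the file open on descriptor 3
    rm "$tmp"                     # remove its only directory entry
    if [ -e "$tmp" ]; then visible=yes; else visible=no; fi
    echo "still allocated" >&3    # the write goes to the unlinked file
    write_status=$?
    exec 3>&-                     # closing the last descriptor frees the space
    echo "visible=$visible write_status=$write_status"
}

unlinked_demo    # prints: visible=no write_status=0
```

Between the rm and the final exec 3>&- the file is invisible to du but still 
counted by df, which is why fstat or lsof is the tool to find it.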

Danny



Re: "Leaking" disk space

2013-03-20 Thread Dan Thomas
> a) Where do you have the wal files?

pg_xlog is symlinked to /usr/local/pglog/pg_xlog (ie, out of the
partition mounted as /usr/local/pgsql which is exhibiting this
behaviour).

> b) Are you sure that unused/old wal files are erased?

As above, but yes they seem to be being deleted properly

> c) Do you have any postgres log level activated (like the ones used
> for long queries)?

Yes we have slow query logging enabled. pg_log is symlinked out of
that partition to /usr/local/pglog/pg_log as well.

> d) Do your queries use GROUP BY on very big data sets? Those create
> big temporary data files.

Yes we do a lot of that! However there are definitely no unlinked
files, and the problem doesn't go away when pg is shut down. However a
reboot does fix it.

> e) With question a) and b), do you use streaming replication?

Yes we do. This problem is not present on the warm standby servers
that are being streamed to. We have failed over to the warm standbys
previously (we're currently doing this regularly to work around the
problem without too much downtime). Once we switch the warm standby to
primary, it begins leaking space.

On 20 March 2013 16:02, Eduardo Morras  wrote:
> On Wed, 20 Mar 2013 15:23:18 +
> Dan Thomas  wrote:
>
>> Hi Guys,
>>
>> We're seeing a problem with some of our FreeBSD/PostgreSQL servers
>> "leaking" quite significant amounts of disk space:
>>
>> > df -h /usr/local/pgsql/
>> Filesystem       Size    Used   Avail Capacity  Mounted on
>> /dev/mfid1s1d    1.1T    772G    222G    78%    /usr/local/pgsql
>>
>> > du -sh /usr/local/pgsql/
>> 741G    /usr/local/pgsql/
>>
>> Stopping Postgres doesn't fix it, but rebooting does which points at
>> the OS rather than PG to me. However, the leak is only apparent in the
>> dedicated pgsql partition, and only on our database servers, so
>> PostgreSQL seems to at least be involved. The partition itself is a
>> relatively standard UFS partition:
>>
>> > grep /usr/local/pgsql /etc/fstab
>> /dev/mfid1s1d   /usr/local/pgsql   ufs   rw   2   2
>>
>> > tunefs -p /usr/local/pgsql/
>> tunefs: POSIX.1e ACLs: (-a)                                disabled
>> tunefs: NFSv4 ACLs: (-N)                                   disabled
>> tunefs: MAC multilabel: (-l)                               disabled
>> tunefs: soft updates: (-n)                                 enabled
>> tunefs: gjournal: (-J)                                     disabled
>> tunefs: trim: (-t)                                         disabled
>> tunefs: maximum blocks per file in a cylinder group: (-e)  2048
>> tunefs: average file size: (-f)                            16384
>> tunefs: average number of files in a directory: (-s)       64
>> tunefs: minimum percentage of free space: (-m)             8%
>> tunefs: optimization preference: (-o)                      time
>> tunefs: volume label: (-L)
>>
>> LSOF isn't showing any open files:
>>
>> > lsof +L /usr/local/pgsql/ | awk '{ print $8 }' | grep 0 | wc -l
>> 0
>>
>> We're not creating filesystem snapshots:
>>
>> > find /usr/local/pgsql/ -flags snapshot
>> >
>>
>> Not all of our servers are leaking space, it's only the more
>> recently-installed systems. Here's a quick breakdown of versions:
>>
>> FreeBSD   PostgreSQL   Leaking?
>> 8.0       8.4.4        no
>> 8.2       9.0.4        no
>> 8.3       9.1.4        yes
>> 8.3       9.2.3        yes
>> 9.1       9.2.3        yes
>>
>> Any ideas what's going on here, or where we could start debugging?
>
> Some things to check:
>
> a) Where do you have the wal files?
> b) Are you sure that unused/old wal files are erased?
> c) Do you have any postgres log level activated (like the ones used for long 
> queries)?
> d) Do your queries use GROUP BY on very big data sets? Those create big 
> temporary data files.
> e) With question a) and b), do you use streaming replication?
>
>>
>> Thanks,
>>
>> Dan
>
> ---   ---
> Eduardo Morras 


Re: "Leaking" disk space

2013-03-20 Thread Matthew Seaman
On 20/03/2013 15:23, Dan Thomas wrote:
> Hi Guys,
> 
> We're seeing a problem with some of our FreeBSD/PostgreSQL servers
> "leaking" quite significant amounts of disk space:
> 
> > df -h /usr/local/pgsql/
> Filesystem       Size    Used   Avail Capacity  Mounted on
> /dev/mfid1s1d    1.1T    772G    222G    78%    /usr/local/pgsql
> 
> > du -sh /usr/local/pgsql/
> 741G    /usr/local/pgsql/
> 
> Stopping Postgres doesn't fix it, but rebooting does which points at
> the OS rather than PG to me. However, the leak is only apparent in the
> dedicated pgsql partition, and only on our database servers, so
> PostgreSQL seems to at least be involved. The partition itself is a
> relatively standard UFS partition:

Hi, Dan

You're not the first person to report that.  Please see the thread:

"leaking lots of unreferenced inodes (pg_xlog files?), maybe after
moving tables and indexes to tablespace on different volume"

on the freebsd...@freebsd.org mailing list.

Kirk McKusick was investigating the original report: CC'd.

Cheers,

Matthew



Re: "Leaking" disk space

2013-03-20 Thread Eduardo Morras
On Wed, 20 Mar 2013 15:23:18 +
Dan Thomas  wrote:

> Hi Guys,
> 
> We're seeing a problem with some of our FreeBSD/PostgreSQL servers
> "leaking" quite significant amounts of disk space:
> 
> > df -h /usr/local/pgsql/
> Filesystem       Size    Used   Avail Capacity  Mounted on
> /dev/mfid1s1d    1.1T    772G    222G    78%    /usr/local/pgsql
> 
> > du -sh /usr/local/pgsql/
> 741G    /usr/local/pgsql/
> 
> Stopping Postgres doesn't fix it, but rebooting does which points at
> the OS rather than PG to me. However, the leak is only apparent in the
> dedicated pgsql partition, and only on our database servers, so
> PostgreSQL seems to at least be involved. The partition itself is a
> relatively standard UFS partition:
> 
> > grep /usr/local/pgsql /etc/fstab
> /dev/mfid1s1d   /usr/local/pgsql   ufs   rw   2   2
> 
> > tunefs -p /usr/local/pgsql/
> tunefs: POSIX.1e ACLs: (-a)                                disabled
> tunefs: NFSv4 ACLs: (-N)                                   disabled
> tunefs: MAC multilabel: (-l)                               disabled
> tunefs: soft updates: (-n)                                 enabled
> tunefs: gjournal: (-J)                                     disabled
> tunefs: trim: (-t)                                         disabled
> tunefs: maximum blocks per file in a cylinder group: (-e)  2048
> tunefs: average file size: (-f)                            16384
> tunefs: average number of files in a directory: (-s)       64
> tunefs: minimum percentage of free space: (-m)             8%
> tunefs: optimization preference: (-o)                      time
> tunefs: volume label: (-L)
> 
> LSOF isn't showing any open files:
> 
> > lsof +L /usr/local/pgsql/ | awk '{ print $8 }' | grep 0 | wc -l
> 0
> 
> We're not creating filesystem snapshots:
> 
> > find /usr/local/pgsql/ -flags snapshot
> >
> 
> Not all of our servers are leaking space, it's only the more
> recently-installed systems. Here's a quick breakdown of versions:
> 
> FreeBSD   PostgreSQL   Leaking?
> 8.0       8.4.4        no
> 8.2       9.0.4        no
> 8.3       9.1.4        yes
> 8.3       9.2.3        yes
> 9.1       9.2.3        yes
> 
> Any ideas what's going on here, or where we could start debugging?

Some things to check:

a) Where do you have the wal files? 
b) Are you sure that unused/old wal files are erased?
c) Do you have any postgres log level activated (like the ones used for long 
queries)?
d) Do your queries use GROUP BY on very big data sets? Those create big 
temporary data files.
e) With question a) and b), do you use streaming replication?

> 
> Thanks,
> 
> Dan

---   ---
Eduardo Morras 


"Leaking" disk space

2013-03-20 Thread Dan Thomas
Hi Guys,

We're seeing a problem with some of our FreeBSD/PostgreSQL servers
"leaking" quite significant amounts of disk space:

> df -h /usr/local/pgsql/
Filesystem       Size    Used   Avail Capacity  Mounted on
/dev/mfid1s1d    1.1T    772G    222G    78%    /usr/local/pgsql

> du -sh /usr/local/pgsql/
741G    /usr/local/pgsql/
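
With the figures above the gap is 772G - 741G = 31G. A rough sketch for 
tracking that gap over time follows, assuming the directory given is the 
mount point of its filesystem. space_gap is a hypothetical helper name. A 
small positive gap is normal, since df also counts filesystem metadata.

```sh
# space_gap DIR: print how many KiB df reports as used on DIR's filesystem
# beyond what du can account for under DIR (hypothetical helper).
space_gap() {
    dir=$1
    # -P gives the portable one-line-per-filesystem layout; column 3 is "Used".
    used_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $3 }')
    # -x keeps du from descending into other mounted filesystems.
    seen_kb=$(du -skx "$dir" 2>/dev/null | awk '{ print $1 }')
    echo $((used_kb - seen_kb))
}
```

Running space_gap /usr/local/pgsql from cron and logging the result would 
show whether the gap grows steadily or in steps.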

Stopping Postgres doesn't fix it, but rebooting does which points at
the OS rather than PG to me. However, the leak is only apparent in the
dedicated pgsql partition, and only on our database servers, so
PostgreSQL seems to at least be involved. The partition itself is a
relatively standard UFS partition:

> grep /usr/local/pgsql /etc/fstab
/dev/mfid1s1d   /usr/local/pgsql   ufs   rw   2   2

> tunefs -p /usr/local/pgsql/
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  2048
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

LSOF isn't showing any open files:

> lsof +L /usr/local/pgsql/ | awk '{ print $8 }' | grep 0 | wc -l
0
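
Note that grep 0 in the pipeline above also matches link counts such as 10, 
so it can only overcount; a result of 0 still means no unlinked files were 
seen. A stricter variant is sketched here, assuming NLINK is the eighth 
column of lsof +L output as the command above already assumes. 
count_unlinked is a made-up helper name.

```sh
# count_unlinked: read `lsof +L` output on stdin and count the rows whose
# NLINK column (assumed to be field 8) is exactly 0. The header is skipped.
count_unlinked() {
    # String comparison, so rows with fewer than 8 fields are not counted.
    awk 'NR > 1 && $8 == "0" { n++ } END { print n + 0 }'
}
```

Usage would be lsof +L /usr/local/pgsql/ | count_unlinked. Rows where lsof 
leaves an earlier column blank would shift the fields, so treat the number as 
a hint rather than proof.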

We're not creating filesystem snapshots:

> find /usr/local/pgsql/ -flags snapshot
>

Not all of our servers are leaking space, it's only the more
recently-installed systems. Here's a quick breakdown of versions:

FreeBSD   PostgreSQL   Leaking?
8.0       8.4.4        no
8.2       9.0.4        no
8.3       9.1.4        yes
8.3       9.2.3        yes
9.1       9.2.3        yes

Any ideas what's going on here, or where we could start debugging?

Thanks,

Dan