On 14.12.2011 22:22, Jeremy Chadwick wrote:
On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
Hi Jeremy,

This is not a hardware problem; I've already checked that. I also ran
fsck today and got no errors.

After some more exploration of how mongodb works, I found that when the
listing hangs, one of the mongodb threads is in the "biowr" state for a
long time. According to the ktrace output, it periodically calls
msync(MS_SYNC).

If I remove the msync() calls from mongodb, how often will the data be
synced by the OS?

--
Andrey Zonov

On 14.12.2011 2:15, Jeremy Chadwick wrote:
On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:

Do you have any idea what is going on, or how to catch the problem?

Assuming this isn't a file on the root filesystem, try booting the
machine in single-user mode and using "fsck -f" on the filesystem in
question.

Can you verify there are no problems with the disk this file lives on as
well (smartctl -a /dev/disk)?  I doubt this is the problem, but
thought I'd mention it.

I have no real answer, I'm sorry.  The msync(2) man page indicates the
call is effectively deprecated (see BUGS).  It looks like it is
effectively an mmap version of fsync(2).

I replaced msync(2) with fsync(2). Unfortunately, it is not obvious from the man pages that I can do this. Anyway, thanks.
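
For readers following along, here is a rough sketch of what such a replacement might look like. It is not mongodb's code: the helper name fill_and_fsync and its parameters are made up for illustration. It also assumes that fsync(2) on the descriptor writes out pages dirtied through a MAP_SHARED mapping of the same file, which is platform behaviour rather than something the man pages promise.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from mongodb): fill the first `len` bytes of
 * `path` with `byte` through a shared mapping, then flush with fsync(2)
 * on the descriptor instead of msync(2) on the mapping.
 * Returns 0 on success, -1 on error. */
int fill_and_fsync(const char *path, size_t len, unsigned char byte)
{
    if (len == 0)
        return -1;                      /* mmap(2) rejects a zero length */

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Make sure the file is large enough to back the whole mapping. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    unsigned char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memset(map, byte, len);             /* dirty the mapped pages */

    /* Assumption: fsync on the fd also covers the dirty mapped pages. */
    int rc = fsync(fd);

    munmap(map, len);
    close(fd);
    return rc;
}
```

A call such as fill_and_fsync("data.0", 4096, 0) blocks until fsync(2) returns, so it can stall in the same way as the msync(MS_SYNC) call it replaces.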


I'm extremely confused by this problem.  What you're describing above is
that the process is "stuck in biowr state for a long time", but what you
stated originally was that the process was "stuck in ufs state for a
few minutes":

Listing the directory that holds the mongodb files with ls(1) gets stuck in the "ufs" state while one of mongodb's threads is in the "biowr" state. It looks like the system holds a global lock on the file that is being msync(2)-ed, so the lstat(2) call cannot return immediately.


I've got STABLE-8 (r221983) with mongodb-1.8.1 installed on it.  A
couple of days ago I observed that a listing of the mongodb directory
was stuck for a few minutes in the "ufs" state.

Can we narrow down what we're talking about here?  Does the process
actually deadlock?  Or are you concerned about performance implications?

I know nothing about this "mongodb" software, but the reason it's
calling msync() is that it wants to ensure that the data it changed in
an mmap()-mapped page is reflected (fully written) on the disk.  This
behaviour is fairly common within database software, but "how often"
the software chooses to do this is entirely a design and implementation
choice by the authors.
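
As a minimal illustration of the pattern described above (not mongodb's actual code; the helper name write_and_sync is made up), a program that changes a file through a shared mapping and then waits for the data to reach the disk might look like this:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `msg` into `path` through a shared mapping
 * and block until the kernel reports the dirty pages as written.
 * Returns 0 on success, -1 on error. */
int write_and_sync(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    if (len == 0)
        return -1;                      /* mmap(2) rejects a zero length */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Size the file so the mapping is fully backed by it. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, msg, len);              /* change the mapped page */

    /* MS_SYNC: do not return until the write to disk has completed.
     * This is the call that can sit in "biowr" on a busy disk. */
    int rc = msync(map, len, MS_SYNC);

    munmap(map, len);
    close(fd);
    return rc;
}
```

With MS_SYNC the caller pays the full cost of the disk write on every call, which is why the frequency of these calls matters so much for performance.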

Meaning: if mongodb is either 1) continually calling msync(), or 2)
waiting for too long a period of time before calling msync(),
performance within the process will suffer.  #1 could result in overall
bad performance, while #2 could result in a process that's spending a
lot of time doing I/O (flushing to disk) and therefore appears
"deadlocked" when in fact the kernel/subsystems are doing exactly what
they were told to do.

Removing the msync() call could result in inconsistent data (possibly
non-recoverable) if the mongodb software crashes or if some other piece
(thread or child?  Not sure) expects to open a new fd on that file which
has mmap()'d data.

Yes, I clearly understand this. I have been thinking about system tuning instead, but nothing comes to mind.


This is about all I know.  I would love to be able to tell you "consider
a different database" but that seems like an excuse rather than an
actual solution.  I guess if all you're seeing is the process "stall"
for long periods of time, but recover normally, then I would open up a
support ticket with the mongodb folks to discuss performance.



--
Andrey Zonov
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable