Re: directory listing hangs in ufs state

2011-12-22 Thread Kostik Belousov
On Wed, Dec 21, 2011 at 09:03:02PM +0400, Andrey Zonov wrote:
 On 15.12.2011 17:01, Kostik Belousov wrote:
 On Thu, Dec 15, 2011 at 03:51:02PM +0400, Andrey Zonov wrote:
 On Thu, Dec 15, 2011 at 12:42 AM, Jeremy Chadwick
 free...@jdc.parodius.com wrote:
 
 On Wed, Dec 14, 2011 at 11:47:10PM +0400, Andrey Zonov wrote:
 On 14.12.2011 22:22, Jeremy Chadwick wrote:
 On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
 Hi Jeremy,
 
 This is not a hardware problem; I've already checked that. I also ran
 fsck today and got no errors.
 
 After some more exploration of how mongodb works, I found that when the
 listing hangs, one of mongodb's threads is in the biowr state for a
 long time. It periodically calls msync(MS_SYNC), according to the
 ktrace output.
 
 If I remove the msync() calls from mongodb, how often will the data be
 synced by the OS?
 
 --
 Andrey Zonov
 
 On 14.12.2011 2:15, Jeremy Chadwick wrote:
 On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
 
 Have you any idea what is going on, or how to catch the problem?
 
 Assuming this isn't a file on the root filesystem, try booting the
 machine in single-user mode and using fsck -f on the filesystem in
 question.
 
 Can you verify there's no problems with the disk this file lives on as
 well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
 thought I'd mention it.
 
 I have no real answer, I'm sorry.  msync(2) indicates it's effectively
 deprecated (see BUGS).  It looks like it is essentially an mmap version
 of fsync(2).
 
 I replaced msync(2) with fsync(2).  Unfortunately, it is not obvious
 from the man pages that I can do this.  Anyway, thanks.
 
 Sorry, that wasn't what I was implying.  Let me try to explain
 differently.
 
 msync(2) looks, to me, like an mmap-specific version of fsync(2).  Based
 on the man page, it seems that with msync() you can effectively
 guarantee flushing of certain pages within an mmap()'d region to disk.
 fsync() would cause **all** buffers/internal pages to be flushed to
 disk.
 
 One would need to look at the mongodb code to find out what it's
 actually doing with msync().  That is to say, if it's doing something
 like this (I probably have the semantics wrong -- I've never spent much
 time with mmap()):
 
 fd = open("/some/file", O_RDWR);
 ptr = mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
 ret = msync(ptr, 65536, MS_SYNC);
 /* or alternatively, with length 0 (historically, the whole mapping):
 ret = msync(ptr, 0, MS_SYNC);
 */
 
 Then this, to me, would be mostly equivalent to:
 
 fp = fopen("/some/file", "r+");
 ret = fsync(fileno(fp));
 
 Otherwise, if it's calling msync() only on an address/location within
 the region ptr points to, then that may be more efficient (fewer pages
 to flush).
 
 
 They call msync() for the whole file.  So, there will not be any 
 difference.
 
 
 The mmap() arguments -- specifically flags (see man page) -- also play
 a role here.  The one that catches my attention is MAP_NOSYNC.  So you
 may need to look at the mongodb code to figure out what its mmap()
 call is.
 
 One might wonder why they don't just use open() with the O_SYNC flag.  I
 imagine that has to do with, again, performance; possibly they don't want
 all I/O synchronous, and would rather flush certain pages in the mmap'd
 region to disk as needed.  I see the legitimacy in that approach (vs.
 just using O_SYNC).
 
 There's really no easy way for me to tell you which is more efficient,
 better, blah blah without spending a lot of time with a benchmarking
 program that tests all of this, *plus* an entire system (world) built
 with profiling.
 
 
 I ran mongodb with fsync() for two hours and got the following:
 STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
 Thu Dec 15 10:34:52 2011         3   192744     314  3080182
 
 This is the output of `ps -o lstart,inblock,oublock,majflt,minflt -U mongodb'.
 
 Then I ran it with the default msync():
 STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
 Thu Dec 15 12:34:53 2011         0  7241555      79  5401945
 
 There are also two graphs of disk busyness [1] [2].
 
 The difference is significant: 37 times!  That is what I expected to get.
 
 In the comments for vm_object_page_clean() I found this:
 
   *  When stuffing pages asynchronously, allow clustering.  XXX we need a
   *  synchronous clustering mode implementation.
 
 To me this means that msync(MS_SYNC) flushes each page to disk as a
 separate I/O transaction.  If we multiply 4K by 37 we get ~150K, which
 is the size of a single (clustered) transaction in my experience.
 
 +alc@, kib@
 
 Am I right? Is there any plan to implement this?
 Current buffer clustering code can only do async writes. In fact, I
 am not quite sure what would constitute sync clustering, because the
 ability to delay the write is important to be able to cluster at all.
 
 Also, I am not sure that lack of clustering is the biggest problem.
 IMO, the fact that each write is sync is the first problem there. It
 would be quite a lot of work to add tracking of the issued writes 

Re: directory listing hangs in ufs state

2011-12-22 Thread Alan Cox

On 12/22/2011 03:48, Kostik Belousov wrote:

On Wed, Dec 21, 2011 at 09:03:02PM +0400, Andrey Zonov wrote:

On 15.12.2011 17:01, Kostik Belousov wrote:

On Thu, Dec 15, 2011 at 03:51:02PM +0400, Andrey Zonov wrote:

On Thu, Dec 15, 2011 at 12:42 AM, Jeremy Chadwick
free...@jdc.parodius.com wrote:


On Wed, Dec 14, 2011 at 11:47:10PM +0400, Andrey Zonov wrote:

On 14.12.2011 22:22, Jeremy Chadwick wrote:

On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:

Hi Jeremy,

This is not a hardware problem; I've already checked that. I also ran
fsck today and got no errors.

After some more exploration of how mongodb works, I found that when the
listing hangs, one of mongodb's threads is in the biowr state for a
long time. It periodically calls msync(MS_SYNC), according to the
ktrace output.

If I remove the msync() calls from mongodb, how often will the data be
synced by the OS?

--
Andrey Zonov

On 14.12.2011 2:15, Jeremy Chadwick wrote:

On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:

Have you any idea what is going on, or how to catch the problem?

Assuming this isn't a file on the root filesystem, try booting the
machine in single-user mode and using fsck -f on the filesystem in
question.

Can you verify there's no problems with the disk this file lives on as
well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
thought I'd mention it.

I have no real answer, I'm sorry.  msync(2) indicates it's effectively
deprecated (see BUGS).  It looks like it is essentially an mmap version
of fsync(2).

I replaced msync(2) with fsync(2).  Unfortunately, it is not obvious
from the man pages that I can do this.  Anyway, thanks.

Sorry, that wasn't what I was implying.  Let me try to explain
differently.

msync(2) looks, to me, like an mmap-specific version of fsync(2).  Based
on the man page, it seems that with msync() you can effectively
guarantee flushing of certain pages within an mmap()'d region to disk.
fsync() would cause **all** buffers/internal pages to be flushed to
disk.

One would need to look at the mongodb code to find out what it's
actually doing with msync().  That is to say, if it's doing something
like this (I probably have the semantics wrong -- I've never spent much
time with mmap()):

fd = open("/some/file", O_RDWR);
ptr = mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
ret = msync(ptr, 65536, MS_SYNC);
/* or alternatively, with length 0 (historically, the whole mapping):
ret = msync(ptr, 0, MS_SYNC);
*/

Then this, to me, would be mostly equivalent to:

fp = fopen("/some/file", "r+");
ret = fsync(fileno(fp));

Otherwise, if it's calling msync() only on an address/location within
the region ptr points to, then that may be more efficient (fewer pages
to flush).


They call msync() for the whole file.  So, there will not be any
difference.



The mmap() arguments -- specifically flags (see man page) -- also play
a role here.  The one that catches my attention is MAP_NOSYNC.  So you
may need to look at the mongodb code to figure out what its mmap()
call is.

One might wonder why they don't just use open() with the O_SYNC flag.  I
imagine that has to do with, again, performance; possibly they don't want
all I/O synchronous, and would rather flush certain pages in the mmap'd
region to disk as needed.  I see the legitimacy in that approach (vs.
just using O_SYNC).

There's really no easy way for me to tell you which is more efficient,
better, blah blah without spending a lot of time with a benchmarking
program that tests all of this, *plus* an entire system (world) built
with profiling.


I ran mongodb with fsync() for two hours and got the following:
STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
Thu Dec 15 10:34:52 2011         3   192744     314  3080182

This is the output of `ps -o lstart,inblock,oublock,majflt,minflt -U mongodb'.

Then I ran it with the default msync():
STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
Thu Dec 15 12:34:53 2011         0  7241555      79  5401945

There are also two graphs of disk busyness [1] [2].

The difference is significant: 37 times!  That is what I expected to get.

In the comments for vm_object_page_clean() I found this:

  *  When stuffing pages asynchronously, allow clustering.  XXX we need a
  *  synchronous clustering mode implementation.

To me this means that msync(MS_SYNC) flushes each page to disk as a
separate I/O transaction.  If we multiply 4K by 37 we get ~150K, which
is the size of a single (clustered) transaction in my experience.

+alc@, kib@

Am I right? Is there any plan to implement this?

Current buffer clustering code can only do async writes. In fact, I
am not quite sure what would constitute sync clustering, because the
ability to delay the write is important to be able to cluster at all.

Also, I am not sure that lack of clustering is the biggest problem.
IMO, the fact that each write is sync is the first problem there. It
would be quite a lot of work to add tracking of the issued writes to
vm_object_page_clean() and down the stack.  Esp. due to custom page
write 

Re: directory listing hangs in ufs state

2011-12-21 Thread Andrey Zonov

On 15.12.2011 17:01, Kostik Belousov wrote:

On Thu, Dec 15, 2011 at 03:51:02PM +0400, Andrey Zonov wrote:

On Thu, Dec 15, 2011 at 12:42 AM, Jeremy Chadwick
free...@jdc.parodius.com wrote:


On Wed, Dec 14, 2011 at 11:47:10PM +0400, Andrey Zonov wrote:

On 14.12.2011 22:22, Jeremy Chadwick wrote:

On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:

Hi Jeremy,

This is not a hardware problem; I've already checked that. I also ran
fsck today and got no errors.

After some more exploration of how mongodb works, I found that when the
listing hangs, one of mongodb's threads is in the biowr state for a
long time. It periodically calls msync(MS_SYNC), according to the
ktrace output.

If I remove the msync() calls from mongodb, how often will the data be
synced by the OS?

--
Andrey Zonov

On 14.12.2011 2:15, Jeremy Chadwick wrote:

On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:


Have you any idea what is going on, or how to catch the problem?


Assuming this isn't a file on the root filesystem, try booting the
machine in single-user mode and using fsck -f on the filesystem in
question.

Can you verify there's no problems with the disk this file lives on as
well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
thought I'd mention it.


I have no real answer, I'm sorry.  msync(2) indicates it's effectively
deprecated (see BUGS).  It looks like it is essentially an mmap version
of fsync(2).


I replaced msync(2) with fsync(2).  Unfortunately, it is not obvious
from the man pages that I can do this.  Anyway, thanks.


Sorry, that wasn't what I was implying.  Let me try to explain
differently.

msync(2) looks, to me, like an mmap-specific version of fsync(2).  Based
on the man page, it seems that with msync() you can effectively
guarantee flushing of certain pages within an mmap()'d region to disk.
fsync() would cause **all** buffers/internal pages to be flushed to
disk.

One would need to look at the mongodb code to find out what it's
actually doing with msync().  That is to say, if it's doing something
like this (I probably have the semantics wrong -- I've never spent much
time with mmap()):

fd = open("/some/file", O_RDWR);
ptr = mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
ret = msync(ptr, 65536, MS_SYNC);
/* or alternatively, with length 0 (historically, the whole mapping):
ret = msync(ptr, 0, MS_SYNC);
*/

Then this, to me, would be mostly equivalent to:

fp = fopen("/some/file", "r+");
ret = fsync(fileno(fp));

Otherwise, if it's calling msync() only on an address/location within
the region ptr points to, then that may be more efficient (fewer pages
to flush).



They call msync() for the whole file.  So, there will not be any difference.



The mmap() arguments -- specifically flags (see man page) -- also play
a role here.  The one that catches my attention is MAP_NOSYNC.  So you
may need to look at the mongodb code to figure out what its mmap()
call is.

One might wonder why they don't just use open() with the O_SYNC flag.  I
imagine that has to do with, again, performance; possibly they don't want
all I/O synchronous, and would rather flush certain pages in the mmap'd
region to disk as needed.  I see the legitimacy in that approach (vs.
just using O_SYNC).

There's really no easy way for me to tell you which is more efficient,
better, blah blah without spending a lot of time with a benchmarking
program that tests all of this, *plus* an entire system (world) built
with profiling.



I ran mongodb with fsync() for two hours and got the following:
STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
Thu Dec 15 10:34:52 2011         3   192744     314  3080182

This is the output of `ps -o lstart,inblock,oublock,majflt,minflt -U mongodb'.

Then I ran it with the default msync():
STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
Thu Dec 15 12:34:53 2011         0  7241555      79  5401945

There are also two graphs of disk busyness [1] [2].

The difference is significant: 37 times!  That is what I expected to get.

In the comments for vm_object_page_clean() I found this:

  *  When stuffing pages asynchronously, allow clustering.  XXX we need a
  *  synchronous clustering mode implementation.

To me this means that msync(MS_SYNC) flushes each page to disk as a
separate I/O transaction.  If we multiply 4K by 37 we get ~150K, which
is the size of a single (clustered) transaction in my experience.

+alc@, kib@

Am I right? Is there any plan to implement this?

Current buffer clustering code can only do async writes. In fact, I
am not quite sure what would constitute sync clustering, because the
ability to delay the write is important to be able to cluster at all.

Also, I am not sure that lack of clustering is the biggest problem.
IMO, the fact that each write is sync is the first problem there. It
would be quite a lot of work to add tracking of the issued writes to
vm_object_page_clean() and down the stack.  Esp. due to custom page
write vops in several fses.

The only guarantee that POSIX requires from msync(MS_SYNC) is that
the writes are 

Re: directory listing hangs in ufs state

2011-12-15 Thread Andrey Zonov
On Thu, Dec 15, 2011 at 12:42 AM, Jeremy Chadwick
free...@jdc.parodius.com wrote:

 On Wed, Dec 14, 2011 at 11:47:10PM +0400, Andrey Zonov wrote:
  On 14.12.2011 22:22, Jeremy Chadwick wrote:
  On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
  Hi Jeremy,
  
  This is not a hardware problem; I've already checked that. I also ran
  fsck today and got no errors.
  
  After some more exploration of how mongodb works, I found that when the
  listing hangs, one of mongodb's threads is in the biowr state for a
  long time. It periodically calls msync(MS_SYNC), according to the
  ktrace output.
  
  If I remove the msync() calls from mongodb, how often will the data be
  synced by the OS?
  
  --
  Andrey Zonov
  
  On 14.12.2011 2:15, Jeremy Chadwick wrote:
  On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
  
  Have you any idea what is going on, or how to catch the problem?
  
  Assuming this isn't a file on the root filesystem, try booting the
  machine in single-user mode and using fsck -f on the filesystem in
  question.
  
  Can you verify there's no problems with the disk this file lives on as
  well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
  thought I'd mention it.
  
  I have no real answer, I'm sorry.  msync(2) indicates it's effectively
  deprecated (see BUGS).  It looks like it is essentially an mmap version
  of fsync(2).
 
  I replaced msync(2) with fsync(2).  Unfortunately, it is not obvious
  from the man pages that I can do this.  Anyway, thanks.

 Sorry, that wasn't what I was implying.  Let me try to explain
 differently.

 msync(2) looks, to me, like an mmap-specific version of fsync(2).  Based
 on the man page, it seems that with msync() you can effectively
 guarantee flushing of certain pages within an mmap()'d region to disk.
 fsync() would cause **all** buffers/internal pages to be flushed to
 disk.
 
 One would need to look at the mongodb code to find out what it's
 actually doing with msync().  That is to say, if it's doing something
 like this (I probably have the semantics wrong -- I've never spent much
 time with mmap()):
 
 fd = open("/some/file", O_RDWR);
 ptr = mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
 ret = msync(ptr, 65536, MS_SYNC);
 /* or alternatively, with length 0 (historically, the whole mapping):
 ret = msync(ptr, 0, MS_SYNC);
 */
 
 Then this, to me, would be mostly equivalent to:
 
 fp = fopen("/some/file", "r+");
 ret = fsync(fileno(fp));
 
 Otherwise, if it's calling msync() only on an address/location within
 the region ptr points to, then that may be more efficient (fewer pages
 to flush).


They call msync() for the whole file.  So, there will not be any difference.


 The mmap() arguments -- specifically flags (see man page) -- also play
 a role here.  The one that catches my attention is MAP_NOSYNC.  So you
 may need to look at the mongodb code to figure out what its mmap()
 call is.
 
 One might wonder why they don't just use open() with the O_SYNC flag.  I
 imagine that has to do with, again, performance; possibly they don't want
 all I/O synchronous, and would rather flush certain pages in the mmap'd
 region to disk as needed.  I see the legitimacy in that approach (vs.
 just using O_SYNC).

 There's really no easy way for me to tell you which is more efficient,
 better, blah blah without spending a lot of time with a benchmarking
 program that tests all of this, *plus* an entire system (world) built
 with profiling.


I ran mongodb with fsync() for two hours and got the following:
STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
Thu Dec 15 10:34:52 2011         3   192744     314  3080182

This is the output of `ps -o lstart,inblock,oublock,majflt,minflt -U mongodb'.

Then I ran it with the default msync():
STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
Thu Dec 15 12:34:53 2011         0  7241555      79  5401945

There are also two graphs of disk busyness [1] [2].

The difference is significant: 37 times!  That is what I expected to get.

In the comments for vm_object_page_clean() I found this:

 *  When stuffing pages asynchronously, allow clustering.  XXX we need a
 *  synchronous clustering mode implementation.

To me this means that msync(MS_SYNC) flushes each page to disk as a
separate I/O transaction.  If we multiply 4K by 37 we get ~150K, which
is the size of a single (clustered) transaction in my experience.
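
A minimal sketch, not from the thread, of how one could test this
interpretation: dirty the same mapping twice and flush it once with a
single whole-range msync() and once page by page, timing each phase
with time(1) or gettimeofday().  If every synchronous page-out really
is a separate I/O transaction, the per-page loop should take far
longer.  The path and sizes here are made up.

#include <sys/mman.h>
#include <err.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
        size_t len = 64UL * 1024 * 1024;        /* 64 MB scratch file */
        size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
        int fd = open("/tmp/msync-test", O_RDWR | O_CREAT | O_TRUNC, 0644);

        if (fd == -1 || ftruncate(fd, (off_t)len) == -1)
                err(1, "setup");
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
                err(1, "mmap");

        memset(p, 0xaa, len);                   /* dirty every page */
        if (msync(p, len, MS_SYNC) == -1)       /* one whole-range call */
                err(1, "msync");

        memset(p, 0x55, len);                   /* dirty every page again */
        for (size_t off = 0; off < len; off += pagesz)
                if (msync(p + off, pagesz, MS_SYNC) == -1)  /* per page */
                        err(1, "msync page");

        munmap(p, len);
        close(fd);
        return (0);
}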

+alc@, kib@

Am I right? Is there any plan to implement this?


 All of this would really fall into the hands of the mongodb people to
 figure out, if you ask me.  But I should note that mmap() on BSD behaves
 and performs very differently than on, say, Linux; so if the authors
 wrote what they did intended for Linux systems, I wouldn't be too
 surprised.  :-)


https://jira.mongodb.org/browse/SERVER-663


  I'm extremely confused by this problem.  What you're describing above is
  that the process is stuck in biowr state for a long time, but what you
  stated originally was that the process was stuck in ufs state for a
  few minutes:
 
  Listing of the directory 

Re: directory listing hangs in ufs state

2011-12-15 Thread Kostik Belousov
On Thu, Dec 15, 2011 at 03:51:02PM +0400, Andrey Zonov wrote:
 On Thu, Dec 15, 2011 at 12:42 AM, Jeremy Chadwick
 free...@jdc.parodius.com wrote:
 
  On Wed, Dec 14, 2011 at 11:47:10PM +0400, Andrey Zonov wrote:
   On 14.12.2011 22:22, Jeremy Chadwick wrote:
   On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
   Hi Jeremy,
   
   This is not a hardware problem; I've already checked that. I also ran
   fsck today and got no errors.
   
   After some more exploration of how mongodb works, I found that when the
   listing hangs, one of mongodb's threads is in the biowr state for a
   long time. It periodically calls msync(MS_SYNC), according to the
   ktrace output.
   
   If I remove the msync() calls from mongodb, how often will the data be
   synced by the OS?
   
   --
   Andrey Zonov
   
   On 14.12.2011 2:15, Jeremy Chadwick wrote:
   On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
   
   Have you any idea what is going on, or how to catch the problem?
   
   Assuming this isn't a file on the root filesystem, try booting the
   machine in single-user mode and using fsck -f on the filesystem in
   question.
   
   Can you verify there's no problems with the disk this file lives on as
   well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
   thought I'd mention it.
   
   I have no real answer, I'm sorry.  msync(2) indicates it's effectively
   deprecated (see BUGS).  It looks like it is essentially an mmap version
   of fsync(2).
  
   I replaced msync(2) with fsync(2).  Unfortunately, it is not obvious
   from the man pages that I can do this.  Anyway, thanks.
 
  Sorry, that wasn't what I was implying.  Let me try to explain
  differently.
 
  msync(2) looks, to me, like an mmap-specific version of fsync(2).  Based
  on the man page, it seems that with msync() you can effectively
  guarantee flushing of certain pages within an mmap()'d region to disk.
  fsync() would cause **all** buffers/internal pages to be flushed to
  disk.
 
  One would need to look at the mongodb code to find out what it's
  actually doing with msync().  That is to say, if it's doing something
  like this (I probably have the semantics wrong -- I've never spent much
  time with mmap()):
 
  fd = open("/some/file", O_RDWR);
  ptr = mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
  ret = msync(ptr, 65536, MS_SYNC);
  /* or alternatively, with length 0 (historically, the whole mapping):
  ret = msync(ptr, 0, MS_SYNC);
  */
 
  Then this, to me, would be mostly equivalent to:
 
  fp = fopen("/some/file", "r+");
  ret = fsync(fileno(fp));
 
  Otherwise, if it's calling msync() only on an address/location within
  the region ptr points to, then that may be more efficient (fewer pages
  to flush).
 
 
 They call msync() for the whole file.  So, there will not be any difference.
 
 
  The mmap() arguments -- specifically flags (see man page) -- also play
  a role here.  The one that catches my attention is MAP_NOSYNC.  So you
  may need to look at the mongodb code to figure out what its mmap()
  call is.
 
  One might wonder why they don't just use open() with the O_SYNC flag.  I
  imagine that has to do with, again, performance; possibly they don't want
  all I/O synchronous, and would rather flush certain pages in the mmap'd
  region to disk as needed.  I see the legitimacy in that approach (vs.
  just using O_SYNC).
 
  There's really no easy way for me to tell you which is more efficient,
  better, blah blah without spending a lot of time with a benchmarking
  program that tests all of this, *plus* an entire system (world) built
  with profiling.
 
 
 I ran mongodb with fsync() for two hours and got the following:
 STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
 Thu Dec 15 10:34:52 2011         3   192744     314  3080182
 
 This is the output of `ps -o lstart,inblock,oublock,majflt,minflt -U mongodb'.
 
 Then I ran it with the default msync():
 STARTED                      INBLK    OUBLK  MAJFLT   MINFLT
 Thu Dec 15 12:34:53 2011         0  7241555      79  5401945
 
 There are also two graphs of disk busyness [1] [2].
 
 The difference is significant: 37 times!  That is what I expected to get.
 
 In the comments for vm_object_page_clean() I found this:
 
  *  When stuffing pages asynchronously, allow clustering.  XXX we need a
  *  synchronous clustering mode implementation.
 
 To me this means that msync(MS_SYNC) flushes each page to disk as a
 separate I/O transaction.  If we multiply 4K by 37 we get ~150K, which
 is the size of a single (clustered) transaction in my experience.
 
 +alc@, kib@
 
 Am I right? Is there any plan to implement this?
Current buffer clustering code can only do async writes. In fact, I
am not quite sure what would constitute sync clustering, because the
ability to delay the write is important to be able to cluster at all.

Also, I am not sure that lack of clustering is the biggest problem.
IMO, the fact that each write is sync is the first problem there. It
would be quite a lot of work to add tracking of the issued writes to the

Re: directory listing hangs in ufs state

2011-12-14 Thread Andrey Zonov

Hi Jeremy,

This is not a hardware problem; I've already checked that. I also ran fsck
today and got no errors.


After some more exploration of how mongodb works, I found that when the
listing hangs, one of mongodb's threads is in the biowr state for a long
time. It periodically calls msync(MS_SYNC), according to the ktrace output.


If I remove the msync() calls from mongodb, how often will the data be
synced by the OS?


--
Andrey Zonov

On 14.12.2011 2:15, Jeremy Chadwick wrote:

On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:


Have you any idea what is going on, or how to catch the problem?


Assuming this isn't a file on the root filesystem, try booting the
machine in single-user mode and using fsck -f on the filesystem in
question.

Can you verify there's no problems with the disk this file lives on as
well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
thought I'd mention it.




Re: directory listing hangs in ufs state

2011-12-14 Thread Jeremy Chadwick
On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
 Hi Jeremy,
 
 This is not a hardware problem; I've already checked that. I also ran
 fsck today and got no errors.
 
 After some more exploration of how mongodb works, I found that when the
 listing hangs, one of mongodb's threads is in the biowr state for a
 long time. It periodically calls msync(MS_SYNC), according to the
 ktrace output.
 
 If I remove the msync() calls from mongodb, how often will the data be
 synced by the OS?
 
 -- 
 Andrey Zonov
 
 On 14.12.2011 2:15, Jeremy Chadwick wrote:
 On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
 
 Have you any idea what is going on, or how to catch the problem?
 
 Assuming this isn't a file on the root filesystem, try booting the
 machine in single-user mode and using fsck -f on the filesystem in
 question.
 
 Can you verify there's no problems with the disk this file lives on as
 well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
 thought I'd mention it.

I have no real answer, I'm sorry.  msync(2) indicates it's effectively
deprecated (see BUGS).  It looks like it is essentially an mmap version
of fsync(2).

I'm extremely confused by this problem.  What you're describing above is
that the process is stuck in biowr state for a long time, but what you
stated originally was that the process was stuck in ufs state for a
few minutes:

 I've got STABLE-8 (r221983) with mongodb-1.8.1 installed on it.  A
 couple of days ago I observed that a listing of the mongodb directory
 got stuck for a few minutes in the ufs state.

Can we narrow down what we're talking about here?  Does the process
actually deadlock?  Or are you concerned about performance implications?

I know nothing about this mongodb software, but the reason it's
calling msync() is that it wants to ensure that the data it changed
in an mmap()-mapped page is reflected (fully written) on the disk.
This behaviour is fairly common within database software, but how
often the software chooses to do this is entirely a design and
implementation choice by the authors.

Meaning: if mongodb is either 1) continually calling msync(), or 2)
waiting for too long a period of time before calling msync(),
performance within the process will suffer.  #1 could result in overall
bad performance, while #2 could result in a process that's spending a
lot of time doing I/O (flushing to disk) and therefore appears
deadlocked when in fact the kernel/subsystems are doing exactly what
they were told to do.

Removing the msync() call could result in inconsistent data (possibly
non-recoverable) if the mongodb software crashes or if some other piece
(thread or child?  Not sure) expects to open a new fd on that file which
has mmap()'d data.
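
A minimal sketch of that pattern, assuming a hypothetical record layout
and file name (this is not mongodb's code): a store into a MAP_SHARED
region only becomes durable once msync(MS_SYNC) -- or fsync() on the
descriptor -- returns.

#include <sys/mman.h>
#include <err.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

struct record { long seq; char payload[56]; };

int
main(void)
{
        int fd = open("/tmp/journal", O_RDWR | O_CREAT, 0644);
        if (fd == -1 || ftruncate(fd, sizeof(struct record)) == -1)
                err(1, "setup");

        struct record *r = mmap(NULL, sizeof(*r), PROT_READ | PROT_WRITE,
            MAP_SHARED, fd, 0);
        if (r == MAP_FAILED)
                err(1, "mmap");

        r->seq = 42;                            /* mutate the mapped page */
        strlcpy(r->payload, "committed", sizeof(r->payload));

        /*
         * Without this, the dirty page may sit in memory until the
         * syncer gets to it; a crash before then loses the update.
         */
        if (msync(r, sizeof(*r), MS_SYNC) == -1)
                err(1, "msync");
        /* Only now is it safe to report the write as durable. */

        munmap(r, sizeof(*r));
        close(fd);
        return (0);
}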

This is about all I know.  I would love to be able to tell you to
consider a different database, but that seems like an excuse rather
than an actual solution.  I guess if all you're seeing is the process
stalling for long periods of time but recovering normally, then I would
open a support ticket with the mongodb folks to discuss performance.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: directory listing hangs in ufs state

2011-12-14 Thread Alan Cox
On Wed, Dec 14, 2011 at 12:22 PM, Jeremy Chadwick
free...@jdc.parodius.com wrote:

 On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
  Hi Jeremy,
 
  This is not a hardware problem; I've already checked that. I also ran
  fsck today and got no errors.
 
  After some more exploration of how mongodb works, I found that when the
  listing hangs, one of mongodb's threads is in the biowr state for a
  long time. It periodically calls msync(MS_SYNC), according to the
  ktrace output.
 
  If I remove the msync() calls from mongodb, how often will the data be
  synced by the OS?
 
  --
  Andrey Zonov
 
  On 14.12.2011 2:15, Jeremy Chadwick wrote:
  On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
  
  Have you any idea what is going on, or how to catch the problem?
  
  Assuming this isn't a file on the root filesystem, try booting the
  machine in single-user mode and using fsck -f on the filesystem in
  question.
  
  Can you verify there's no problems with the disk this file lives on as
  well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
  thought I'd mention it.

 I have no real answer, I'm sorry.  msync(2) indicates it's effectively
 deprecated (see BUGS).  It looks like it is essentially an mmap version
 of fsync(2).


Yikes, I just looked at this man page.  I'm afraid that the text in the
BUGS section is highly misleading.  The MS_INVALIDATE option should be
obsolete for the reason given there.  Under a strict reading of the
applicable standard, FreeBSD could implement this option as a NOP.
However, we treat it as something like madvise(MADV_DONTNEED|FREE).  In
contrast, MS_SYNC is definitely not obsolete.

Alan

P.S. If someone wants to take a crack at fixing this man page, contact me
off list.


Re: directory listing hangs in ufs state

2011-12-14 Thread Andrey Zonov

On 14.12.2011 22:53, Alan Cox wrote:

On Wed, Dec 14, 2011 at 12:22 PM, Jeremy Chadwick
free...@jdc.parodius.com wrote:

On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
  Hi Jeremy,
 
  This is not a hardware problem; I've already checked that. I also ran
  fsck today and got no errors.
 
  After some more exploration of how mongodb works, I found that when the
  listing hangs, one of mongodb's threads is in the biowr state for a
  long time. It periodically calls msync(MS_SYNC), according to the
  ktrace output.
 
  If I remove the msync() calls from mongodb, how often will the data be
  synced by the OS?
 
  --
  Andrey Zonov
 
  On 14.12.2011 2:15, Jeremy Chadwick wrote:
  On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
  
  Have you any idea what is going on, or how to catch the problem?
  
  Assuming this isn't a file on the root filesystem, try booting the
  machine in single-user mode and using fsck -f on the filesystem in
  question.
  
  Can you verify there's no problems with the disk this file lives on as
  well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
  thought I'd mention it.

I have no real answer, I'm sorry.  msync(2) indicates it's effectively
deprecated (see BUGS).  It looks like it is essentially an mmap version
of fsync(2).


Yikes, I just looked at this man page.  I'm afraid that the text in the
BUGS section is highly misleading.  The MS_INVALIDATE option should be
obsolete for the reason given there.  Under a strict reading of the
applicable standard, FreeBSD could implement this option as a NOP.
However, we treat it as something like madvise(MADV_DONTNEED|FREE).  In
contrast, MS_SYNC is definitely not obsolete.

Alan

P.S. If someone wants to take a crack at fixing this man page, contact
me off list.



Please don't remove support for MS_INVALIDATE; it is the only way I have
found to purge the disk cache.  MADV_DONTNEED does nothing here in my
experience.
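
A sketch of that cache-purge trick under one reading of it (hypothetical
file argument; this is not the actual code): map the file and ask the VM
to toss the cached pages so the next access re-reads them from disk.

#include <sys/mman.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
        struct stat st;

        if (argc != 2)
                errx(1, "usage: purge file");
        int fd = open(argv[1], O_RDONLY);
        if (fd == -1 || fstat(fd, &st) == -1)
                err(1, "%s", argv[1]);

        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED,
            fd, 0);
        if (p == MAP_FAILED)
                err(1, "mmap");

        /*
         * On FreeBSD this behaves roughly like madvise(MADV_DONTNEED
         * or MADV_FREE), per Alan's note above; a strict POSIX reading
         * would allow it to be a no-op, which is why relying on it is
         * contentious.
         */
        if (msync(p, (size_t)st.st_size, MS_INVALIDATE) == -1)
                err(1, "msync");

        munmap(p, (size_t)st.st_size);
        close(fd);
        return (0);
}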


--
Andrey Zonov


Re: directory listing hangs in ufs state

2011-12-14 Thread Andrey Zonov

On 14.12.2011 22:22, Jeremy Chadwick wrote:

On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:

Hi Jeremy,

This is not a hardware problem; I've already checked that. I also ran
fsck today and got no errors.

After some more exploration of how mongodb works, I found that when the
listing hangs, one of mongodb's threads is in the biowr state for a
long time. It periodically calls msync(MS_SYNC), according to the
ktrace output.

If I remove the msync() calls from mongodb, how often will the data be
synced by the OS?

--
Andrey Zonov

On 14.12.2011 2:15, Jeremy Chadwick wrote:

On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:


Have you any idea what is going on, or how to catch the problem?


Assuming this isn't a file on the root filesystem, try booting the
machine in single-user mode and using fsck -f on the filesystem in
question.

Can you verify there's no problems with the disk this file lives on as
well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
thought I'd mention it.


I have no real answer, I'm sorry.  msync(2) indicates it's effectively
deprecated (see BUGS).  It looks like it is essentially an mmap version
of fsync(2).


I replaced msync(2) with fsync(2).  Unfortunately, it is not obvious
from the man pages that I can do this.  Anyway, thanks.




I'm extremely confused by this problem.  What you're describing above is
that the process is stuck in biowr state for a long time, but what you
stated originally was that the process was stuck in ufs state for a
few minutes:


Listing the directory with mongodb files using ls(1) gets stuck in the
ufs state while one of mongodb's threads is in the biowr state.  It
looks like the system holds a global lock on the file being msync(2)-ed
and can't immediately return from the lstat(2) call.
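
A minimal sketch of that interaction, under the assumption above (made-up
path and size; not mongodb code): thread A dirties and msync(MS_SYNC)s a
large shared mapping while thread B lstat()s the same file in a loop.  If
the vnode lock really is held across the whole synchronous flush, the
lstat() timings should show the stall.  Build with -lpthread.

#include <sys/mman.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define PATH "/tmp/bigfile"
#define LEN  (512UL * 1024 * 1024)      /* 512 MB of pages to dirty */

static void *
stat_loop(void *arg)
{
        struct stat st;

        (void)arg;
        for (int i = 0; i < 30; i++) {
                time_t t0 = time(NULL);
                if (lstat(PATH, &st) == -1)
                        err(1, "lstat");
                printf("lstat took ~%ld s\n", (long)(time(NULL) - t0));
                sleep(1);
        }
        return (NULL);
}

int
main(void)
{
        int fd = open(PATH, O_RDWR | O_CREAT | O_TRUNC, 0644);
        if (fd == -1 || ftruncate(fd, (off_t)LEN) == -1)
                err(1, "setup");
        char *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
                err(1, "mmap");

        pthread_t tid;
        if (pthread_create(&tid, NULL, stat_loop, NULL) != 0)
                errx(1, "pthread_create");

        memset(p, 1, LEN);                      /* dirty everything */
        if (msync(p, LEN, MS_SYNC) == -1)       /* long synchronous flush */
                err(1, "msync");

        pthread_join(tid, NULL);
        return (0);
}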





I've got STABLE-8 (r221983) with mongodb-1.8.1 installed on it.  A
couple of days ago I observed that a listing of the mongodb directory
got stuck for a few minutes in the ufs state.


Can we narrow down what we're talking about here?  Does the process
actually deadlock?  Or are you concerned about performance implications?

I know nothing about this mongodb software, but the reason it's
calling msync() is that it wants to ensure that the data it changed
in an mmap()-mapped page is reflected (fully written) on the disk.
This behaviour is fairly common within database software, but how
often the software chooses to do this is entirely a design and
implementation choice by the authors.

Meaning: if mongodb is either 1) continually calling msync(), or 2)
waiting for too long a period of time before calling msync(),
performance within the process will suffer.  #1 could result in overall
bad performance, while #2 could result in a process that's spending a
lot of time doing I/O (flushing to disk) and therefore appears
deadlocked when in fact the kernel/subsystems are doing exactly what
they were told to do.

Removing the msync() call could result in inconsistent data (possibly
non-recoverable) if the mongodb software crashes or if some other piece
(thread or child?  Not sure) expects to open a new fd on that file which
has mmap()'d data.


Yes, I clearly understand this.  I've been thinking about some system
tuning instead, but nothing has come to mind.




This is about all I know.  I would love to be able to tell you to
consider a different database, but that seems like an excuse rather
than an actual solution.  I guess if all you're seeing is the process
stalling for long periods of time but recovering normally, then I would
open a support ticket with the mongodb folks to discuss performance.




--
Andrey Zonov


Re: directory listing hangs in ufs state

2011-12-14 Thread Jeremy Chadwick
On Wed, Dec 14, 2011 at 11:47:10PM +0400, Andrey Zonov wrote:
 On 14.12.2011 22:22, Jeremy Chadwick wrote:
 On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
 Hi Jeremy,
 
 This is not a hardware problem; I've already checked that. I also ran
 fsck today and got no errors.
 
 After some more exploration of how mongodb works, I found that when the
 listing hangs, one of mongodb's threads is in the biowr state for a
 long time. It periodically calls msync(MS_SYNC), according to the
 ktrace output.
 
 If I remove the msync() calls from mongodb, how often will the data be
 synced by the OS?
 
 --
 Andrey Zonov
 
 On 14.12.2011 2:15, Jeremy Chadwick wrote:
 On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
 
 Have you any idea what is going on, or how to catch the problem?
 
 Assuming this isn't a file on the root filesystem, try booting the
 machine in single-user mode and using fsck -f on the filesystem in
 question.
 
 Can you verify there's no problems with the disk this file lives on as
 well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
 thought I'd mention it.
 
 I have no real answer, I'm sorry.  msync(2) indicates it's effectively
 deprecated (see BUGS).  It looks like it is essentially an mmap version
 of fsync(2).
 
 I replaced msync(2) with fsync(2).  Unfortunately, it is not obvious
 from the man pages that I can do this.  Anyway, thanks.

Sorry, that wasn't what I was implying.  Let me try to explain
differently.

msync(2) looks, to me, like an mmap-specific version of fsync(2).  Based
on the man page, it seems that with msync() you can effectively
guarantee flushing of certain pages within an mmap()'d region to disk.
fsync() would cause **all** buffers/internal pages to be flushed to
disk.

One would need to look at the mongodb code to find out what it's
actually doing with msync().  That is to say, if it's doing something
like this (I probably have the semantics wrong -- I've never spent much
time with mmap()):

fd = open("/some/file", O_RDWR);
ptr = mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
ret = msync(ptr, 65536, MS_SYNC);
/* or alternatively, with length 0 (historically, the whole mapping):
ret = msync(ptr, 0, MS_SYNC);
*/

Then this, to me, would be mostly equivalent to:

fp = fopen("/some/file", "r+");
ret = fsync(fileno(fp));

Otherwise, if it's calling msync() only on an address/location within
the region ptr points to, then that may be more efficient (fewer pages
to flush).
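
To make that concrete, a sketch of flushing just the pages covering one
modified record -- msync() wants a page-aligned address, so round the
offset down to a page boundary.  (The helper name and arguments are
made up for illustration, not taken from mongodb.)

#include <sys/mman.h>
#include <unistd.h>

int
sync_range(char *base, size_t off, size_t len)
{
        size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
        size_t start = off & ~(pagesz - 1);     /* round down to a page */

        return (msync(base + start, len + (off - start), MS_SYNC));
}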

The mmap() arguments -- specifically flags (see man page) -- also play
a role here.  The one that catches my attention is MAP_NOSYNC.  So you
may need to look at the mongodb code to figure out what its mmap()
call is.
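
For illustration, a sketch of an mmap() call with that FreeBSD-specific
flag (assuming fd and len come from the caller): with MAP_NOSYNC, dirty
pages are not pushed out gratuitously by the syncer, only on msync(),
at pageout time, or when the file is otherwise flushed -- better
throughput, weaker crash consistency.

#include <sys/mman.h>

void *
map_nosync(int fd, size_t len)
{
        return (mmap(NULL, len, PROT_READ | PROT_WRITE,
            MAP_SHARED | MAP_NOSYNC, fd, 0));
}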

One might wonder why they don't just use open() with the O_SYNC flag.  I
imagine that has to do with, again, performance; possibly they don't want
all I/O synchronous, and would rather flush certain pages in the mmap'd
region to disk as needed.  I see the legitimacy in that approach (vs.
just using O_SYNC).

There's really no easy way for me to tell you which is more efficient,
better, blah blah without spending a lot of time with a benchmarking
program that tests all of this, *plus* an entire system (world) built
with profiling.

All of this would really fall into the hands of the mongodb people to
figure out, if you ask me.  But I should note that mmap() on BSD behaves
and performs very differently than on, say, Linux; so if the authors
wrote what they did intended for Linux systems, I wouldn't be too
surprised.  :-)

 I'm extremely confused by this problem.  What you're describing above is
 that the process is stuck in biowr state for a long time, but what you
 stated originally was that the process was stuck in ufs state for a
 few minutes:
 
 Listing the directory with mongodb files using ls(1) gets stuck in the
 ufs state while one of mongodb's threads is in the biowr state.  It
 looks like the system holds a global lock on the file being msync(2)-ed
 and can't immediately return from the lstat(2) call.

Thanks for the clarification -- yes, this helps.  To some degree it
makes sense: some piece of the filesystem or VFS layer is blocking
intentionally.  How to figure out which layer, I do not know.  Kernel
folks familiar with this aspect would need to chime in here.

 I've got STABLE-8 (r221983) with mongodb-1.8.1 installed on it.  A
 couple of days ago I observed that a listing of the mongodb directory
 got stuck for a few minutes in the ufs state.
 
 Can we narrow down what we're talking about here?  Does the process
 actually deadlock?  Or are you concerned about performance implications?
 
 I know nothing about this mongodb software, but the reason it's
 calling msync() is that it wants to ensure that the data it changed
 in an mmap()-mapped page is reflected (fully written) on the disk.
 This behaviour is fairly common within database software, but how
 often the software chooses to do this is entirely a design and
 implementation choice by the authors.
 
 Meaning: if mongodb is either 1) continually calling msync(), or 2)
 waiting for too long a 

directory listing hangs in ufs state

2011-12-13 Thread Andrey Zonov

Hi,

I've got STABLE-8 (r221983) with mongodb-1.8.1 installed on it.  A
couple of days ago I observed that a listing of the mongodb directory
got stuck for a few minutes in the ufs state.  I ran it again with
ktrace and got the following (kdump -R):


 91324 ls   0.03 CALL  lstat(0x32c199c8,0x32c19950)
 91324 ls   0.03 NAMI  base.1
 91324 ls   21.357255 STRU  struct stat {dev=116, ino=45125633, 
mode=-rw--- , nlink=1, uid=922, gid=922, rdev=180226648, 
atime=1323709877, stime=1323776461, ctime=1323776461, 
birthtime=1314798592, size=134217728, blksize=16384, blocks=262304, 
flags=0x0 }

 91324 ls   0.14 RET   lstat 0

The kgdb backtrace of this process looked like this:

Thread 297 (Thread 100372):
#0  sched_switch (td=0xff0095c008c0, newtd=0xff000357b8c0, 
flags=) at /usr/src/sys/kern/sched_ule.c:1866
#1  0x80406696 in mi_switch (flags=260, newtd=0x0) at 
/usr/src/sys/kern/kern_synch.c:449
#2  0x8043c072 in sleepq_wait (wchan=0xff0103aaf7f8, pri=80) 
at /usr/src/sys/kern/subr_sleepqueue.c:609
#3  0x803e4a5a in __lockmgr_args (lk=0xff0103aaf7f8, 
flags=2097408, ilk=0xff0103aaf820, wmesg=) at 
/usr/src/sys/kern/kern_lock.c:220

#4  0x8061239c in ffs_lock (ap=0xff84867fc550) at lockmgr.h:94
#5  0x806d2462 in VOP_LOCK1_APV (vop=0x80921fe0, 
a=0xff84867fc550) at vnode_if.c:1988
#6  0x804a58b7 in _vn_lock (vp=0xff0103aaf760, 
flags=2097152, file=0x80736e70 /usr/src/sys/kern/vfs_subr.c, 
line=2137) at vnode_if.h:859
#7  0x80498bc0 in vget (vp=0xff0103aaf760, flags=2097408, 
td=0xff0095c008c0) at /usr/src/sys/kern/vfs_subr.c:2137
#8  0x804845f4 in cache_lookup (dvp=0xff0095675b10, 
vpp=0xff84867fc910, cnp=0xff84867fc938) at 
/usr/src/sys/kern/vfs_cache.c:587
#9  0x80484a30 in vfs_cache_lookup (ap=) at 
/usr/src/sys/kern/vfs_cache.c:905
#10 0x806d2e7c in VOP_LOOKUP_APV (vop=0x80922820, 
a=0xff84867fc790) at vnode_if.c:123

#11 0x8048bc80 in lookup (ndp=0xff84867fc8e0) at vnode_if.h:54
#12 0x8048cf0e in namei (ndp=0xff84867fc8e0) at 
/usr/src/sys/kern/vfs_lookup.c:269
#13 0x8049c972 in kern_statat_vnhook (td=0xff0095c008c0, 
flag=) at /usr/src/sys/kern/vfs_syscalls.c:2346
#14 0x8049cbb5 in kern_statat (td=) at 
/usr/src/sys/kern/vfs_syscalls.c:2327
#15 0x8049cc7a in lstat (td=) at 
/usr/src/sys/kern/vfs_syscalls.c:2390
#16 0x8043e7dd in syscallenter (td=0xff0095c008c0, 
sa=0xff84867fcbb0) at /usr/src/sys/kern/subr_trap.c:326
#17 0x8066a5eb in syscall (frame=0xff84867fcc50) at 
/usr/src/sys/amd64/amd64/trap.c:916
#18 0x806517f2 in Xfast_syscall () at 
/usr/src/sys/amd64/amd64/exception.S:384

#19 0x3298f75c in ?? ()

The very first idea was to turn off name caching (set debug.vfscache to
0), but it didn't help.  The second idea was to reboot, but that didn't
help either.


This directory looks fine. It has 10 files and 1 empty directory.

Have you any idea what is going on, or how to catch the problem?

--
Andrey Zonov



Re: directory listing hangs in ufs state

2011-12-13 Thread Jeremy Chadwick
On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
 Hi,
 
 I've got STABLE-8 (r221983) with mongodb-1.8.1 installed on it.  A
 couple of days ago I observed that a listing of the mongodb directory
 got stuck for a few minutes in the ufs state.  I ran it again with
 ktrace and got the following (kdump -R):
 
  91324 ls   0.03 CALL  lstat(0x32c199c8,0x32c19950)
  91324 ls   0.03 NAMI  base.1
  91324 ls   21.357255 STRU  struct stat {dev=116, ino=45125633,
 mode=-rw--- , nlink=1, uid=922, gid=922, rdev=180226648,
 atime=1323709877, stime=1323776461, ctime=1323776461,
 birthtime=1314798592, size=134217728, blksize=16384, blocks=262304,
 flags=0x0 }
  91324 ls   0.14 RET   lstat 0
 
 The kgdb backtrace of this process looked like this:
 
 Thread 297 (Thread 100372):
 #0  sched_switch (td=0xff0095c008c0, newtd=0xff000357b8c0,
 flags=) at /usr/src/sys/kern/sched_ule.c:1866
 #1  0x80406696 in mi_switch (flags=260, newtd=0x0) at
 /usr/src/sys/kern/kern_synch.c:449
 #2  0x8043c072 in sleepq_wait (wchan=0xff0103aaf7f8,
 pri=80) at /usr/src/sys/kern/subr_sleepqueue.c:609
 #3  0x803e4a5a in __lockmgr_args (lk=0xff0103aaf7f8,
 flags=2097408, ilk=0xff0103aaf820, wmesg=) at
 /usr/src/sys/kern/kern_lock.c:220
 #4  0x8061239c in ffs_lock (ap=0xff84867fc550) at lockmgr.h:94
 #5  0x806d2462 in VOP_LOCK1_APV (vop=0x80921fe0,
 a=0xff84867fc550) at vnode_if.c:1988
 #6  0x804a58b7 in _vn_lock (vp=0xff0103aaf760,
 flags=2097152, file=0x80736e70
 /usr/src/sys/kern/vfs_subr.c, line=2137) at vnode_if.h:859
 #7  0x80498bc0 in vget (vp=0xff0103aaf760,
 flags=2097408, td=0xff0095c008c0) at
 /usr/src/sys/kern/vfs_subr.c:2137
 #8  0x804845f4 in cache_lookup (dvp=0xff0095675b10,
 vpp=0xff84867fc910, cnp=0xff84867fc938) at
 /usr/src/sys/kern/vfs_cache.c:587
 #9  0x80484a30 in vfs_cache_lookup (ap=) at
 /usr/src/sys/kern/vfs_cache.c:905
 #10 0x806d2e7c in VOP_LOOKUP_APV (vop=0x80922820,
 a=0xff84867fc790) at vnode_if.c:123
 #11 0x8048bc80 in lookup (ndp=0xff84867fc8e0) at vnode_if.h:54
 #12 0x8048cf0e in namei (ndp=0xff84867fc8e0) at
 /usr/src/sys/kern/vfs_lookup.c:269
 #13 0x8049c972 in kern_statat_vnhook (td=0xff0095c008c0,
 flag=) at /usr/src/sys/kern/vfs_syscalls.c:2346
 #14 0x8049cbb5 in kern_statat (td=) at
 /usr/src/sys/kern/vfs_syscalls.c:2327
 #15 0x8049cc7a in lstat (td=) at
 /usr/src/sys/kern/vfs_syscalls.c:2390
 #16 0x8043e7dd in syscallenter (td=0xff0095c008c0,
 sa=0xff84867fcbb0) at /usr/src/sys/kern/subr_trap.c:326
 #17 0x8066a5eb in syscall (frame=0xff84867fcc50) at
 /usr/src/sys/amd64/amd64/trap.c:916
 #18 0x806517f2 in Xfast_syscall () at
 /usr/src/sys/amd64/amd64/exception.S:384
 #19 0x3298f75c in ?? ()
 
 The very first idea was to turn off name caching (set debug.vfscache
 to 0), but it didn't help.  The second idea was to reboot, but that
 didn't help either.
 
 This directory looks fine. It has 10 files and 1 empty directory.
 
 Have you any idea what is going on, or how to catch the problem?

Assuming this isn't a file on the root filesystem, try booting the
machine in single-user mode and using fsck -f on the filesystem in
question.

Can you verify there's no problems with the disk this file lives on as
well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
thought I'd mention it.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |
