Re: Disk/FS I/O issues in -CURRENT

2003-07-02 Thread Alan L. Cox
David O'Brien wrote:
> 
> On Mon, Jun 30, 2003 at 03:30:06PM -0500, Alan L. Cox wrote:
> > I've been able to reproduce what I believe is the problem.  (In my case,
> > I reset my machine and watched the background fsck slowly grind to a
> > halt.  Foreground fsck is fine.)
> >
> > The problem actually appears to be in vm_page_alloc(), not
> > vm_pageout.c.  Look for a commit to resolve this in a few hours.
> 
> Has this been committed?  I *so* ran into this yesterday (3 times) with a
> 70 GB /usr and a June 30th kernel.
> 

Yes.

Alan
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Disk/FS I/O issues in -CURRENT

2003-07-02 Thread David O'Brien
On Mon, Jun 30, 2003 at 03:30:06PM -0500, Alan L. Cox wrote:
> I've been able to reproduce what I believe is the problem.  (In my case,
> I reset my machine and watched the background fsck slowly grind to a
> halt.  Foreground fsck is fine.)
> 
> The problem actually appears to be in vm_page_alloc(), not
> vm_pageout.c.  Look for a commit to resolve this in a few hours.

Has this been committed?  I *so* ran into this yesterday (3 times) with a
70 GB /usr and a June 30th kernel.
 
-- 
-- David  ([EMAIL PROTECTED])


Re: Disk/FS I/O issues in -CURRENT

2003-07-01 Thread Eirik Oeverby
Thanks. I'll be keeping my eyes wide open for this one.

/Eirik

On Mon, 30 Jun 2003 15:30:06 -0500
"Alan L. Cox" <[EMAIL PROTECTED]> wrote:

> Peter Holm wrote:
> > 
> > On Mon, Jun 30, 2003 at 03:26:13PM +0200, Eirik Oeverby wrote:
> > > Hi,
> > >
> > > Good to see I'm not the only one.
> > > I'm currently going back to a kernel dated 2003.06.27.12.00.00,
> > > and I'll test again with that one.
> > >
> > 
> > Ok.
> > 
> > I see that alc@ made some recent changes to the vm (vm_pageout.c).
> > I don't know if there's any connection to this problem?
> > 
> 
> I've been able to reproduce what I believe is the problem.  (In my
> case, I reset my machine and watched the background fsck slowly grind
> to a halt.  Foreground fsck is fine.)
> 
> The problem actually appears to be in vm_page_alloc(), not
> vm_pageout.c.  Look for a commit to resolve this in a few hours.
> 
> Regards,
> Alan






Re: Disk/FS I/O issues in -CURRENT

2003-06-30 Thread Alan L. Cox
Peter Holm wrote:
> 
> On Mon, Jun 30, 2003 at 03:26:13PM +0200, Eirik Oeverby wrote:
> > Hi,
> >
> > Good to see I'm not the only one.
> > I'm currently going back to a kernel dated 2003.06.27.12.00.00, and I'll test
> > again with that one.
> >
> 
> Ok.
> 
> I see that alc@ made some recent changes to the vm (vm_pageout.c).
> I don't know if there's any connection to this problem?
> 

I've been able to reproduce what I believe is the problem.  (In my case,
I reset my machine and watched the background fsck slowly grind to a
halt.  Foreground fsck is fine.)

The problem actually appears to be in vm_page_alloc(), not
vm_pageout.c.  Look for a commit to resolve this in a few hours.

Regards,
Alan


Re: Disk/FS I/O issues in -CURRENT

2003-06-30 Thread Eirik Oeverby
Hi,

I've found a kernel that works, from the 20th of June. 
The regression happened somewhere between 2003.06.20.12.00.00 and
2003.06.27.12.00.00 ... Hope that helps. :)
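
The window between a known-good and a known-bad date can be narrowed by
bisecting on the supfile date. A minimal sketch, assuming a hypothetical
is_bad() callback that stands for "check out sources as of that date, build
and boot the kernel, run the large-file test by hand"; the function names
and the 12-hour stopping step are illustrative only:

```python
from datetime import datetime, timedelta


def supfile_date(ts: datetime) -> str:
    """Format a timestamp the way dates are written in this thread."""
    return ts.strftime("%Y.%m.%d.%H.%M.%S")


def bisect_dates(good, bad, is_bad, step=timedelta(hours=12)):
    """Halve the [good, bad] window until it is no wider than `step`.

    `is_bad(ts)` is a hypothetical callback (an assumption, not a real
    tool): it should return True if a kernel built from sources dated
    `ts` shows the lockup.
    """
    while bad - good > step:
        mid = good + (bad - good) / 2
        if is_bad(mid):
            bad = mid      # regression is at or before mid
        else:
            good = mid     # regression is after mid
    return good, bad
```

Each round costs one kernel build, so a 7-day window takes about four
builds to get below 12 hours.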

/Eirik

On Mon, 30 Jun 2003 10:27:08 -0400
Josh Elsasser <[EMAIL PROTECTED]> wrote:

> I think I experienced the same bug on my Sony Vaio FX200 with -CURRENT
> from Sat Jun 28.  I had an unrelated panic, and after rebooting, the
> machine locked up after a minute or so during the background fsck.
> After rebooting several times, I finally had to boot it single-user
> and fsck -y, which did not lock it up.  Perhaps creating/using the
> filesystem snapshots was triggering the lockup.
> 
>  -jre
> 
> On Mon, Jun 30, 2003 at 03:42:58PM +0200, Eirik Oeverby wrote:
> > Hi,
> > 
> > Kernel from 27.06 has same behaviour. I would prefer not to have to
> > install yet another kernel right now, since I need to get some work
> > done. If anyone else has any possible clues as to when this
> > regression happened, that would help me (or whoever else would want
> > > to test by adding date=yyyy.mm.dd.hh.mm.ss to their supfile)
> > determine what date to pick for testing.
> > 
> > /Eirik
> > 
> > On Mon, 30 Jun 2003 23:32:26 +1000
> > Mark Sergeant <[EMAIL PROTECTED]> wrote:
> > 
> > > > I get the same problem on an SMP machine and my laptop, both running
> > > > kernels from today.
> > > 
> > > On Mon, Jun 30, 2003 at 03:26:13PM +0200, Eirik Oeverby wrote:
> > > > Hi,
> > > > 
> > > > Good to see I'm not the only one.
> > > > I'm currently going back to a kernel dated 2003.06.27.12.00.00,
> > > > and I'll test again with that one.
> > > > 
> > > > /Eirik
> > > > 
> > > > On Mon, 30 Jun 2003, Peter Holm wrote:






Re: Disk/FS I/O issues in -CURRENT

2003-06-30 Thread Josh Elsasser
I think I experienced the same bug on my Sony Vaio FX200 with -CURRENT
from Sat Jun 28.  I had an unrelated panic, and after rebooting, the
machine locked up after a minute or so during the background fsck.
After rebooting several times, I finally had to boot it single-user
and fsck -y, which did not lock it up.  Perhaps creating/using the
filesystem snapshots was triggering the lockup.

 -jre

On Mon, Jun 30, 2003 at 03:42:58PM +0200, Eirik Oeverby wrote:
> Hi,
> 
> Kernel from 27.06 has same behaviour. I would prefer not to have to
> install yet another kernel right now, since I need to get some work
> done. If anyone else has any possible clues as to when this regression
> happened, that would help me (or whoever else would want to test by
> adding date=yyyy.mm.dd.hh.mm.ss to their supfile) determine what date to
> pick for testing.
> 
> /Eirik
> 
> On Mon, 30 Jun 2003 23:32:26 +1000
> Mark Sergeant <[EMAIL PROTECTED]> wrote:
> 
> > I get the same problem on an SMP machine and my laptop, both running
> > kernels from today.
> > 
> > On Mon, Jun 30, 2003 at 03:26:13PM +0200, Eirik Oeverby wrote:
> > > Hi,
> > > 
> > > Good to see I'm not the only one.
> > > I'm currently going back to a kernel dated 2003.06.27.12.00.00, and
> > > I'll test again with that one.
> > > 
> > > /Eirik
> > > 
> > > On Mon, 30 Jun 2003, Peter Holm wrote:


Re: Disk/FS I/O issues in -CURRENT

2003-06-30 Thread Eirik Oeverby
Hi,

Kernel from 27.06 has same behaviour. I would prefer not to have to
install yet another kernel right now, since I need to get some work
done. If anyone else has any possible clues as to when this regression
happened, that would help me (or whoever else would want to test by
adding date=yyyy.mm.dd.hh.mm.ss to their supfile) determine what date to
pick for testing.

/Eirik

On Mon, 30 Jun 2003 23:32:26 +1000
Mark Sergeant <[EMAIL PROTECTED]> wrote:

> I get the same problem on an SMP machine and my laptop, both running
> kernels from today.
> 
> On Mon, Jun 30, 2003 at 03:26:13PM +0200, Eirik Oeverby wrote:
> > Hi,
> > 
> > Good to see I'm not the only one.
> > I'm currently going back to a kernel dated 2003.06.27.12.00.00, and
> > I'll test again with that one.
> > 
> > /Eirik
> > 
> > On Mon, 30 Jun 2003, Peter Holm wrote:






Re: Disk/FS I/O issues in -CURRENT

2003-06-30 Thread Mark Sergeant
I get the same problem on an SMP machine and my laptop, both running
kernels from today.

On Mon, Jun 30, 2003 at 03:26:13PM +0200, Eirik Oeverby wrote:
> Hi,
> 
> Good to see I'm not the only one.
> I'm currently going back to a kernel dated 2003.06.27.12.00.00, and I'll test
> again with that one.
> 
> /Eirik
> 
> On Mon, 30 Jun 2003, Peter Holm wrote:


Re: Disk/FS I/O issues in -CURRENT

2003-06-30 Thread Eirik Oeverby
Hi,

Good to see I'm not the only one.
I'm currently going back to a kernel dated 2003.06.27.12.00.00, and I'll test
again with that one.

/Eirik

On Mon, 30 Jun 2003, Peter Holm wrote:

> On Mon, Jun 30, 2003 at 01:42:15PM +0200, Eirik Oeverby wrote:
> > Hi folks,
> >
> > I am having some very weird problems on my laptop,
>
> I can repeat the problem (noticed with savecore) on a
> kernel from Jun 30 05:23 UTC:
>
> current# df -h .
> Filesystem     Size   Used  Avail Capacity  Mounted on
> /dev/ad0s1f    8.2G   1.9G   5.6G    25%    /usr
> current# dd if=/dev/zero of=100mb bs=1024 count=102400
> load: 4.04  cmd: dd 25063 [running] 0.33u 28.67s 4% 100k
> 97657+0 records in
> 97657+0 records out
> 100000768 bytes transferred in 48.549837 secs (2059755 bytes/sec)
>
> db> ps
>   pid   proc     addr    uid  ppid  pgrp  flag    stat    wmesg  wchan  cmd
> 25063 c1e5d3c8 cd19d000    0 25060 25063 0004002 [RUNQ]                 dd
>     8 c197ed3c ccc9e000    0     0     0 204     [CPU 0]                pagedaemon
>
> db> t 8
> siointr1(c0b6c800,0,c051fcc5,693,c867abf0) at siointr1+0xd5
> siointr(c0b6c800) at siointr+0x35
> Xfastintr4() at Xfastintr4+0x63
> --- interrupt, eip = 0xc0377480, esp = 0xc867abdc, ebp = 0xc867abf0 ---
> strncmp(c051e785,c051df68,123,0,477b) at strncmp
> witness_unlock(c059c280,8,c051e785,35c,1) at witness_unlock+0x5a
> _mtx_unlock_flags(c059c280,0,c051e785,35c,c0ba69ec) at _mtx_unlock_flags+0x80
> vm_pageout_scan(0,0,c051e785,5dd,1f4) at vm_pageout_scan+0x40c
> vm_pageout(0,c867ad48,c0505bfc,312,0) at vm_pageout+0x2ce
> fork_exit(c04688d0,0,c867ad48) at fork_exit+0xc0
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xc867ad7c, ebp = 0 ---
> db>
>


Re: Disk/FS I/O issues in -CURRENT

2003-06-30 Thread Peter Holm
On Mon, Jun 30, 2003 at 01:42:15PM +0200, Eirik Oeverby wrote:
> Hi folks,
> 
> I am having some very weird problems on my laptop,

I can repeat the problem (noticed with savecore) on a
kernel from Jun 30 05:23 UTC:

current# df -h .
Filesystem     Size   Used  Avail Capacity  Mounted on
/dev/ad0s1f    8.2G   1.9G   5.6G    25%    /usr
current# dd if=/dev/zero of=100mb bs=1024 count=102400
load: 4.04  cmd: dd 25063 [running] 0.33u 28.67s 4% 100k
97657+0 records in
97657+0 records out
100000768 bytes transferred in 48.549837 secs (2059755 bytes/sec)
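
As a sanity check on the dd summary, the record count, block size and
elapsed time pin down both the byte total and the rate. A small sketch of
that arithmetic (dd_summary is a made-up helper name, not part of dd):

```python
def dd_summary(records: int, block_size: int, seconds: float):
    """Return (total bytes, bytes per second) for a dd run."""
    total_bytes = records * block_size
    return total_bytes, total_bytes / seconds


# 97657 full records of bs=1024, finished in 48.549837 seconds.
total, rate = dd_summary(97657, 1024, 48.549837)
print(total)   # 100000768
print(rate)    # should land very close to the reported 2059755 bytes/sec
```

That is roughly 2 MB/s for a plain sequential write of zeros, with most of
the elapsed time spent in the kernel (28.67s system time in the status
line above).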

db> ps
  pid   proc     addr    uid  ppid  pgrp  flag    stat    wmesg  wchan  cmd
25063 c1e5d3c8 cd19d000    0 25060 25063 0004002 [RUNQ]                 dd
    8 c197ed3c ccc9e000    0     0     0 204     [CPU 0]                pagedaemon

db> t 8
siointr1(c0b6c800,0,c051fcc5,693,c867abf0) at siointr1+0xd5
siointr(c0b6c800) at siointr+0x35
Xfastintr4() at Xfastintr4+0x63
--- interrupt, eip = 0xc0377480, esp = 0xc867abdc, ebp = 0xc867abf0 ---
strncmp(c051e785,c051df68,123,0,477b) at strncmp
witness_unlock(c059c280,8,c051e785,35c,1) at witness_unlock+0x5a
_mtx_unlock_flags(c059c280,0,c051e785,35c,c0ba69ec) at _mtx_unlock_flags+0x80
vm_pageout_scan(0,0,c051e785,5dd,1f4) at vm_pageout_scan+0x40c
vm_pageout(0,c867ad48,c0505bfc,312,0) at vm_pageout+0x2ce
fork_exit(c04688d0,0,c867ad48) at fork_exit+0xc0
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xc867ad7c, ebp = 0 ---
db> 



Disk/FS I/O issues in -CURRENT

2003-06-30 Thread Eirik Oeverby
Hi folks,

I am having some very weird problems on my laptop, which I at first
thought were due to a failing drive. However, after a lot of testing and
trial and error I've concluded that it must be a software problem - I'm
smelling filesystem issues or something.

Whenever I transfer large files to or from my hard drive, I will *always*
get a lockup after ~480 MB transferred. Example scenario:
- I use another machine to FTP to my laptop.
- I start downloading a file of about 700 MB, and after ~480 MB it
simply stops. The HD LED on my laptop goes off, I can switch VTs but
nothing else. Even keyboard input is non-functional.
- After a few seconds another small burst of data (perhaps a few hundred
KB or something) will go through, keyboard buffers will be processed,
and then it freezes again with only one option - power off. C-A-D is a
no-go. The IP stack is up though, I can ping the machine, and I can
still change VTs.
- If I pause/cancel the download immediately when it freezes, my laptop
will come back to life a few seconds later. I can then start downloading
another file, or resume the file I had started - everything seems to be
back to normal. But whenever I hit ~480 MB transferred from *one* file,
it will freeze again.
- Downloading many smaller files in one batch, that sum up to >500 MB,
is not a problem.

All of the above points apply to data transfer in the opposite direction
as well.
Another way to reproduce this is doing a 'dd if=/dev/random of=testfile
bs=8192', or the other way round - eventually (and after between 450 and
500 MB - a bit hard to tell) it will freeze.
However, if I use a partition as the output instead of a file, it will
*NOT* encounter any such problems.
Transferring files between two partitions on the same HD also seems to
work fine - at least when using Midnight Commander.
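
The file-write reproduction can also be scripted so that progress is logged
as the file grows, which makes the stall point easier to pin down than
watching dd. A minimal sketch in Python, assuming zero-filled blocks rather
than /dev/random; write_blocks and its report_every parameter are made-up
names for illustration:

```python
import os
import time


def write_blocks(path, total_bytes, block_size=8192,
                 report_every=50 * 1024 * 1024):
    """Write `total_bytes` of zeros to `path` in `block_size` chunks.

    Returns a list of (bytes_written, seconds_elapsed) samples, one per
    `report_every` bytes, so a stall shows up as a jump in elapsed time.
    """
    block = bytes(block_size)
    samples = []
    written = 0
    next_report = report_every
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
            if written >= next_report:
                # Push the data to disk before taking a timing sample.
                f.flush()
                os.fsync(f.fileno())
                samples.append((written, time.monotonic() - start))
                next_report += report_every
    return samples
```

For example, write_blocks("testfile", 700 * 1024 * 1024) would cross the
450-500 MB range where the freeze was seen; a stall should appear as a
large gap between two consecutive samples.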

I first encountered the problem when using unison to sync my /home with
another machine, and it got a hiccup when it started finding these large
files (ISOs and the like) and was scanning them to produce a checksum.
Once I moved the large (>450 MB) files out of the way, it worked nicely.
When I started copying the files over manually, I experienced the
freezes again. Moving the files to another partition before copying them
to the external machine did not help the issue.

I have run the Drive Fitness Test from IBM (this is a ThinkPad T21)
twice, and it finds no errors whatsoever. And I know for a fact that
this problem did *not* occur with 5.1-RELEASE, as it was with that
version I copied the files to my disk in the first place.

I hope this information makes sense to someone. I suppose this is the
punishment for following -CURRENT - but unfortunately 5.1-RELEASE had
other issues that I couldn't live with.


Best regards,
/Eirik



