Re: have live system with NFS client cache problems what do i do?

1999-04-13 Thread Maxim Sobolev

Alfred Perlstein wrote:

> On Mon, 12 Apr 1999, Maxim Sobolev wrote:
>
> > Alfred Perlstein wrote:
> >
> > > Hey, i was just doing a kernel compile over NFS and i have a weird
> > > situation.  After compiling everything the linker barfs on linking.
> > >
> > > gensetdefs: cd9660_bmap.o: not an ELF file
> > >
> > > for about 12 files...
> > >
> > > the compile is being done on a laptop that has my desktop's src dir
> > > NFS mounted.
> > >
> > Hey, I have pretty much the same problem on my 4.0, cvsup'ed and built a few
> > days ago!
> >
> > As NFS server I have a 3.1-stable box.
>
> Let's try to figure out some other commonalities to assist debugging this.
> can you please fill this in:
>

                Me                      You
Server version  4.0-apr6th              3.1-apr6th
 Netcard        pn0                     ed0 (Realtek 8029)
 local disk     da                      wd0
 options        softupdates             noatime
 nfsd?          yes                     yes
 ram            32meg                   24meg

Client version  4.0-apr9th              4.0-apr6th
 Netcard        ep0 3comIII pcmcia      ed0 (AR-P500 pcmcia)
 local disk     wd                      ad0
 options        softupdates             noatime
 nfsiod         don't think so          don't think so
 ram            48meg                   32meg

> Mount type:    mount server:/usr/src
>                /usr/src
>
> how bug        build kernel in
> happened:      NFS mounted /usr/src
>                make depend && make -j6 all
>
> bug:           client sees files are filled
>                with zeros, but server has
>                non-corrupted files
>
>                will not link on client
>                links fine on server
>                if the files are copied from
>                the NFS mount to local disk
>                they "un-corrupt" themselves.
>
> well?  I'm tempted to blame the 3com, but that doesn't make sense
> as when you copy to local disk the files seem to become normal again...
>
> ?

Mount type:    mount -o noatime server:/usr/src
               mount -o noatime server:/usr/ports

how bug        build ports and kernel
happened:      NFS mounted /usr/ports
               cd /usr/ports/textproc/docproj
               make all install
               NFS mounted /usr/src
               make depend && make -j4

bug:           client sees files are filled
               with zeros, but server has
               non-corrupted files

               will not link on client
               links fine on server
               if the files are copied from
               the NFS mount to local disk
               they "un-corrupt" themselves.
               running make on the client once more has a very good
               chance of linking correctly (cache flushes??)




To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-current" in the body of the message



Re: have live system with NFS client cache problems what do i do?

1999-04-12 Thread Alfred Perlstein
On Mon, 12 Apr 1999, Maxim Sobolev wrote:

> Alfred Perlstein wrote:
> 
> > Hey, i was just doing a kernel compile over NFS and i have a weird
> > situation.  After compiling everything the linker barfs on linking.
> >
> > gensetdefs: cd9660_bmap.o: not an ELF file
> >
> > for about 12 files...
> >
> > the compile is being done on a laptop that has my desktop's src dir
> > NFS mounted.
> >
> Hey, I have pretty much the same problem on my 4.0, cvsup'ed and built a few
> days ago!
> 
> As NFS server I have a 3.1-stable box.

Let's try to figure out some other commonalities to assist debugging this.  
can you please fill this in:

                Me                      You
Server version  4.0-apr6th              3.1-???
 Netcard        pn0                     ??
 local disk     da
 options        softupdates
 nfsd?          yes
 ram            32meg

Client version  4.0-apr9th              4.0-???
 Netcard        ep0 3comIII pcmcia
 local disk     wd
 options        softupdates
 nfsiod         don't think so
 ram            48meg

Mount type:     mount server:/usr/src
                /usr/src

how bug         build kernel in
happened:       NFS mounted /usr/src
                make depend && make -j6 all

bug:            client sees files are filled
                with zeros, but server has
                non-corrupted files

                will not link on client
                links fine on server
                if the files are copied from
                the NFS mount to local disk
                they "un-corrupt" themselves.

well?  I'm tempted to blame the 3com, but that doesn't make sense
as when you copy to local disk the files seem to become normal again...

-Alfred







Re: have live system with NFS client cache problems what do i do?

1999-04-12 Thread Maxim Sobolev
Hey, I have pretty much the same problem on my 4.0, cvsup'ed and built a few
days ago!

As NFS server I have a 3.1-stable box.

Sincerely,

Maxim Sobolev

Alfred Perlstein wrote:

> Hey, i was just doing a kernel compile over NFS and i have a weird
> situation.  After compiling everything the linker barfs on linking.
>
> gensetdefs: cd9660_bmap.o: not an ELF file
>
> for about 12 files...
>
> the compile is being done on a laptop that has my desktop's src dir
> NFS mounted.
>
> the card in the laptop is a 3comIII and the desktop a pn0
>
> doing a 'file cd9660_bmap.o' on laptop (NFS client) gives me a
> cd9660_bmap.o: MS Windows COFF Unknown CPU
>
> while on the server:
> strlen.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (FreeBSD), not 
> stri
>
> it seems my client is having a really tough time with these files, i have
> copied the files that the client has corrupted to localdisk but now
> they seem fine...
>
> the client laptop is unable to link the kernel; it's getting tons of
> corrupted data it seems. on the server i am able to link the kernel just
> fine.
>
> i'm using the default NFS mounts and no nfsiod.
>
> enabling nfsiod didn't help, however unmounting the NFS share and
> remounting seems to have fixed it
>
> i guess i should have taken a crash dump when the system was all
> hosed but it's fine now... *sigh*
>
> Alfred Perlstein - Admin, coder, and admirer of all things BSD.
> -- There are operating systems, and then there's FreeBSD.
> -- http://www.freebsd.org/4.0-current
>





Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Brian Feldman
On Mon, 12 Apr 1999, Stephen McKay wrote:

> On Sunday, 11th April 1999, Brian Feldman wrote:
> 
> >This has nothing to do with DOS. In case you didn't get my other hint:
> >{"/home/green"}$ dd if=/dev/zero count=1 2>/dev/null | file -
> >standard input:  MS Windows COFF Unknown CPU
> 
> Don't ya just hate it when your mail is slow!  Sigh...

Yep ;)

You know, I think a much better idea for magic(5) would be
a check for lots of NULLs and calling the file "NULL data", rather than
MS Windows COFF Unknown CPU. This identification is a bug in the magic
file, really, since you can't call a short 0x0000 any kind of magic
number!
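
A minimal sketch of the kind of check suggested here, written in Python rather than magic(5) syntax. The function name `classify` and the labels it returns are hypothetical; the only facts relied on are the ELF magic bytes and the observation from this thread that a zeroed header satisfies a test for a zero-valued little-endian short at offset 0.

```python
import struct

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF object


def classify(head):
    """Label the leading bytes of a file (hypothetical helper, not file(1))."""
    if head.startswith(ELF_MAGIC):
        return "ELF"
    # Check for "lots of NULLs" before any rule that keys on a zero value.
    if head and not any(head):
        return "NULL data"
    # A rule that only tests for a zero little-endian short at offset 0
    # also matches any data that merely begins with two zero bytes.
    if len(head) >= 2 and struct.unpack_from("<H", head, 0)[0] == 0:
        return "zero leshort"
    return "unknown"
```

Ordering matters in this sketch: the all-zero test has to run before the zero-short test, otherwise zero-filled data is reported under whatever name the zero-short rule carries.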

> 
> Stephen.

 Brian Feldman_ __ ___   ___ ___ ___  
 gr...@unixhelp.org_ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!  _ __ | _ \__ \ |) |
 http://www.freebsd.org   _ |___/___/___/ 






Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Stephen McKay
On Sunday, 11th April 1999, Brian Feldman wrote:

>This has nothing to do with DOS. In case you didn't get my other hint:
>{"/home/green"}$ dd if=/dev/zero count=1 2>/dev/null | file -
>standard input:  MS Windows COFF Unknown CPU

Don't ya just hate it when your mail is slow!  Sigh...

Stephen.





Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Stephen McKay
On Sunday, 11th April 1999, Alfred Perlstein wrote:

>On Sun, 11 Apr 1999, Matthew Dillon wrote:
>
>> doing a 'file cd9660_bmap.o' on laptop (NFS client) gives me a 
>> cd9660_bmap.o: MS Windows COFF Unknown CPU
>> 
>> An MS Windows binary?  Do you have any msdos mounts on
>> the client or server?  How is /usr/obj mounted?

>no i have no msdos mounted filesystems, i do however have an
>unmounted win98 partition and a cdrom with joliet extensions mounted
>however the cdrom only contains mp3s.

This is a red herring:

$ dd if=/dev/zero of=foo count=1
1+0 records in
1+0 records out
512 bytes transferred in 0.000114 secs (4487949 bytes/sec)
$ file foo
foo: MS Windows COFF Unknown CPU
$

Look for the usual pack-of-nulls corruption instead.

Stephen.
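
One way to look for that kind of corruption is to scan a suspect file for runs of all-zero blocks. This is only a sketch, assuming Python 3 is available on the client; the name `zero_runs` and the 512-byte block size are illustrative choices and not something taken from this thread.

```python
def zero_runs(path, block_size=512):
    """Return (offset, length) pairs for each run of all-zero blocks in path."""
    runs = []
    run_start = None  # offset where the current zero run began, if any
    offset = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            if not any(block):  # every byte in this block is zero
                if run_start is None:
                    run_start = offset
            elif run_start is not None:
                runs.append((run_start, offset - run_start))
                run_start = None
            offset += len(block)
    if run_start is not None:  # file ended inside a zero run
        runs.append((run_start, offset - run_start))
    return runs
```

An object file that starts with a zero run at offset 0 would fit the symptom reported here, where the ELF header is missing on the client. Legitimate object files can contain zero-filled regions too, so a hit further into the file is a hint, not proof.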





Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Matthew Dillon
:> corruption is occurring on the client.  But if the make procedure is not
:> accessing (much of) the client's hard drive, where on the client could 
:> the corruption be coming from?
:
:This has nothing to do with DOS. In case you didn't get my other hint:
:{"/home/green"}$ dd if=/dev/zero count=1 2>/dev/null | file -
:standard input:  MS Windows COFF Unknown CPU

Ah.  In that case, next time it happens you need to make a copy of
the broken file and a copy of the good file and save them, so we 
can look at the hexdumps.
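
Once both copies are saved, something like the following could narrow down where they differ before reading hexdumps by eye. It is a sketch only: the helper names `differing_blocks` and `hexdump_block`, the 512-byte block size, and the output layout are all assumptions made for illustration.

```python
def differing_blocks(good, bad, block_size=512):
    """Return indexes of fixed-size blocks where the two byte strings differ."""
    n_blocks = (max(len(good), len(bad)) + block_size - 1) // block_size
    return [
        i
        for i in range(n_blocks)
        if good[i * block_size:(i + 1) * block_size]
        != bad[i * block_size:(i + 1) * block_size]
    ]


def hexdump_block(data, index, block_size=512, width=16):
    """Format one block as 'offset  hex bytes' lines, hexdump-like."""
    start = index * block_size
    end = min(start + block_size, len(data))
    lines = []
    for off in range(start, end, width):
        chunk = data[off:min(off + width, end)]
        lines.append(f"{off:08x}  " + " ".join(f"{b:02x}" for b in chunk))
    return "\n".join(lines)
```

Reading the two saved copies with `open(path, "rb").read()` and passing them to `differing_blocks` gives the block numbers worth dumping; whether the damaged regions line up on block or page boundaries is the sort of detail that would help here.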

If 'cp' does not copy the corrupted file, try using 'cat' to copy 
it.  'cp' uses mmap.  'cat' does not.

There are still a few minor issues when mmap()ing NFS files, but
none that ought to affect the beginning of the file.  We'll have
to figure out a way to reproduce the problem to really track it
down.
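
The read-versus-mmap distinction can also be tested directly on a suspect file. The sketch below assumes Python 3 on the client and uses hypothetical helper names; it reads the same file once through read() and once through a read-only mapping, then reports the first offset at which the two views disagree.

```python
import mmap
import os


def read_via_read(path):
    """Fetch the file contents with ordinary read() calls."""
    with open(path, "rb") as f:
        return f.read()


def read_via_mmap(path):
    """Fetch the file contents through a read-only memory mapping."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return b""  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if identical."""
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            return i
    return None if len(a) == len(b) else limit
```

On a healthy mount the two views should be identical, so `first_difference(read_via_read(p), read_via_mmap(p))` returning anything other than `None` for a file `p` on the NFS mount would point at the mapping path.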


-Matt







Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Brian Feldman
On Sun, 11 Apr 1999, Matthew Dillon wrote:

> :/usr/obj isn't mounted, i'm just compiling a kernel.
> :
> :no i have no msdos mounted filesystems, i do however have an
> :unmounted win98 partition and a cdrom with joliet extensions mounted
> :however the cdrom only contains mp3s.
> :
> :After doing more data manipulation (copying files around to flush
> :the NFS cache) it seems to reload the data then it finds them ok
> :and tries to link, during the link i get missing references to
> :several symbols, symbol sizes changed etc etc...
> :
> :Just seems like bad data.
> :
> :Now if i just go into the dir on the server and link the kernel
> :it's fine, no problems whatsoever.  (compile on local disk)
> :
> :-Alfred
> :
> :PS, i suspect the 3comIII card in the laptop # ep: 3Com 3C509 (buggy)
> :
> :NFS performance and stability in 3.1-stable and 4.0 have been superb lately.
> :one reboot when i was killing a low ram NFS server a few weeks ago and just
> :this today (which could be the NIC) otherwise very impressive.
> 
> Ok.  This is something to watch, then.  The corruption is worrisome.  It
> is odd to get this sort of corruption in a client's buffer cache when
> all the file I/O is running over NFS.   The real question is : where did
> the corruption come from?  The client's IDE drive or something originally
> on the server?
> 
> Are there any dos partitions on the server at all?  If not, then the
> corruption is occurring on the client.  But if the make procedure is not
> accessing (much of) the client's hard drive, where on the client could 
> the corruption be coming from?

This has nothing to do with DOS. In case you didn't get my other hint:
{"/home/green"}$ dd if=/dev/zero count=1 2>/dev/null | file -
standard input:  MS Windows COFF Unknown CPU


> 
>   -Matt
>   Matthew Dillon 

 Brian Feldman_ __ ___   ___ ___ ___  
 gr...@unixhelp.org_ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!  _ __ | _ \__ \ |) |
 http://www.freebsd.org   _ |___/___/___/ 






Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Brian Feldman
On Sun, 11 Apr 1999, Matthew Dillon wrote:

> :-current as of tuesday night. although the laptop is now moved
> :to -current as of today.
> :
> :i have 192.168.1.44:/usr/src on /usr/src
> :
> :this is only building the kernel in /usr/src/sys/compile/laptop
> :
> :server:
> :FreeBSD myname.my.domain 4.0-CURRENT FreeBSD 4.0-CURRENT #0: Fri Apr  9 
> 11:34:01 PDT 1999 bri...@myname.my.domain:/usr/src/sys/compile/halah  i386
> :
> :client:
> :FreeBSD myname.my.domain 4.0-CURRENT FreeBSD 4.0-CURRENT #1: Sun Apr 11 
> 17:46:19 PDT 1999 bri...@myname.my.domain:/usr/src/sys/compile/laptop  
> i386
> :
> :i think it may be easily reproducible.
> :
> :-Alfred
> 
> This is very odd:
> 
> doing a 'file cd9660_bmap.o' on laptop (NFS client) gives me a 
> cd9660_bmap.o: MS Windows COFF Unknown CPU
> 
> An MS Windows binary?  Do you have any msdos mounts on
> the client or server?  How is /usr/obj mounted?

Hey Matt,
0       leshort         0x0000  MS Windows COFF Unknown CPU

;)

> 
>   -Matt
>   Matthew Dillon 

 Brian Feldman_ __ ___   ___ ___ ___  
 gr...@unixhelp.org_ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!  _ __ | _ \__ \ |) |
 http://www.freebsd.org   _ |___/___/___/ 






Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Matthew Dillon
:/usr/obj isn't mounted, i'm just compiling a kernel.
:
:no i have no msdos mounted filesystems, i do however have an
:unmounted win98 partition and a cdrom with joliet extensions mounted
:however the cdrom only contains mp3s.
:
:After doing more data manipulation (copying files around to flush
:the NFS cache) it seems to reload the data then it finds them ok
:and tries to link, during the link i get missing references to
:several symbols, symbol sizes changed etc etc...
:
:Just seems like bad data.
:
:Now if i just go into the dir on the server and link the kernel
:it's fine, no problems whatsoever.  (compile on local disk)
:
:-Alfred
:
:PS, i suspect the 3comIII card in the laptop # ep: 3Com 3C509 (buggy)
:
:NFS performance and stability in 3.1-stable and 4.0 have been superb lately.
:one reboot when i was killing a low ram NFS server a few weeks ago and just
:this today (which could be the NIC) otherwise very impressive.

Ok.  This is something to watch, then.  The corruption is worrisome.  It
is odd to get this sort of corruption in a client's buffer cache when
all the file I/O is running over NFS.   The real question is : where did
the corruption come from?  The client's IDE drive or something originally
on the server?

Are there any dos partitions on the server at all?  If not, then the
corruption is occurring on the client.  But if the make procedure is not
accessing (much of) the client's hard drive, where on the client could 
the corruption be coming from?

-Matt
Matthew Dillon 







Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Alfred Perlstein
On Sun, 11 Apr 1999, Matthew Dillon wrote:

> :-current as of tuesday night. although the laptop is now moved
> :to -current as of today.
> :
> :i have 192.168.1.44:/usr/src on /usr/src
> :
> :this is only building the kernel in /usr/src/sys/compile/laptop
> :
> :server:
> :FreeBSD myname.my.domain 4.0-CURRENT FreeBSD 4.0-CURRENT #0: Fri Apr  9 
> 11:34:01 PDT 1999 bri...@myname.my.domain:/usr/src/sys/compile/halah  i386
> :
> :client:
> :FreeBSD myname.my.domain 4.0-CURRENT FreeBSD 4.0-CURRENT #1: Sun Apr 11 
> 17:46:19 PDT 1999 bri...@myname.my.domain:/usr/src/sys/compile/laptop  
> i386
> :
> :i think it may be easily reproducible.
> :
> :-Alfred
> 
> This is very odd:
> 
> doing a 'file cd9660_bmap.o' on laptop (NFS client) gives me a 
> cd9660_bmap.o: MS Windows COFF Unknown CPU
> 
> An MS Windows binary?  Do you have any msdos mounts on
> the client or server?  How is /usr/obj mounted?

/usr/obj isn't mounted, i'm just compiling a kernel.

no i have no msdos mounted filesystems, i do however have an
unmounted win98 partition and a cdrom with joliet extensions mounted
however the cdrom only contains mp3s.

After doing more data manipulation (copying files around to flush
the NFS cache) it seems to reload the data then it finds them ok
and tries to link, during the link i get missing references to
several symbols, symbol sizes changed etc etc...

Just seems like bad data.

Now if i just go into the dir on the server and link the kernel
it's fine, no problems whatsoever.  (compile on local disk)

-Alfred

PS, i suspect the 3comIII card in the laptop # ep: 3Com 3C509 (buggy)

NFS performance and stability in 3.1-stable and 4.0 have been superb lately.
one reboot when i was killing a low ram NFS server a few weeks ago and just
this today (which could be the NIC) otherwise very impressive.







Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Matthew Dillon
:-current as of tuesday night. although the laptop is now moved
:to -current as of today.
:
:i have 192.168.1.44:/usr/src on /usr/src
:
:this is only building the kernel in /usr/src/sys/compile/laptop
:
:server:
:FreeBSD myname.my.domain 4.0-CURRENT FreeBSD 4.0-CURRENT #0: Fri Apr  9 
11:34:01 PDT 1999 bri...@myname.my.domain:/usr/src/sys/compile/halah  i386
:
:client:
:FreeBSD myname.my.domain 4.0-CURRENT FreeBSD 4.0-CURRENT #1: Sun Apr 11 
17:46:19 PDT 1999 bri...@myname.my.domain:/usr/src/sys/compile/laptop  i386
:
:i think it may be easily reproducible.
:
:-Alfred

This is very odd:

doing a 'file cd9660_bmap.o' on laptop (NFS client) gives me a 
cd9660_bmap.o: MS Windows COFF Unknown CPU

An MS Windows binary?  Do you have any msdos mounts on
the client or server?  How is /usr/obj mounted?

-Matt
Matthew Dillon 







Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Alfred Perlstein
On Sun, 11 Apr 1999, Matthew Dillon wrote:

> :Hey, i was just doing a kernel compile over NFS and i have a weird
> :situation.  After compiling everything the linker barfs on linking.
> :
> :gensetdefs: cd9660_bmap.o: not an ELF file
> 
> What exact release of the kernel is running on the client and on the
> server?
> 
> What is being NFS mounted?  src tree?  obj tree?  both?

-current as of tuesday night. although the laptop is now moved
to -current as of today.

i have 192.168.1.44:/usr/src on /usr/src

this is only building the kernel in /usr/src/sys/compile/laptop

server:
FreeBSD myname.my.domain 4.0-CURRENT FreeBSD 4.0-CURRENT #0: Fri Apr  9 
11:34:01 PDT 1999 bri...@myname.my.domain:/usr/src/sys/compile/halah  i386

client:
FreeBSD myname.my.domain 4.0-CURRENT FreeBSD 4.0-CURRENT #1: Sun Apr 11 
17:46:19 PDT 1999 bri...@myname.my.domain:/usr/src/sys/compile/laptop  i386

i think it may be easily reproducible.

-Alfred








Re: have live system with NFS client cache problems what do i do?

1999-04-11 Thread Matthew Dillon
:Hey, i was just doing a kernel compile over NFS and i have a weird
:situation.  After compiling everything the linker barfs on linking.
:
:gensetdefs: cd9660_bmap.o: not an ELF file

What exact release of the kernel is running on the client and on the
server?

What is being NFS mounted?  src tree?  obj tree?  both?

-Matt
Matthew Dillon 



