Re: [Stable 7] CPIO breakage/

2010-06-15 Thread Sean Bruno
On Tue, 2010-06-15 at 16:30 -0700, Xin LI wrote:
> Hi, Sean,
> 
> On 2010/06/15 15:10, Sean Bruno wrote:
> > http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361
> > 
> > I'm not sure what's up with this update, but it hosed up the default
> > behavior of cpio.
> [...]
> > We've had to revert this change from our local tree, suggestions?
> 
> Could you please test the attached patch?
> 
> Cheers,
> - -- 
> Xin LI   http://www.delphij.net/

Xin:

I will test it in the morning after my latest builds go out the door.
Thank you for the update and quick response.

Sean

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: [Stable 7] CPIO breakage/

2010-06-15 Thread Doug Barton

On 06/15/10 22:53, Daniel Braniss wrote:

> A similar 'security feature' was introduced some time ago, which 'silently'
> broke the firefox installation: it refused to allow symlinks in the
> destination directory. Of course the error was ignored by 'make install', so
> it took some time to find out that nothing was installed - my /usr/local is
> symlinked. The solution was to 'fix' cpio to behave as before, since adding
> the ignore-symlinks feature to firefox's makefile was beyond me :-)


I'm sorry to hear that you had problems with this, but I'd like to take 
this opportunity to make a plea that when you (pl.) run into problems 
like this that you report them when they happen. I know that after 
taking time to track down problems the last thing you want to do is take 
MORE time to report them, but the 5 minutes you spend reporting it today 
can save hours for other users down the road.



Doug

--

... and that's just a little bit of history repeating.
-- Propellerheads



Re: [Stable 7] CPIO breakage/

2010-06-15 Thread Daniel Braniss
> On 2010/06/15 17:05, Sean Bruno wrote:
> > On Tue, 2010-06-15 at 17:10 -0500, Sean Bruno wrote:
> >> http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361
> >>
> >> I'm not sure what's up with this update, but it hosed up the default
> >> behavior of cpio.
> >>
> >> It appears now that -o won't do the same things that it used to:
> >>
> >> + cd /
> >> + find -x .
> >> + egrep -v '^\.(/snap|/usr/sup|/boot/kernel/kernel
> >> \.[[:alpha:]_]+\.[[:digit:]]+|/boot/kernel/kernel
> >> \.old|/etc/start_if.*|/etc/ssh/ssh_host_.*key|/etc/hostid|/etc/(master.passwd|passwd|spwd.db|pwd.db))'
> >> + '[' -n '' ']'
> >> + '[' 7 = 4 ']'
> >> + '[' -n '' -a -z '' ']'
> >> + '[' -n /home/backup ']'
> >> + echo 'dumping / ...'
> >> dumping / ...
> >> + cpio -o --quiet --format crc -O /home/backup/root.amd64.cpio
> >> cpio: ./dev not dumped: minor number would be truncated
> >> cpio: Removing leading `/' from member names
> >> cpio: ./proc not dumped: minor number would be truncated
> >> cpio: Removing leading `../' from member names
> >>
> >> We've had to revert this change from our local tree, suggestions?
> >>
> >> Sean
> > 
> > 
> > A little more background.  It looks like symlinks are getting stripped
> > of their '/' which sucks.  Ideas?
> > 
> > Sean
> > 
> > e.g. /home/foo/bar -> /opt/baz/blob
> > 
> > becomes
> > 
> > home/foo/bar -> opt/baz/blob   
> > 
> > Yuck.
> 
> This is a security measure, I think.
> 
> --absolute-filenames disables this behavior.

A similar 'security feature' was introduced some time ago, which 'silently'
broke the firefox installation: it refused to allow symlinks in the
destination directory. Of course the error was ignored by 'make install', so
it took some time to find out that nothing was installed - my /usr/local is
symlinked. The solution was to 'fix' cpio to behave as before, since adding
the ignore-symlinks feature to firefox's makefile was beyond me :-)

danny




Re: [Stable 7] CPIO breakage/

2010-06-15 Thread Scott Long
On Jun 15, 2010, at 6:22 PM, Xin LI wrote:
> On 2010/06/15 17:05, Sean Bruno wrote:
>> On Tue, 2010-06-15 at 17:10 -0500, Sean Bruno wrote:
>>> http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361
>>> 
>>> I'm not sure what's up with this update, but it hosed up the default
>>> behavior of cpio.
>>> 
>>> It appears now that -o won't do the same things that it used to:
>>> 
>>> + cd /
>>> + find -x .
>>> + egrep -v '^\.(/snap|/usr/sup|/boot/kernel/kernel
>>> \.[[:alpha:]_]+\.[[:digit:]]+|/boot/kernel/kernel
>>> \.old|/etc/start_if.*|/etc/ssh/ssh_host_.*key|/etc/hostid|/etc/(master.passwd|passwd|spwd.db|pwd.db))'
>>> + '[' -n '' ']'
>>> + '[' 7 = 4 ']'
>>> + '[' -n '' -a -z '' ']'
>>> + '[' -n /home/backup ']'
>>> + echo 'dumping / ...'
>>> dumping / ...
>>> + cpio -o --quiet --format crc -O /home/backup/root.amd64.cpio
>>> cpio: ./dev not dumped: minor number would be truncated
>>> cpio: Removing leading `/' from member names
>>> cpio: ./proc not dumped: minor number would be truncated
>>> cpio: Removing leading `../' from member names
>>> 
>>> We've had to revert this change from our local tree, suggestions?
>>> 
>>> Sean
>> 
>> 
>> A little more background.  It looks like symlinks are getting stripped
>> of their '/' which sucks.  Ideas?
>> 
>> Sean
>> 
>> e.g. /home/foo/bar -> /opt/baz/blob
>> 
>> becomes
>> 
>> home/foo/bar -> opt/baz/blob   
>> 
>> Yuck.
> 
> This is a security measure, I think.
> 
> --absolute-filenames disables this behavior.

This is exactly the kind of stuff that was supposed to be avoided in stable 
branches.  Your import of cpio cost us several days of debugging.

Scott



Re: [Stable 7] CPIO breakage/

2010-06-15 Thread Xin LI
On 2010/06/15 17:05, Sean Bruno wrote:
> On Tue, 2010-06-15 at 17:10 -0500, Sean Bruno wrote:
>> http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361
>>
>> I'm not sure what's up with this update, but it hosed up the default
>> behavior of cpio.
>>
>> It appears now that -o won't do the same things that it used to:
>>
>> + cd /
>> + find -x .
>> + egrep -v '^\.(/snap|/usr/sup|/boot/kernel/kernel
>> \.[[:alpha:]_]+\.[[:digit:]]+|/boot/kernel/kernel
>> \.old|/etc/start_if.*|/etc/ssh/ssh_host_.*key|/etc/hostid|/etc/(master.passwd|passwd|spwd.db|pwd.db))'
>> + '[' -n '' ']'
>> + '[' 7 = 4 ']'
>> + '[' -n '' -a -z '' ']'
>> + '[' -n /home/backup ']'
>> + echo 'dumping / ...'
>> dumping / ...
>> + cpio -o --quiet --format crc -O /home/backup/root.amd64.cpio
>> cpio: ./dev not dumped: minor number would be truncated
>> cpio: Removing leading `/' from member names
>> cpio: ./proc not dumped: minor number would be truncated
>> cpio: Removing leading `../' from member names
>>
>> We've had to revert this change from our local tree, suggestions?
>>
>> Sean
> 
> 
> A little more background.  It looks like symlinks are getting stripped
> of their '/' which sucks.  Ideas?
> 
> Sean
> 
> e.g. /home/foo/bar -> /opt/baz/blob
> 
> becomes
> 
> home/foo/bar -> opt/baz/blob   
> 
> Yuck.

This is a security measure, I think.

--absolute-filenames disables this behavior.

Cheers,
- -- 
Xin LI http://www.delphij.net/
FreeBSD - The Power to Serve!  Live free or die


Re: [Stable 7] CPIO breakage/

2010-06-15 Thread Sean Bruno
On Tue, 2010-06-15 at 17:10 -0500, Sean Bruno wrote:
> http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361
> 
> I'm not sure what's up with this update, but it hosed up the default
> behavior of cpio.
> 
> It appears now that -o won't do the same things that it used to:
> 
> + cd /
> + find -x .
> + egrep -v '^\.(/snap|/usr/sup|/boot/kernel/kernel
> \.[[:alpha:]_]+\.[[:digit:]]+|/boot/kernel/kernel
> \.old|/etc/start_if.*|/etc/ssh/ssh_host_.*key|/etc/hostid|/etc/(master.passwd|passwd|spwd.db|pwd.db))'
> + '[' -n '' ']'
> + '[' 7 = 4 ']'
> + '[' -n '' -a -z '' ']'
> + '[' -n /home/backup ']'
> + echo 'dumping / ...'
> dumping / ...
> + cpio -o --quiet --format crc -O /home/backup/root.amd64.cpio
> cpio: ./dev not dumped: minor number would be truncated
> cpio: Removing leading `/' from member names
> cpio: ./proc not dumped: minor number would be truncated
> cpio: Removing leading `../' from member names
> 
> We've had to revert this change from our local tree, suggestions?
> 
> Sean


A little more background.  It looks like symlinks are getting stripped
of their '/' which sucks.  Ideas?

Sean

e.g. /home/foo/bar -> /opt/baz/blob

becomes

home/foo/bar -> opt/baz/blob   

Yuck.
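
The rewriting described above can be sketched in a few lines. This is only an illustration of the reported effect, assuming the behavior amounts to "drop leading `/` and `../` components" as the two cpio warnings suggest; `strip_leading` is a hypothetical helper and is not taken from the cpio sources.

```python
def strip_leading(name):
    """Drop leading '/' and '../' components, as the cpio warnings describe.

    Hypothetical helper for illustration only; not cpio's actual code.
    """
    while True:
        if name.startswith("/"):
            name = name[1:]
        elif name.startswith("../"):
            name = name[3:]
        else:
            return name


# The example from the report: the member name and the link target both
# lose their leading slash, so the link no longer points at /opt/baz/blob.
member, target = "/home/foo/bar", "/opt/baz/blob"
print(strip_leading(member), "->", strip_leading(target))
# prints: home/foo/bar -> opt/baz/blob
```

Names that start with `./`, like the ones `find -x .` produces in the trace, pass through this sketch unchanged.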






Re: Problems with bge (possibly related to r208993)

2010-06-15 Thread Pyun YongHyeon
On Wed, Jun 16, 2010 at 03:50:39AM +0400, Artem Kim wrote:
> On Wednesday 16 June 2010 03:21:09 Pyun YongHyeon wrote:
> 
> > Hmm, why you need link0 flag? The link0 flag is used to force the
> > interface MASTER. Normally this configuration is automatically
> > done during auto-negotiation such that one is configured as MASTER
> > and the other is configured as SLAVE. If you manually configure
> > this setting you should be very careful not to use the same
> > configuration of MASTER/SLAVE of link partner. If you have to use
> > link0 option, the link partner should be configured to use SLAVE.
> > Normally you should always use auto-negotiation on 1000baseT unless
> > link partner is severely broken to support NWAY.
> > 
> > It seems link partner does not agree on resolved speed/duplex
> > configuration of bge1. Check link partner's resolved link
> > configuration.
> > 
> 
> In any case, now I do not use the flag link0. 
> Now the master/slave is  assigned through auto-negotiation. 
> 
> link0 not been set before the problem occurred. I reset link0 flag on NAS2 
> when I got a problem the first time.
> 
> Now x900 port1.0.12 and bge0 configured automatically.
> 
> >bge1: flags=8843 metric 0 mtu 1500
> >media: Ethernet autoselect (1000baseT )
> >status: active
> 
> >awplus>show int port1.0.12 status
> >Port        Name  Status     Vlan  Duplex  Speed   Type
> >port1.0.12        connected  55    a-full  a-1000  1000BASE-T

Would you try attached patch and let me know what you can see on
your console? The patch will display error code on console. Note,
it may spam your console a lot if you see many RX errors so please
don't apply it on your production server.
Index: sys/dev/bge/if_bge.c
===
--- sys/dev/bge/if_bge.c	(revision 209211)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -3382,6 +3382,9 @@
 			stdcnt++;
 			m = sc->bge_cdata.bge_rx_std_chain[rxidx];
 			if (cur_rx->bge_flags & BGE_RXBDFLAG_ERROR) {
+#if 1
+				printf("%04x ", cur_rx->bge_error_flag);
+#endif
 				bge_rxreuse_std(sc, rxidx);
 				continue;
 			}

Re: Problems with bge (possibly related to r208993)

2010-06-15 Thread Artem Kim
On Wednesday 16 June 2010 03:21:09 Pyun YongHyeon wrote:

> Hmm, why you need link0 flag? The link0 flag is used to force the
> interface MASTER. Normally this configuration is automatically
> done during auto-negotiation such that one is configured as MASTER
> and the other is configured as SLAVE. If you manually configure
> this setting you should be very careful not to use the same
> configuration of MASTER/SLAVE of link partner. If you have to use
> link0 option, the link partner should be configured to use SLAVE.
> Normally you should always use auto-negotiation on 1000baseT unless
> link partner is severely broken to support NWAY.
> 
> It seems link partner does not agree on resolved speed/duplex
> configuration of bge1. Check link partner's resolved link
> configuration.
> 

In any case, I no longer use the link0 flag.
Master/slave is now assigned through auto-negotiation.

link0 was not set before the problem occurred. I cleared the link0 flag on NAS2
when I hit the problem the first time.

Now x900 port1.0.12 and bge0 are configured automatically.

>bge1: flags=8843 metric 0 mtu 1500
>media: Ethernet autoselect (1000baseT )
>status: active

>awplus>show int port1.0.12 status
>Port        Name  Status     Vlan  Duplex  Speed   Type
>port1.0.12        connected  55    a-full  a-1000  1000BASE-T


Re: [Stable 7] CPIO breakage/

2010-06-15 Thread Xin LI
Hi, Sean,

On 2010/06/15 15:10, Sean Bruno wrote:
> http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361
> 
> I'm not sure what's up with this update, but it hosed up the default
> behavior of cpio.
[...]
> We've had to revert this change from our local tree, suggestions?

Could you please test the attached patch?

Cheers,
- -- 
Xin LI http://www.delphij.net/
FreeBSD - The Power to Serve!  Live free or die
Index: contrib/cpio/src/util.c
===
--- contrib/cpio/src/util.c (revision 209216)
+++ contrib/cpio/src/util.c (working copy)
@@ -1252,8 +1252,25 @@ stat_to_cpio (struct cpio_file_stat *hdr, struct s
   hdr->c_uid = CPIO_UID (st->st_uid);
   hdr->c_gid = CPIO_GID (st->st_gid);
   hdr->c_nlink = st->st_nlink;
-  hdr->c_rdev_maj = major (st->st_rdev);
-  hdr->c_rdev_min = minor (st->st_rdev);
+
+  switch (hdr->c_mode & CP_IFMT)
+  {
+case CP_IFBLK:
+case CP_IFCHR:
+#ifdef CP_IFIFO
+case CP_IFIFO:
+#endif
+#ifdef CP_IFSOCK
+case CP_IFSOCK:
+#endif
+  hdr->c_rdev_maj = major (st->st_rdev);
+  hdr->c_rdev_min = minor (st->st_rdev);
+  break;
+default:
+  hdr->c_rdev_maj = 0;
+  hdr->c_rdev_min = 0;
+  break;
+  }
   hdr->c_mtime = st->st_mtime;
   hdr->c_filesize = st->st_size;
   hdr->c_chksum = 0;
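
The patch above restricts the device-number fields to file types that actually carry one, and zeroes them otherwise. Below is a rough Python rendering of the same decision, offered as a sketch: the standard `stat` and `os` modules stand in for the CP_* macros and for major()/minor(), and `rdev_for_header` is a hypothetical name that does not appear in cpio.

```python
import os
import stat

# File types that keep a device number, mirroring the case labels in the
# patch (CP_IFBLK, CP_IFCHR, CP_IFIFO, CP_IFSOCK).
DEVICE_LIKE = {stat.S_IFBLK, stat.S_IFCHR, stat.S_IFIFO, stat.S_IFSOCK}


def rdev_for_header(st_mode, st_rdev):
    """Return (major, minor) for the archive header, or (0, 0) otherwise."""
    if stat.S_IFMT(st_mode) in DEVICE_LIKE:
        return os.major(st_rdev), os.minor(st_rdev)
    return 0, 0


# A directory gets zeros, whatever st_rdev happens to hold.
print(rdev_for_header(stat.S_IFDIR | 0o755, os.makedev(3, 7)))
# A character device keeps its major/minor pair.
print(rdev_for_header(stat.S_IFCHR | 0o600, os.makedev(3, 7)))
```

If the "minor number would be truncated" errors for ./dev and ./proc came from stray st_rdev values on directories, zeroing the fields this way would presumably avoid them; that reading is an assumption, since the thread does not spell out the cause.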

Re: Problems with bge (possibly related to r208993)

2010-06-15 Thread Pyun YongHyeon
On Wed, Jun 16, 2010 at 02:57:01AM +0400, Artem Kim wrote:
> On Tuesday 15 June 2010 21:50:03 you wrote:
> . . .
> > > nas2 # netstat -ndI bge1
> > > Name  Mtu Network    Address               Ipkts   Ierrs Idrop     Opkts Oerrs Coll Drop
> > > bge1 1500            00:1b:78:a3:3c:01 418543876 1972918     0 446063237     0    0    0
> > > bge1 1500 XX.XX.6.12 XX.XX.6.133         890,306       -     - 1,076,833     -    -    -
> > 
> > Ok, I see very large number of Ierrs here. When you send some packets
> > from other hosts to nas2(bge1), do you see Ierrs counter is
> > increasing?
> . . .
>  
> > It seems RX does not work at all. Because you have zero Drop(from
> > netstat) I think you didn't hit mbuf resource shortage situation.
> > Ierr counter is increased whenever controller drops frames due to
> > receiving errors(e.g. CRC). Given that you have no cabling issue,
> > it could be caused by speed/duplex mismatches between bge1 and link
> > partner. Does the link partner also agrees on resolved speed/duplex
> > of bge1?
> 
> I had some negotiation problems. But the problems were observed on the other 
> NIC - bge0. bge0 is connected to the dlink-3627 and bge1 is not always setup 
> speed/duplex mode correctly. Usually this is solved by link0 setting. Flag 
> link0 I set for bge1 and bge0. Flag link0 used quite a long time (years).
> 

Hmm, why do you need the link0 flag? The link0 flag is used to force the
interface to be MASTER. Normally this is configured automatically
during auto-negotiation, such that one end is configured as MASTER
and the other as SLAVE. If you configure this setting manually,
you should be very careful not to use the same MASTER/SLAVE
configuration as the link partner. If you have to use the
link0 option, the link partner should be configured as SLAVE.
Normally you should always use auto-negotiation on 1000baseT unless
the link partner's NWAY support is severely broken.

It seems the link partner does not agree on the resolved speed/duplex
configuration of bge1. Check the link partner's resolved link
configuration.
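
The MASTER/SLAVE rule described above can be sketched as a small check. It models only what is stated here: a manually forced role must not match the role forced on the link partner. The role labels ("auto", "master", "slave") and the function name `role_conflict` are hypothetical and are not ifconfig or driver syntax.

```python
from typing import Optional

ROLES = ("auto", "master", "slave")


def role_conflict(local: str, partner: str) -> Optional[str]:
    """Return a description of the problem, or None if the pairing is fine."""
    for role in (local, partner):
        if role not in ROLES:
            raise ValueError("unknown role: %r" % role)
    # Two ends forced to the same role cannot resolve who drives the clock.
    if local != "auto" and local == partner:
        return "both ends forced to %s" % local.upper()
    return None


print(role_conflict("master", "master"))  # both ends forced to MASTER
print(role_conflict("master", "slave"))   # None
print(role_conflict("auto", "auto"))      # None
```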

> bge1 and bge0 have link0, when I got the problem on NAS2 first time. Then I 
> reset link0 and reboot NAS2. After some time I got the same problem again 
> (current state). However, I do not see any obvious problems with  bge0 <-> AT-
> x900.
> 
> current state of the bge0 link partner:
> 
> awplus>show int port1.0.12
> Interface port1.0.12
>   Scope: both
>   Link is UP, administrative state is UP
>   Thrash-limiting
> Status Not Detected, Action learn-disable, Timeout 1(s)
>   Hardware is Ethernet, address is .cd29.6e09
>   index 5012 metric 1 mru 1522
>   current duplex full, current speed 1000, polarity auto
>   configured duplex auto, configured speed auto
>   
>   VRF Binding: Not bound
>   SNMP link-status traps: Disabled
> input packets 136255660241, bytes 119549292157319, dropped 0, multicast 
> packets 5482013
> output packets 122988526534, bytes 121030195520423, multicast packets 
> 532582 broadcast packets 2198512
> 
> awplus>show int port1.0.12 status
> Port        Name  Status     Vlan  Duplex  Speed   Type
> port1.0.12        connected  55    a-full  a-1000  1000BASE-T
> 
> awplus>sh mac address-table |i port1.0.12
> 55   port1.0.12   001b.78a3.3c01   forward   dynamic
> 
> 
> nas2# ifconfig bge1
> bge1: flags=8843 metric 0 mtu 1500
> 
> options=8009b
> ether 00:1b:78:a3:3c:01
> inet XX.XX.6.133 netmask 0xffc0 broadcast XX.XX.6.191
> media: Ethernet autoselect (1000baseT )
> status: active
> 
> 
> I tried to do ping -i .01 XX.XX.6.133  from other host:
> nas2# netstat -hI bge1 1
>             input         (bge1)           output
>    packets  errs idrops      bytes    packets  errs      bytes colls
>          0     0      0          0          0     0          0     0
>          0     1      0          0          0     0          0     0
>          0     0      0          0          0     0          0     0
>          0     0      0          0          0     0          0     0
> ping->   0    33      0          0          0     0          0     0
>          0    94      0          0          0     0          0     0
>          0    93      0          0          0     0          0     0
>          0    94      0          0          0     0          0     0
>          0    94      0          0          0     0          0     0
>          0    83      0          0          0     0          0     0
>          0     0      0          0          0     0          0     0
> 
> 
> ping -i .01 XX.XX.6.129 from NAS2 (XX.XX.6.129 have static arp-entry):
> 
> nas2# netstat -hI bge1 1
> input (bge1)   output
>packets  errs idrops  bytespackets  errs  bytes colls
>  0 1 0  0  0 0  0 0
>  0 1 0  0  0 0  0 0
>  0 0 0  0  0 0  0 0
> ping->   0 0 0  0  0 0 

Re: Problems with bge (possibly related to r208993)

2010-06-15 Thread Artem Kim
On Tuesday 15 June 2010 21:50:03 you wrote:
. . .
> > nas2 # netstat -ndI bge1
> > Name  Mtu Network    Address               Ipkts   Ierrs Idrop     Opkts Oerrs Coll Drop
> > bge1 1500            00:1b:78:a3:3c:01 418543876 1972918     0 446063237     0    0    0
> > bge1 1500 XX.XX.6.12 XX.XX.6.133         890,306       -     - 1,076,833     -    -    -
> 
> Ok, I see very large number of Ierrs here. When you send some packets
> from other hosts to nas2(bge1), do you see Ierrs counter is
> increasing?
. . .
 
> It seems RX does not work at all. Because you have zero Drop(from
> netstat) I think you didn't hit mbuf resource shortage situation.
> Ierr counter is increased whenever controller drops frames due to
> receiving errors(e.g. CRC). Given that you have no cabling issue,
> it could be caused by speed/duplex mismatches between bge1 and link
> partner. Does the link partner also agrees on resolved speed/duplex
> of bge1?

I had some negotiation problems, but they were observed on the other NIC,
bge0. bge0 is connected to the dlink-3627 and bge1 does not always set up the
speed/duplex mode correctly. Usually this is solved by setting link0. I set
the link0 flag for both bge1 and bge0, and it was in use for quite a long
time (years).

Both bge1 and bge0 had link0 set when I hit the problem on NAS2 the first
time. I then cleared link0 and rebooted NAS2. After some time I got the same
problem again (the current state). However, I do not see any obvious problems
with bge0 <-> AT-x900.

current state of the bge0 link partner:

awplus>show int port1.0.12
Interface port1.0.12
  Scope: both
  Link is UP, administrative state is UP
  Thrash-limiting
Status Not Detected, Action learn-disable, Timeout 1(s)
  Hardware is Ethernet, address is .cd29.6e09
  index 5012 metric 1 mru 1522
  current duplex full, current speed 1000, polarity auto
  configured duplex auto, configured speed auto
  
  VRF Binding: Not bound
  SNMP link-status traps: Disabled
input packets 136255660241, bytes 119549292157319, dropped 0, multicast 
packets 5482013
output packets 122988526534, bytes 121030195520423, multicast packets 
532582 broadcast packets 2198512

awplus>show int port1.0.12 status
Port        Name  Status     Vlan  Duplex  Speed   Type
port1.0.12        connected  55    a-full  a-1000  1000BASE-T

awplus>sh mac address-table |i port1.0.12
55   port1.0.12   001b.78a3.3c01   forward   dynamic


nas2# ifconfig bge1
bge1: flags=8843 metric 0 mtu 1500

options=8009b
ether 00:1b:78:a3:3c:01
inet XX.XX.6.133 netmask 0xffc0 broadcast XX.XX.6.191
media: Ethernet autoselect (1000baseT )
status: active


I tried to do ping -i .01 XX.XX.6.133  from other host:
nas2# netstat -hI bge1 1
            input         (bge1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         0     0      0          0          0     0          0     0
         0     1      0          0          0     0          0     0
         0     0      0          0          0     0          0     0
         0     0      0          0          0     0          0     0
ping->   0    33      0          0          0     0          0     0
         0    94      0          0          0     0          0     0
         0    93      0          0          0     0          0     0
         0    94      0          0          0     0          0     0
         0    94      0          0          0     0          0     0
         0    83      0          0          0     0          0     0
         0     0      0          0          0     0          0     0


ping -i .01 XX.XX.6.129 from NAS2 (XX.XX.6.129 have static arp-entry):

nas2# netstat -hI bge1 1
            input         (bge1)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         0     1      0          0          0     0          0     0
         0     1      0          0          0     0          0     0
         0     0      0          0          0     0          0     0
ping->   0     0      0          0          0     0          0     0
         0    40      0          0         62     0       5.9K     0
         0    93      0          0         89     0       8.5K     0
         0    91      0          0         89     0       8.5K     0
         0    91      0          0         88     0       8.4K     0
         0    91      0          0         89     0       8.5K     0
         0    92      0          0         88     0       8.4K     0
         0    93      0          0         88     0       8.4K     0
         0    92      0          0         89     0       8.5K     0
         0     0      0          0         85     0       8.1K     0
         0    87      0          0          0     0          0     0


ping -i .01 XX.XX.6.133 from other host:

before:
nas2# netstat -ndI bge1

Name  Mtu Network    Address               Ipkts   Ierrs Idrop     Opkts Oerrs Coll Drop
bge1 1500            00:1b:78:a3:3c:01 418543876 2042520     0 446111781     0    0    0
bge1   

[Stable 7] CPIO breakage/

2010-06-15 Thread Sean Bruno
http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361

I'm not sure what's up with this update, but it hosed up the default
behavior of cpio.

It appears now that -o won't do the same things that it used to:

+ cd /
+ find -x .
+ egrep -v '^\.(/snap|/usr/sup|/boot/kernel/kernel
\.[[:alpha:]_]+\.[[:digit:]]+|/boot/kernel/kernel
\.old|/etc/start_if.*|/etc/ssh/ssh_host_.*key|/etc/hostid|/etc/(master.passwd|passwd|spwd.db|pwd.db))'
+ '[' -n '' ']'
+ '[' 7 = 4 ']'
+ '[' -n '' -a -z '' ']'
+ '[' -n /home/backup ']'
+ echo 'dumping / ...'
dumping / ...
+ cpio -o --quiet --format crc -O /home/backup/root.amd64.cpio
cpio: ./dev not dumped: minor number would be truncated
cpio: Removing leading `/' from member names
cpio: ./proc not dumped: minor number would be truncated
cpio: Removing leading `../' from member names

We've had to revert this change from our local tree, suggestions?

Sean
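
The exclusion step in the trace above (`find -x . | egrep -v ... | cpio -o`) can be sketched as follows. The pattern here is a simplified subset of the alternatives visible in the egrep expression, not the full expression, and `keep` is a hypothetical helper name.

```python
import re

# A simplified subset of the exclusions in the egrep pattern from the
# trace; the original expression has more alternatives.
EXCLUDE = re.compile(
    r"^\.(/snap|/usr/sup|/etc/ssh/ssh_host_.*key|/etc/hostid)"
)


def keep(path):
    """True if a line of `find` output should be passed on to cpio."""
    return EXCLUDE.search(path) is None


paths = [
    "./etc/rc.conf",
    "./etc/hostid",
    "./snap/1",
    "./etc/ssh/ssh_host_rsa_key",
]
print([p for p in paths if keep(p)])
# prints: ['./etc/rc.conf']
```

Because the pattern is anchored only at the start, it behaves as a prefix match, just as `egrep -v` does on the original expression.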



Re: Problems with bge (possibly related to r208993)

2010-06-15 Thread Pyun YongHyeon
On Tue, Jun 15, 2010 at 07:09:27AM +0400, Artem Kim wrote:
> On Tuesday 15 June 2010 01:03:43 Pyun YongHyeon wrote:
> > On Sun, Jun 13, 2010 at 07:34:11PM +0400, Artem Kim wrote:
> > > Hi,
> > >
> > > I have two routers (HP DL140G3):
> > >
> > > NAS3 FreeBSD 8.1-PRERELEASE # 0: Thu Jun 3 04:13:07 MSD 2010 i386
> > > NAS2 FreeBSD 8.1-PRERELEASE # 0: Sat Jun 12 16:42:19 UTC 2010 i386
> > > (r208993 included)
> > >
> > > bge0@pci0:19:0:0: class=0x02 card=0x3260103c chip=0x165914e4
> > > rev=0x11 hdr=0x00
> > >     vendor   = 'Broadcom Corporation'
> > >     device   = 'NetXtreme Gigabit Ethernet PCI Express (BCM5721)'
> > >     class    = network
> > >     subclass = ethernet
> > > bge1@pci0:20:0:0: class=0x02 card=0x3260103c chip=0x165914e4
> > > rev=0x11 hdr=0x00
> > >     vendor   = 'Broadcom Corporation'
> > >     device   = 'NetXtreme Gigabit Ethernet PCI Express (BCM5721)'
> > >     class    = network
> > >     subclass = ethernet
> > >
> > >
> > > I have some problems with bge on NAS2.
> > >
> > > After some time (about 15 hours) bge1 stops flowing traffic.
> > > NAS3 NAS3 - pppoe server. Through bge1 passes only ip traffic through
> > > bge0 no ip-traffic.
> > > Problems occur only with the bge1 interface on NAS2.
> > >
> > >
> > > Traffic does not pass through bge1 until I run "ifconfig bge1 down;
> > > ifconfig bge1 up".
> > >
> > > When I do "ifconfig bge1 down" the NIC does not shut down:
> > >
> > > nas2 # ifconfig bge1 down
> > > nas2 #
> > > nas2 # ifconfig bge1
> > > bge1: flags = 8843  metric 0 mtu
> > > 1500 options = 8009b
> > > 
> > > ether X
> > > inet YYY netmask 0xffc0 broadcast 
> > > media: Ethernet autoselect (1000baseT )
> > > status: active
> > >
> > > LED also indicates that the NIC is active.
> > >
> > > I left the NAS in a state of "frozen bge1" - and can provide additional
> > > information for diagnosis.
> > 
> > Try run tcpdump on bge1 and see whether driver still see incoming
> > traffic. Also show me the output of "netstat -ndI bge1" and output
> > of "sysctl dev.bge.1.stats". Verbose dmesg output also would be
> > helpful.
> 
> nas2 # netstat -ndI bge1
> Name  Mtu Network    Address               Ipkts   Ierrs Idrop     Opkts Oerrs Coll Drop
> bge1 1500            00:1b:78:a3:3c:01 418543876 1972918     0 446063237     0    0    0
> bge1 1500 XX.XX.6.12 XX.XX.6.133         890,306       -     - 1,076,833     -    -    -
> 

Ok, I see very large number of Ierrs here. When you send some packets
from other hosts to nas2(bge1), do you see Ierrs counter is
increasing?
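
One way to answer that question is to diff two samples of the link-level counters. The sketch below does so for the two bge1 rows that appear in this thread; it reads the last seven columns so that it works whether or not the Network column is present. `parse_counters` and `delta` are hypothetical helper names, not part of netstat.

```python
FIELDS = ("Ipkts", "Ierrs", "Idrop", "Opkts", "Oerrs", "Coll", "Drop")


def parse_counters(line):
    """Read the last seven columns of a `netstat -ndI` link-level row."""
    values = line.split()[-len(FIELDS):]
    return dict(zip(FIELDS, (int(v) for v in values)))


def delta(before, after):
    """Per-counter difference between two samples of the same interface."""
    b, a = parse_counters(before), parse_counters(after)
    return {name: a[name] - b[name] for name in FIELDS}


# Two bge1 rows as quoted in this thread (Network column omitted).
first = "bge1 1500 00:1b:78:a3:3c:01 418543876 1972918 0 446063237 0 0 0"
later = "bge1 1500 00:1b:78:a3:3c:01 418543876 2042520 0 446111781 0 0 0"

d = delta(first, later)
print("Ipkts +%d, Ierrs +%d" % (d["Ipkts"], d["Ierrs"]))
# prints: Ipkts +0, Ierrs +69602
```

Between those two samples Ierrs grows while Ipkts stays flat, which matches the observation that receive is not working on bge1.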

> 
> Should I add additional debugging options?
> 
> nas2 # sysctl dev.bge.1.stats
> sysctl: unknown oid 'dev.bge.1.stats'
> 
> nas2 # sysctl dev.bge.1
> dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004101
> dev.bge.1.%driver: bge
> dev.bge.1.%location: slot=0 function=0
> dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x103c subdevice=0x3260 class=0x02
> dev.bge.1.%parent: pci20
> dev.bge.1.forced_collapse: 0
> 

Ok, this controller lacks hardware MAC statistics counters.

> I can show verbose dmesg, but this requires a reboot so bge1 come out of the 
> current state.
> 

Ok.

> 
> I looked tcpdump on NAS2 - and I only saw the ARP requests from NAS2 (NAS2 - 
> XX.XX.6.133):
> 
> nas2 # tcpdump -i bge1
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on bge1, link-type EN10MB (Ethernet), capture size 96 bytes
> 01:23:43.063238 ARP, Request who-has XX.XX.6.129 tell XX.XX.6.133, length 28
> 01:23:43.162257 ARP, Request who-has XX.XX.6.129 tell XX.XX.6.133, length 28
> 01:23:43.935016 ARP, Request who-has XX.XX.6.129 tell XX.XX.6.133, length 28
> 
> 
> XX.XX.6.129 is l3-switch(AT-X900) default router for NAS2; bge1 is directly 
> connected to the x900.
> 
> I looked tcpdump on the x900:
> 
> awplus # tcpdump -ni vlanXX host XX.XX.6.133
> 05:36:30.455642 arp who-has XX.XX.6.129 tell XX.XX.6.133
> 05:36:30.455898 arp reply XX.XX.6.129 is-at 00:00:cd:29:6e:09
> 05:36:31.483353 arp who-has XX.XX.6.129 tell XX.XX.6.133
> 05:36:31.483505 arp reply XX.XX.6.129 is-at 00:00:cd:29:6e:09
> 05:36:32.511260 arp who-has XX.XX.6.129 tell XX.XX.6.133
> 05:36:32.511353 arp reply XX.XX.6.129 is-at 00:00:cd:29:6e:09
> 05:36:33.539163 arp who-has XX.XX.6.129 tell XX.XX.6.133
> 
> ARP requests from NAS2 (XX.XX.6.133). But on NAS2 I can _only_ see ARP-
> requests from NAS2.
> 
> I added static arp-entry on NAS2 and do ping XX.XX.6.129.
> 
> Then I looked again at tcpdump on XX.XX.6.129:
> 
> awplus # tcpdump -nei vlanXX host XX.XX.6.133
> 06:13:03.472539 00:00:cd:29:6e:09 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: arp who-has XX.XX.6.133 tell XX.XX.6.129
> 06:13:03.526768 00:1b:78:a3:3c:01 > 00:00:cd:29:6e:09, ethertype IPv4
> (0x0800), length 98: XX.XX.6.133 > XX.XX.6.129: ICMP echo request, id 6958,
> seq 1920, length 64
> 06:13:04.553728 00:1b:78:a3:3c:01 > 00:0

Re: File system trouble with ICH9 controller

2010-06-15 Thread Robin Sommer

On Thu, Jun 10, 2010 at 14:06 -0700, I wrote:

> Thanks for your quick response. I don't need much in terms of
> long-term data reliability on these machines (thus the RAID 0).
> However, if MatrixRAID is unreliable even without further external
> events (like disk problems/changes), I'll turn it off. 

An update on this: I have now turned off the RAID on half of my
blades, leaving the other half untouched. After a few days, 3 of
those systems still using the RAID have experienced similar fs
corruption as reported before, while all the blades wo/ RAID have
been running fine. 

So, that looks like the RAID is indeed to blame and I'll turn it off
for all systems now.

Robin

-- 
Robin Sommer * Phone +1 (510) 666-2886 * ro...@icir.org 
ICSI/LBNL* Fax   +1 (510) 666-2956 *   www.icir.org


Re: disable RAM parity error

2010-06-15 Thread Andriy Gapon
on 15/06/2010 08:34 Masoom Shaikh said the following:
> Hello List,
> 
> In continuation to my earlier mail, I am now convinced(nearly) that
> the freezes faced by me are "RAM parity error" indeed.
> http://www.mail-archive.com/freebsd-hack...@freebsd.org/msg70730.html
> 
> when using internet via wpi0, FreeBSD freezes are very prone to occur
> compared to when I am using bfe0. But they do occur, due to some
> reason it feels stable when connected via bfe0.
> Over course of time I have accumulated some core.txt. files, which
> I have uploaded to pastebin and their links are below. My question is
> why only FreeBSD minds about RAM parity error check, while latest
> offering from Redmond does not ?
> is there a way to IGNORE RAM parity check ? is it possible to get hold
> of offending address from these core files ?
> while I observe "stack pointer" and "frame pointer" have repeated
> values in core files, I cannot conclude anything from this
> observation. Anybody here care ?

If I were you I'd be concerned with finding and replacing the flaky hardware
instead of wasting time before disaster happens.
As to why "other OS" doesn't report errors - it could have been the other way
around, different OSes have different usage/load patterns and hit different
hardware problems with higher probability.

-- 
Andriy Gapon