Re: rsync corrupted MAC

2011-10-11 Thread Larry Rosenman
They are not local to each other. See the diagram. They are across the internet 
from each other.
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Jack Vogel jfvo...@gmail.com wrote:

Well, for a start I'd get both interfaces at the same speed. Sounds like a
hardware issue of some sort, cable or switch maybe?

Jack


Re: rsync corrupted MAC

2011-10-11 Thread Jack Vogel
Oh, I see.  So, did you have a previous working state?

Jack


On Tue, Oct 11, 2011 at 12:06 AM, Larry Rosenman l...@lerctr.org wrote:

 They are not local to each other. See the diagram. They are across the
 internet from each other.
 --
 Sent from my Android phone with K-9 Mail. Please excuse my brevity.



Re: rsync corrupted MAC

2011-10-11 Thread Larry Rosenman
Not sure when it broke. I rebuilt the 9.0 server as 9.0, and ran the script and 
it started giving this.
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Jack Vogel jfvo...@gmail.com wrote:

Oh, I see.  So, did you have a previous working state?

Jack



Re: rsync corrupted MAC

2011-10-10 Thread John Baldwin
On Sunday, October 09, 2011 5:06:26 pm Larry Rosenman wrote:
 Any ideas on which side or what might be broke here?
 
 ler/MAIL-ARCHIVE/2008/12/INBOX
 Corrupted MAC on input.
 Disconnecting: Packet corrupt
 rsync: connection unexpectedly closed (33845045 bytes received so far) [receiver]
 rsync error: error in rsync protocol data stream (code 12) at io.c(605) [receiver=3.0.9]
 rsync: connection unexpectedly closed (1450 bytes received so far) [generator]
 rsync error: unexplained error (code 255) at io.c(605) [generator=3.0.9]

I've had somewhat similar issues (ssh getting corruption in its data stream) 
when a NIC in my netbook was corrupting packet data when it ran at 1G (it 
worked fine at 10/100).  Pyun eventually fixed the issue by applying enough 
workarounds (it was likely a hardware bug in the NIC's chipset).  However, it 
wasn't easy to debug unfortunately. :(
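
For context, "Corrupted MAC on input" is what OpenSSH reports when the message authentication code on a received packet fails to verify, so any change to the bytes in transit, even one flipped bit, ends the session. The sketch below illustrates the idea with Python's standard `hmac` module. The key, the payload, and the choice of HMAC-SHA256 are made-up stand-ins for illustration, not what this particular ssh session negotiated.

```python
import hashlib
import hmac

# Hypothetical session key and packet payload, for illustration only.
KEY = b"example-session-key"
PACKET = b"rsync file data block " * 4


def make_tag(key: bytes, payload: bytes) -> bytes:
    """Compute the authentication tag the sender would attach."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Recompute the tag on the receiving side and compare in constant time."""
    return hmac.compare_digest(make_tag(key, payload), tag)


def flip_bit(payload: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of payload with a single bit inverted."""
    damaged = bytearray(payload)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


tag = make_tag(KEY, PACKET)
print(verify(KEY, PACKET, tag))                    # True: packet arrived intact
print(verify(KEY, flip_bit(PACKET, 10, 3), tag))   # False: one bit changed in transit
```

The check only says that something changed between sender and receiver. It gives no hint about which hop or component did it, which is why the rest of the thread is about narrowing that down.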

-- 
John Baldwin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: rsync corrupted MAC

2011-10-10 Thread Larry Rosenman

On 10/10/2011 10:47 AM, John Baldwin wrote:

I've had somewhat similar issues (ssh getting corruption in its data stream)
when a NIC in my netbook was corrupting packet data when it ran at 1G (it
worked fine at 10/100).  Pyun eventually fixed the issue by applying enough
workarounds (it was likely a hardware bug in the NIC's chipset).  However, it
wasn't easy to debug unfortunately. :(


Any ideas on where to start?

from the 8.2 box (tbh.lerctr.org in the script):

8.2 - PIX - Provider - Internet - Motorola SBG6580 (Time-Warner) -
Trendnet TEG-160WS Gig switch - 9.0 box (borg.lerctr.org).


So, where do I start?



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 512-248-2683 E-Mail: l...@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893



Re: rsync corrupted MAC

2011-10-10 Thread John Baldwin
On Monday, October 10, 2011 2:38:55 pm Larry Rosenman wrote:
 Any ideas on where to start?
 
 from the 8.2 box (tbh.lerctr.org in the script):
 
 8.2 - PIX - Provider - Internet - Motorola SBG6580 (Time-Warner) -
 Trendnet TEG-160WS Gig switch - 9.0 box (borg.lerctr.org).
 
 So, where do I start?

In my case I was seeing other issues with the NIC (it would periodically freeze,
spewing a constant stream of pause frames onto the LAN and refusing to receive
more frames), so I already suspected it of being the problem.  When I turned off
flow control so it wouldn't freeze, it started corrupting the packets instead.
Without that kind of smoking gun I would probably have had a hard time figuring
out the issue.  I would try swapping out various parts to see if you can
narrow the issue down to a single component.

-- 
John Baldwin


Re: rsync corrupted MAC

2011-10-10 Thread Larry Rosenman

On 10/10/2011 3:57 PM, Louis Mamakos wrote:

I'd turn off IP / TCP / UDP checksum offloading on your NIC if it supports it, 
and see if you are getting network layer checksum errors.  If the IP checksum 
is wrong, then it happened on the last hops between the NIC and memory or 
across the previous network hop.



Good idea, but it didn't show ANY errors on EITHER side (both are em NICs).


Next?
$ ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
    ether 00:30:48:2e:99:ba
    inet 192.147.25.65 netmask 0xff00 broadcast 192.147.25.255
    inet6 fe80::230:48ff:fe2e:99ba%em0 prefixlen 64 scopeid 0x1
    inet 192.147.25.45 netmask 0xff00 broadcast 192.147.25.255
    inet 192.147.25.11 netmask 0xff00 broadcast 192.147.25.255
    nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
    media: Ethernet autoselect (100baseTX full-duplex)
    status: active
$ uname -a
FreeBSD thebighonker.lerctr.org 8.2-STABLE FreeBSD 8.2-STABLE #45: Sat Oct  8 10:57:43 CDT 2011 r...@thebighonker.lerctr.org:/usr/obj/usr/src/sys/THEBIGHONKER  amd64

$ ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
    ether 00:30:48:8e:9f:f3
    inet 192.168.200.4 netmask 0xff00 broadcast 192.168.200.255
    inet6 fe80::230:48ff:fe8e:9ff3%em0 prefixlen 64 scopeid 0x1
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT full-duplex)
    status: active
$ uname -a
FreeBSD borg.lerctr.org 9.0-BETA3 FreeBSD 9.0-BETA3 #1: Sun Oct  9 10:03:42 CDT 2011 r...@borg.lerctr.org:/usr/obj/usr/src/sys/BORG-DTRACE  amd64





--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 512-248-2683 E-Mail: l...@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893



Re: rsync corrupted MAC

2011-10-10 Thread Louis Mamakos

On Oct 10, 2011, at 2:38 PM, Larry Rosenman wrote:

 Any ideas on where to start?
 
 from the 8.2 box (tbh.lerctr.org in the script):
 
 8.2 - PIX - Provider - Internet - Motorola SBG6580 (Time-Warner) -
 Trendnet TEG-160WS Gig switch - 9.0 box (borg.lerctr.org).
 
 So, where do I start?

I'd turn off IP / TCP / UDP checksum offloading on your NIC if it supports it, 
and see if you are getting network layer checksum errors.  If the IP checksum 
is wrong, then it happened on the last hops between the NIC and memory or 
across the previous network hop.
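
As background for this suggestion: the checksum used by IP, TCP and UDP is a 16-bit ones' complement sum (RFC 1071). It catches a single flipped bit, but because addition is order-independent it cannot see some multi-bit damage, such as two 16-bit words arriving in swapped order. The sketch below is a plain Python rendering of that algorithm, not the kernel's or the NIC's implementation, and the sample header is a common textbook example rather than a packet from this thread.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def swap_words(data: bytes, i: int, j: int) -> bytes:
    """Return a copy of data with 16-bit words i and j exchanged."""
    buf = bytearray(data)
    buf[2 * i:2 * i + 2], buf[2 * j:2 * j + 2] = buf[2 * j:2 * j + 2], buf[2 * i:2 * i + 2]
    return bytes(buf)


# Illustrative 20-byte IPv4 header with its checksum field set to zero.
HEADER = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")

bit_flipped = bytes([HEADER[0] ^ 0x01]) + HEADER[1:]
reordered = swap_words(HEADER, 1, 3)

print(hex(internet_checksum(HEADER)))       # 0xb861
print(hex(internet_checksum(bit_flipped)))  # differs, so the flip is detected
print(hex(internet_checksum(reordered)))    # 0xb861 again, so the swap goes unnoticed
```

So checksum error counters that stay at zero narrow things down but do not by themselves prove the path is clean, whereas the much stronger MAC that ssh applies will still trip on the same damage.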





Re: rsync corrupted MAC

2011-10-10 Thread Jeremy Chadwick
On Mon, Oct 10, 2011 at 04:15:25PM -0500, Larry Rosenman wrote:
 Good idea, but, it didn't show ANY errors on EITHER side (both are
 em nics).
 
 Next?
 $ ifconfig em0
 em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
     options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
     ether 00:30:48:2e:99:ba
     inet 192.147.25.65 netmask 0xff00 broadcast 192.147.25.255
     inet6 fe80::230:48ff:fe2e:99ba%em0 prefixlen 64 scopeid 0x1
     inet 192.147.25.45 netmask 0xff00 broadcast 192.147.25.255
     inet 192.147.25.11 netmask 0xff00 broadcast 192.147.25.255
     nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
     media: Ethernet autoselect (100baseTX full-duplex)
     status: active
 $ uname -a
 FreeBSD thebighonker.lerctr.org 8.2-STABLE FreeBSD 8.2-STABLE #45: Sat Oct  8 10:57:43 CDT 2011 r...@thebighonker.lerctr.org:/usr/obj/usr/src/sys/THEBIGHONKER  amd64

 $ ifconfig em0
 em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
     options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
     ether 00:30:48:8e:9f:f3
     inet 192.168.200.4 netmask 0xff00 broadcast 192.168.200.255
     inet6 fe80::230:48ff:fe8e:9ff3%em0 prefixlen 64 scopeid 0x1
     nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
     media: Ethernet autoselect (1000baseT full-duplex)
     status: active
 $ uname -a
 FreeBSD borg.lerctr.org 9.0-BETA3 FreeBSD 9.0-BETA3 #1: Sun Oct  9 10:03:42 CDT 2011 r...@borg.lerctr.org:/usr/obj/usr/src/sys/BORG-DTRACE  amd64

Can you please provide output from the following commands executed on
the machine showing the problem?  The above commands show nothing
useful, other than the fact that one machine is at 100/full and the
other is at 1000/full (I don't know your network setup).  Commands:

* netstat -inbd -I em0
* sysctl -a dev.em.0
* Issue command sysctl dev.em.0.debug=1, then type dmesg and
  provide all of the new output you will see at the bottom that
  pertains to the NIC

If you Google this problem, you will find that the majority of the time
it's caused by NIC drivers acting oddly.

Also, I believe the em(4) driver in 9.x is slightly different than on
8.x, so I'm CC'ing Jack Vogel here.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



Re: rsync corrupted MAC

2011-10-10 Thread Larry Rosenman

On Mon, 10 Oct 2011, Jeremy Chadwick wrote:


Can you please provide output from the following commands executed on
the machine showing the problem?  The above commands show nothing
useful, other than the fact that one machine is at 100/full and the
other is at 1000/full (I don't know your network setup).  Commands:

* netstat -inbd -I em0
* sysctl -a dev.em.0
* Issue command sysctl dev.em.0.debug=1, then type dmesg and
 provide all of the new output you will see at the bottom that
 pertains to the NIC

If you Google this problem, you will find that the majority of the time
it's caused by NIC drivers acting oddly.

Also, I believe the em(4) driver in 9.x is slightly different than on
8.x, so I'm CC'ing Jack Vogel here.




from 9.0:

Name   Mtu Network       Address              Ipkts Ierrs Idrop      Ibytes    Opkts Oerrs       Obytes Coll Drop
em0   1500 <Link#1>      00:30:48:8e:9f:f3 69776975     0     0 59660392277 52592789     0 104743924118    0    0
em0   1500 192.168.200.0 192.168.200.4     69759773     -     - 58681934612 96397272     - 104003761109    -    -
em0   1500 fe80::230:48f fe80::230:48ff:fe        0     -     -           0        3     -          248    -    -



dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x1096 subvendor=0x15d9 
subdevice=0x class=0x02
dev.em.0.%parent: pci6
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 3
dev.em.0.eee_control: 0
dev.em.0.link_irq: 0
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 21755
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1851969
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 30720
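
With this many counters it is easy to miss the ones that are not zero. A small filter like the following would pick them out. It is a sketch only: the list of name suffixes treated as "error-like" is my assumption, and a non-zero value (tx_dma_fail in the output above) is not by itself shown to be related to the corrupted MAC.

```shell
#!/bin/sh
# Sketch: print sysctl counters that look like error counts and are non-zero.
# Reads "name: value" lines on stdin. The suffix list is an assumption.
nonzero_error_counters() {
    awk -F': ' '
        $1 ~ /(fail|dropped|overruns|timeouts)$/ && $2 + 0 > 0 {
            print $1 ": " $2
        }
    '
}
```

On FreeBSD this would be used as `sysctl dev.em.0 | nonzero_error_counters`.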

Re: rsync corrupted MAC

2011-10-10 Thread Jack Vogel
Well, for a start I'd get both interfaces at the same speed, sounds like a
hardware issue of some sort, cable or switch maybe?

Jack


On Mon, Oct 10, 2011 at 5:42 PM, Larry Rosenman l...@lerctr.org wrote:

 On Mon, 10 Oct 2011, Jeremy Chadwick wrote:

  On Mon, Oct 10, 2011 at 04:15:25PM -0500, Larry Rosenman wrote:

 On 10/10/2011 3:57 PM, Louis Mamakos wrote:

 On Oct 10, 2011, at 2:38 PM, Larry Rosenman wrote:

  On 10/10/2011 10:47 AM, John Baldwin wrote:

 On Sunday, October 09, 2011 5:06:26 pm Larry Rosenman wrote:

 Any ideas on which side or what might be broke here?

 ler/MAIL-ARCHIVE/2008/12/INBOX
 Corrupted MAC on input.
 Disconnecting: Packet corrupt
 rsync: connection unexpectedly closed (33845045 bytes received so far) [receiver]
 rsync error: error in rsync protocol data stream (code 12) at io.c(605) [receiver=3.0.9]
 rsync: connection unexpectedly closed (1450 bytes received so far) [generator]
 rsync error: unexplained error (code 255) at io.c(605) [generator=3.0.9]

 I've had somewhat similar issues (ssh getting corruption in its data stream)
 when a NIC in my netbook was corrupting packet data when it ran at 1G (it
 worked fine at 10/100).  Pyun eventually fixed the issue by applying enough
 workarounds (it was likely a hardware bug in the NIC's chipset).  However, it
 wasn't easy to debug unfortunately. :(

  Any ideas on where to start?

 from the 8.2 box (tbh.lerctr.org in the script):

 8.2-PIX-Provider-Internet-Motorola SBG6580
 (Time-Warner)-Trendnet TEG-160WS Gig switch-9.0 box (borg.lerctr.org).

 So, where do I start?

 I'd turn off IP / TCP / UDP checksum offloading on your NIC if it
 supports it, and see if you are getting network layer checksum errors.  If
 the IP checksum is wrong, then it happened on the last hops between the NIC
 and memory or across the previous network hop.
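
A rough sketch of the checksum-offload test described above, assuming FreeBSD's ifconfig and netstat. The function names are hypothetical, and the filter assumes each counter line in the `netstat -s` output begins with its count.

```shell
#!/bin/sh
# Sketch only; assumes FreeBSD. Function names are hypothetical.

# Disable receive and transmit checksum offload on an interface so the
# kernel, not the NIC, verifies and generates the checksums.
csum_offload_off() {
    ifconfig "$1" -rxcsum -txcsum
}

# Filter "netstat -s"-style text on stdin: keep only the checksum
# counters whose leading count is non-zero.
nonzero_checksum_lines() {
    awk 'tolower($0) ~ /checksum/ && $1 + 0 > 0'
}
```

As root that would be `csum_offload_off em0`, then rerun the transfer and check `netstat -s | nonzero_checksum_lines` on both ends.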



  Good idea, but, it didn't show ANY errors on EITHER side (both are
 em nics).

 Next?
 $ ifconfig em0
 em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
ether 00:30:48:2e:99:ba
inet 192.147.25.65 netmask 0xff00 broadcast 192.147.25.255
inet6 fe80::230:48ff:fe2e:99ba%em0 prefixlen 64 scopeid 0x1
inet 192.147.25.45 netmask 0xff00 broadcast 192.147.25.255
inet 192.147.25.11 netmask 0xff00 broadcast 192.147.25.255
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
media: Ethernet autoselect (100baseTX full-duplex)
status: active
 $
 $ uname -a
 FreeBSD thebighonker.lerctr.org 8.2-STABLE FreeBSD 8.2-STABLE #45:
 Sat Oct  8 10:57:43 CDT 2011
 r...@thebighonker.lerctr.org:/usr/obj/usr/src/sys/THEBIGHONKER
 amd64
 $



 $ ifconfig em0
 em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
ether 00:30:48:8e:9f:f3
inet 192.168.200.4 netmask 0xff00 broadcast 192.168.200.255
inet6 fe80::230:48ff:fe8e:9ff3%em0 prefixlen 64 scopeid 0x1
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT full-duplex)
status: active
 $ uname -a
 FreeBSD borg.lerctr.org 9.0-BETA3 FreeBSD 9.0-BETA3 #1: Sun Oct  9
 10:03:42 CDT 2011
 r...@borg.lerctr.org:/usr/obj/usr/src/sys/BORG-DTRACE  amd64
 $


 Can you please provide output from the following commands executed on
 the machine showing the problem?  The above commands show nothing
 useful, other than the fact that one machine is at 100/full and the
 other is at 1000/full (I don't know your network setup).  Commands:

 * netstat -inbd -I em0
 * sysctl -a dev.em.0
 * Issue command sysctl dev.em.0.debug=1, then type dmesg and
  provide all of the new output you will see at the bottom that
  pertains to the NIC

 If you Google this problem, you will find that the majority of the time
 it's caused by NIC drivers acting oddly.

 Also, I believe the em(4) driver in 9.x is slightly different than on
 8.x, so I'm CC'ing Jack Vogel here.



 from 9.0:

 Name  Mtu   Network        Address            Ipkts     Ierrs  Idrop  Ibytes       Opkts     Oerrs  Obytes        Coll  Drop
 em0   1500  <Link#1>       00:30:48:8e:9f:f3  69776975  0      0      59660392277  52592789  0      104743924118  0     0
 em0   1500  192.168.200.0  192.168.200.4      69759773  -      -      58681934612  96397272  -      104003761109  -     -
 em0   1500  fe80::230:48f  fe80::230:48ff:fe  0         -      -      0            3         -      248           -     -


 dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
 dev.em.0.%driver: em
 dev.em.0.%location: slot=0 function=0
 dev.em.0.%pnpinfo: vendor=0x8086 device=0x1096 subvendor=0x15d9
 subdevice=0x class=0x02
 dev.em.0.%parent: pci6
 dev.em.0.nvm: -1
 dev.em.0.debug: -1
 dev.em.0.rx_int_delay: 0
 dev.em.0.tx_int_delay: 66
 dev.em.0.rx_abs_int_delay: 66
 dev.em.0.tx_abs_int_delay: 66
 dev.em.0.rx_processing_limit: 100
 

rsync corrupted MAC

2011-10-09 Thread Larry Rosenman

Any ideas on which side or what might be broke here?

ler/MAIL-ARCHIVE/2008/12/INBOX
Corrupted MAC on input.
Disconnecting: Packet corrupt
rsync: connection unexpectedly closed (33845045 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [receiver=3.0.9]
rsync: connection unexpectedly closed (1450 bytes received so far) [generator]
rsync error: unexplained error (code 255) at io.c(605) [generator=3.0.9]

The script:

#!/bin/sh
/usr/local/bin/rsync -Cavz --delete r...@tbh.lerctr.org:/etc/ \
/vault/backup/TBH/etc/
/usr/local/bin/rsync -Cavz --delete r...@tbh.lerctr.org:/home/ \
/vault/backup/TBH/home/
#/usr/local/bin/rsync -Cavz --delete r...@tbh.lerctr.org:/usr/local/pgsql/backups/ \
#/vault/backup/TBH/pgsql/
/usr/local/bin/rsync -Cavz --delete r...@tbh.lerctr.org:/var/named/ \
/vault/backup/TBH/named/


The failure seems to move around: it hits at a random spot in a file,
and on the next run it hits a different file.

Ideas?

Source is 8.2-STABLE, and the Destination/controller is 9.0-BETA3.
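
Since "Corrupted MAC on input" is reported by ssh rather than by rsync itself, one way to take rsync out of the picture would be to push a block of random data through the same transport and compare checksums. This is a sketch under assumptions: `stream_check` is a hypothetical helper, the remote side is assumed to run a plain `cat`, and the user and host in the usage line are placeholders.

```shell
#!/bin/sh
# Sketch: pipe SIZE_MB megabytes of random data through a command and
# verify it comes back unchanged. Returns 0 on a clean round trip,
# non-zero if the data was altered or the transport died part way.
# Usage: stream_check SIZE_MB CMD [ARGS...]
stream_check() {
    mb=$1; shift
    src=$(mktemp) || return 2
    dd if=/dev/urandom of="$src" bs=1048576 count="$mb" 2>/dev/null
    want=$(cksum < "$src")            # checksum of what was sent
    got=$("$@" < "$src" | cksum)      # checksum of what came back
    rm -f "$src"
    [ "$want" = "$got" ]
}
```

For example, `stream_check 100 ssh user@remote.example cat` run repeatedly from each end would show whether the problem reproduces without rsync and in which direction.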


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 512-248-2683 E-Mail: l...@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org