Re: [Mikrotik Users] Is anyone else losing RB493Gs?

2016-10-17 Thread Nathan Anderson via Mikrotik-users
...this was what I was thinking of (Samsung NAND issues); turns out it was a 
lot longer ago than I remembered it being: 
http://forum.mikrotik.com/viewtopic.php?t=61321

Supposedly 5.18 included the workaround, and clearly you weren't running 5.x. 
:-/

(http://wiki.mikrotik.com/wiki/Manual:RouterBOARD_bad_blocks)

-- Nathan

-Original Message-
From: mikrotik-users-boun...@wispa.org 
[mailto:mikrotik-users-boun...@wispa.org] On Behalf Of Nathan Anderson via 
Mikrotik-users
Sent: Monday, October 17, 2016 5:08 PM
To: 'Scott Lambert'; 'Mikrotik Users'; TJ Trout
Subject: Re: [Mikrotik Users] Is anyone else losing RB493Gs?

Also, the large count of failing NAND blocks is not symptomatic of the 
capacitor issue.  On boards with failing caps, they will just start to randomly 
crash/halt, and as the caps continue to lose integrity, it will happen with 
increasing frequency, and eventually the boards will just fail to boot up 
entirely.

I *do* recall that there was a particular NAND chip that MikroTik started using 
in later batches of 4xx boards that people reported high failure rates of.  I 
believe MT eventually worked around the issue in software.  I think there was 
even a large wiki article about it; I'll see if I can dig it up.

On the boards that have "failed", have you tried, out of curiosity, to format 
the NAND in the bootloader and re-Netinstall?

-- Nathan

-Original Message-
From: mikrotik-users-boun...@wispa.org 
[mailto:mikrotik-users-boun...@wispa.org] On Behalf Of Scott Lambert via 
Mikrotik-users
Sent: Monday, October 17, 2016 4:22 PM
To: TJ Trout
Cc: Mikrotik Users
Subject: Re: [Mikrotik Users] Is anyone else losing RB493Gs?

On Mon, Oct 17, 2016 at 03:20:19PM -0700, TJ Trout wrote:
> Are you sure it's not capacitor plague? Easy to fix with a smd rework
> station

I don't see any leakage or doming of the three largeish electrolytic
capacitors on each board.  I am not even slightly tempted to attempt
to test all of the SM caps.  But I might try one or two if you have
suspects.  No SMD rework station either.

I'm actually quite happy to have the 4xx series MTU limitations leave
the network.  I just don't like spending more money for similar
performance / port count before we have to.

The affected serial numbers cover a fairly wide range.  I was hoping
to use age as a predictor, but the dead units are scattered from
25E0017443A5 to 25E0017A1420.  Assuming the numbers are actually
sequential, they are separated by 381,051 units.  Maybe that's one
production batch?  I doubt it.

Off topic, I have one other RB493G in the dead drawer with a SN of
37970127C53D which was killed by lightning.  I'll bet that was a
different batch. :-)
 
> On Oct 17, 2016 12:55 PM, "Scott Lambert via Mikrotik-users" 
><mikrotik-users@wispa.org> wrote:
> 
> > We deployed a lot of RB493Gs a few years back.  They are starting to show
> > 80 to 95% bad blocks on their flash storage.  I've been losing two to
> > three per month for a few months now.
> >
> > The initial symptom is the automated ssh process which gathers IP pool
> > utilization to feed MRTG starts reporting that it cannot log in.  When I
> > go to look at the problem, all users are gone and I can only log in as
> > admin with no password.  They do not survive a reboot.  But so far, if I
> > don't reboot them, they keep doing what they were supposed to do.  They
> > just allow anyone to login with the default username and password.
> >
> > So far, I've not lost any RB450Gs, but being from the same family, I
> > half expect them to start going.  I have many fewer of those.
> >
> > I'm not upset about losing these things after 3 to 5 years. I am curious
> > if others are seeing the same thing happen on your networks.
> >
> > If others have left their RB493Gs on older firmware and are not seeing
> > issues, maybe it's a 6.30+ bug on the RB493G?  Most of mine have been
> > brought up to 6.30+ to work around some PPPoE + mangle/queue tree
> > issues.  (The MSS mangle rules don't get inserted before rule 0 leading
> > to PathMTU issues.)
> >
> > If it is happening to others, maybe I need to order 16 to 21
> > replacements and do some preventative swaps.
> >
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.32.3"
> > "RouterOS RB493G","6.34.4"
> > "RouterOS RB493G","6.7"
> > "RouterOS RB493G","6.32.3"
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.30.4"
> > "RouterOS RB493G","6.34.4"
> > "RouterOS RB493G","6.30.4"
> > "RouterOS RB493G"

Re: [Mikrotik Users] Is anyone else losing RB493Gs?

2016-10-17 Thread Nathan Anderson via Mikrotik-users
Also, the large count of failing NAND blocks is not symptomatic of the 
capacitor issue.  On boards with failing caps, they will just start to randomly 
crash/halt, and as the caps continue to lose integrity, it will happen with 
increasing frequency, and eventually the boards will just fail to boot up 
entirely.

I *do* recall that there was a particular NAND chip that MikroTik started using 
in later batches of 4xx boards that people reported high failure rates of.  I 
believe MT eventually worked around the issue in software.  I think there was 
even a large wiki article about it; I'll see if I can dig it up.

On the boards that have "failed", have you tried, out of curiosity, to format 
the NAND in the bootloader and re-Netinstall?

-- Nathan

-Original Message-
From: mikrotik-users-boun...@wispa.org 
[mailto:mikrotik-users-boun...@wispa.org] On Behalf Of Scott Lambert via 
Mikrotik-users
Sent: Monday, October 17, 2016 4:22 PM
To: TJ Trout
Cc: Mikrotik Users
Subject: Re: [Mikrotik Users] Is anyone else losing RB493Gs?

On Mon, Oct 17, 2016 at 03:20:19PM -0700, TJ Trout wrote:
> Are you sure it's not capacitor plague? Easy to fix with a smd rework
> station

I don't see any leakage or doming of the three largeish electrolytic
capacitors on each board.  I am not even slightly tempted to attempt
to test all of the SM caps.  But I might try one or two if you have
suspects.  No SMD rework station either.

I'm actually quite happy to have the 4xx series MTU limitations leave
the network.  I just don't like spending more money for similar
performance / port count before we have to.

The affected serial numbers cover a fairly wide range.  I was hoping
to use age as a predictor, but the dead units are scattered from
25E0017443A5 to 25E0017A1420.  Assuming the numbers are actually
sequential, they are separated by 381,051 units.  Maybe that's one
production batch?  I doubt it.

Off topic, I have one other RB493G in the dead drawer with a SN of
37970127C53D which was killed by lightning.  I'll bet that was a
different batch. :-)
 
> On Oct 17, 2016 12:55 PM, "Scott Lambert via Mikrotik-users" 
><mikrotik-users@wispa.org> wrote:
> 
> > We deployed a lot of RB493Gs a few years back.  They are starting to show
> > 80 to 95% bad blocks on their flash storage.  I've been losing two to
> > three per month for a few months now.
> >
> > The initial symptom is the automated ssh process which gathers IP pool
> > utilization to feed MRTG starts reporting that it cannot log in.  When I
> > go to look at the problem, all users are gone and I can only log in as
> > admin with no password.  They do not survive a reboot.  But so far, if I
> > don't reboot them, they keep doing what they were supposed to do.  They
> > just allow anyone to login with the default username and password.
> >
> > So far, I've not lost any RB450Gs, but being from the same family, I
> > half expect them to start going.  I have many fewer of those.
> >
> > I'm not upset about losing these things after 3 to 5 years. I am curious
> > if others are seeing the same thing happen on your networks.
> >
> > If others have left their RB493Gs on older firmware and are not seeing
> > issues, maybe it's a 6.30+ bug on the RB493G?  Most of mine have been
> > brought up to 6.30+ to work around some PPPoE + mangle/queue tree
> > issues.  (The MSS mangle rules don't get inserted before rule 0 leading
> > to PathMTU issues.)
> >
> > If it is happening to others, maybe I need to order 16 to 21
> > replacements and do some preventative swaps.
> >
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.32.3"
> > "RouterOS RB493G","6.34.4"
> > "RouterOS RB493G","6.7"
> > "RouterOS RB493G","6.32.3"
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.30.4"
> > "RouterOS RB493G","6.34.4"
> > "RouterOS RB493G","6.30.4"
> > "RouterOS RB493G","6.32.3"
> > "RouterOS RB493G","6.18"
> > "RouterOS RB493G","6.18"
> > "RouterOS RB493G","6.7"
> > "RouterOS RB493G","6.18"
> > "RouterOS RB493G","6.30.4"
> >
> > "RouterOS RB450G","6.7"
> > "RouterOS RB450G","6.32.3"
> > "RouterOS RB450G","6.32.3"
> > "RouterOS RB450G","6.34.6"
> > "RouterOS RB450G","6.34.6"
> > "RouterOS RB450G","6.27"
> > "RouterOS RB450G","6.34.3"
> >
> > --
> > Scott Lambert    KC5MLE    Unix SysAdmin
> > lamb...@lambertfam.org
> > ___
> > Mikrotik-users mailing list
> > Mikrotik-users@wispa.org
> > http://lists.wispa.org/mailman/listinfo/mikrotik-users
> >

-- 
Scott Lambert    KC5MLE    Unix SysAdmin
lamb...@lambertfam.org
___
Mikrotik-users mailing list
Mikrotik-users@wispa.org
http://lists.wispa.org/mailman/listinfo/mikrotik-users



Re: [Mikrotik Users] Is anyone else losing RB493Gs?

2016-10-17 Thread Scott Lambert via Mikrotik-users
On Mon, Oct 17, 2016 at 07:03:01PM -0400, Justin Miller via Mikrotik-users 
wrote:
> In 6.something they added a flash refresh feature. Maybe this is
> prematurely killing the flash or just showing an issue before it
> becomes one.  Hard to tell.
>
> I've had a few devices with flash that fails, but not as much as you
> describe.

Ours may happen to run in warmer boxes during the summer.  No telling.

The part I don't like is not knowing the problem is coming until
authentication information disappears.  One day it works, the next it's
gone.  There is nothing in the remote syslog.
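
One way to get some warning would be to poll the bad-block percentage
before the users disappear.  The following is only a rough Python sketch.
It assumes the text returned by "/system resource print" includes a
"bad-blocks: N%" line; the exact output format, the sample text, the
function name bad_block_percent, and the 10% threshold are all assumptions
made for illustration.

```python
import re
from typing import Optional

# Hypothetical alert threshold; choose a value that suits your fleet.
ALERT_PERCENT = 10.0


def bad_block_percent(resource_output: str) -> Optional[float]:
    """Extract the bad-blocks percentage from resource output text.

    Returns None when no bad-blocks line is present.
    """
    match = re.search(r"^\s*bad-blocks:\s*([0-9.]+)%", resource_output, re.MULTILINE)
    return float(match.group(1)) if match else None


# Made-up sample text, not captured from a real router.
SAMPLE = """\
           uptime: 3w2d4h
       bad-blocks: 85.2%
"""

percent = bad_block_percent(SAMPLE)
if percent is not None and percent >= ALERT_PERCENT:
    print(f"WARNING: {percent}% bad blocks")
```

Feeding this from the same ssh job that already gathers IP pool
utilization would keep the check in one place, though how the text is
fetched is left out here on purpose.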

I did an actual count.  We've lost 6 devices out of 22 to bad flash, so
far.  My time sense is horrible.  The rate may be closer to 0.5 to 2 per
month.

-- 
Scott Lambert    KC5MLE    Unix SysAdmin
lamb...@lambertfam.org


Re: [Mikrotik Users] Is anyone else losing RB493Gs?

2016-10-17 Thread Scott Lambert via Mikrotik-users
On Mon, Oct 17, 2016 at 03:20:19PM -0700, TJ Trout wrote:
> Are you sure it's not capacitor plague? Easy to fix with a smd rework
> station

I don't see any leakage or doming of the three largeish electrolytic
capacitors on each board.  I am not even slightly tempted to attempt
to test all of the SM caps.  But I might try one or two if you have
suspects.  No SMD rework station either.

I'm actually quite happy to have the 4xx series MTU limitations leave
the network.  I just don't like spending more money for similar
performance / port count before we have to.

The affected serial numbers cover a fairly wide range.  I was hoping
to use age as a predictor, but the dead units are scattered from
25E0017443A5 to 25E0017A1420.  Assuming the numbers are actually
sequential, they are separated by 381,051 units.  Maybe that's one
production batch?  I doubt it.
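
For anyone who wants to check the arithmetic: reading the two serial
numbers as hexadecimal and subtracting reproduces the 381,051 figure.
A minimal Python sketch follows.  Treating the serials as sequential hex
counters is the same assumption as above, and serial_gap is just an
illustrative name.

```python
def serial_gap(first: str, last: str) -> int:
    """Return the distance between two serial numbers read as hexadecimal."""
    return int(last, 16) - int(first, 16)


# The lowest and highest serial numbers among the dead units.
gap = serial_gap("25E0017443A5", "25E0017A1420")
print(gap)  # 381051
```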

Off topic, I have one other RB493G in the dead drawer with a SN of
37970127C53D which was killed by lightning.  I'll bet that was a
different batch. :-)
 
> On Oct 17, 2016 12:55 PM, "Scott Lambert via Mikrotik-users" 
> <mikrotik-users@wispa.org> wrote:
> 
> > We deployed a lot of RB493Gs a few years back.  They are starting to show
> > 80 to 95% bad blocks on their flash storage.  I've been losing two to
> > three per month for a few months now.
> >
> > The initial symptom is the automated ssh process which gathers IP pool
> > utilization to feed MRTG starts reporting that it cannot log in.  When I
> > go to look at the problem, all users are gone and I can only log in as
> > admin with no password.  They do not survive a reboot.  But so far, if I
> > don't reboot them, they keep doing what they were supposed to do.  They
> > just allow anyone to login with the default username and password.
> >
> > So far, I've not lost any RB450Gs, but being from the same family, I
> > half expect them to start going.  I have many fewer of those.
> >
> > I'm not upset about losing these things after 3 to 5 years. I am curious
> > if others are seeing the same thing happen on your networks.
> >
> > If others have left their RB493Gs on older firmware and are not seeing
> > issues, maybe it's a 6.30+ bug on the RB493G?  Most of mine have been
> > brought up to 6.30+ to work around some PPPoE + mangle/queue tree
> > issues.  (The MSS mangle rules don't get inserted before rule 0 leading
> > to PathMTU issues.)
> >
> > If it is happening to others, maybe I need to order 16 to 21
> > replacements and do some preventative swaps.
> >
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.32.3"
> > "RouterOS RB493G","6.34.4"
> > "RouterOS RB493G","6.7"
> > "RouterOS RB493G","6.32.3"
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.34.6"
> > "RouterOS RB493G","6.30.4"
> > "RouterOS RB493G","6.34.4"
> > "RouterOS RB493G","6.30.4"
> > "RouterOS RB493G","6.32.3"
> > "RouterOS RB493G","6.18"
> > "RouterOS RB493G","6.18"
> > "RouterOS RB493G","6.7"
> > "RouterOS RB493G","6.18"
> > "RouterOS RB493G","6.30.4"
> >
> > "RouterOS RB450G","6.7"
> > "RouterOS RB450G","6.32.3"
> > "RouterOS RB450G","6.32.3"
> > "RouterOS RB450G","6.34.6"
> > "RouterOS RB450G","6.34.6"
> > "RouterOS RB450G","6.27"
> > "RouterOS RB450G","6.34.3"
> >
> > --
> > Scott Lambert    KC5MLE    Unix SysAdmin
> > lamb...@lambertfam.org
> > ___
> > Mikrotik-users mailing list
> > Mikrotik-users@wispa.org
> > http://lists.wispa.org/mailman/listinfo/mikrotik-users
> >

-- 
Scott Lambert    KC5MLE    Unix SysAdmin
lamb...@lambertfam.org


Re: [Mikrotik Users] Is anyone else losing RB493Gs?

2016-10-17 Thread Justin Miller via Mikrotik-users
In 6.something they added a flash refresh feature. Maybe this is prematurely 
killing the flash or just showing an issue before it becomes one. Hard to tell. 

I've had a few devices with flash that fails, but not as much as you describe. 

Justin Miller

 VA SkyWire, LLC
 3114 W Marshall St, Ste A
 Richmond, VA 23230
 Office: (804) 521-4212
 Desk: (804) 591-0500 ext 101
 Fax: (804) 591-1559
 jus...@vaskywire.com

> On Oct 17, 2016, at 6:20 PM, TJ Trout via Mikrotik-users 
>  wrote:
> 
> Are you sure it's not capacitor plague? Easy to fix with a smd rework station
> 
> 
>> On Oct 17, 2016 12:55 PM, "Scott Lambert via Mikrotik-users" 
>> <mikrotik-users@wispa.org> wrote:
>> We deployed a lot of RB493Gs a few years back.  They are starting to show
>> 80 to 95% bad blocks on their flash storage.  I've been losing two to
>> three per month for a few months now.
>> 
>> The initial symptom is the automated ssh process which gathers IP pool
>> utilization to feed MRTG starts reporting that it cannot log in.  When I
>> go to look at the problem, all users are gone and I can only log in as
>> admin with no password.  They do not survive a reboot.  But so far, if I
>> don't reboot them, they keep doing what they were supposed to do.  They
>> just allow anyone to login with the default username and password.
>> 
>> So far, I've not lost any RB450Gs, but being from the same family, I
>> half expect them to start going.  I have many fewer of those.
>> 
>> I'm not upset about losing these things after 3 to 5 years. I am curious
>> if others are seeing the same thing happen on your networks.
>> 
>> If others have left their RB493Gs on older firmware and are not seeing
>> issues, maybe it's a 6.30+ bug on the RB493G?  Most of mine have been
>> brought up to 6.30+ to work around some PPPoE + mangle/queue tree
>> issues.  (The MSS mangle rules don't get inserted before rule 0 leading
>> to PathMTU issues.)
>> 
>> If it is happening to others, maybe I need to order 16 to 21
>> replacements and do some preventative swaps.
>> 
>> "RouterOS RB493G","6.34.6"
>> "RouterOS RB493G","6.32.3"
>> "RouterOS RB493G","6.34.4"
>> "RouterOS RB493G","6.7"
>> "RouterOS RB493G","6.32.3"
>> "RouterOS RB493G","6.34.6"
>> "RouterOS RB493G","6.34.6"
>> "RouterOS RB493G","6.30.4"
>> "RouterOS RB493G","6.34.4"
>> "RouterOS RB493G","6.30.4"
>> "RouterOS RB493G","6.32.3"
>> "RouterOS RB493G","6.18"
>> "RouterOS RB493G","6.18"
>> "RouterOS RB493G","6.7"
>> "RouterOS RB493G","6.18"
>> "RouterOS RB493G","6.30.4"
>> 
>> "RouterOS RB450G","6.7"
>> "RouterOS RB450G","6.32.3"
>> "RouterOS RB450G","6.32.3"
>> "RouterOS RB450G","6.34.6"
>> "RouterOS RB450G","6.34.6"
>> "RouterOS RB450G","6.27"
>> "RouterOS RB450G","6.34.3"
>> 
>> --
>> Scott Lambert    KC5MLE    Unix SysAdmin
>> lamb...@lambertfam.org
>> ___
>> Mikrotik-users mailing list
>> Mikrotik-users@wispa.org
>> http://lists.wispa.org/mailman/listinfo/mikrotik-users


Re: [Mikrotik Users] Is anyone else losing RB493Gs?

2016-10-17 Thread TJ Trout via Mikrotik-users
Are you sure it's not capacitor plague? Easy to fix with a smd rework
station

On Oct 17, 2016 12:55 PM, "Scott Lambert via Mikrotik-users" <
mikrotik-users@wispa.org> wrote:

> We deployed a lot of RB493Gs a few years back.  They are starting to show
> 80 to 95% bad blocks on their flash storage.  I've been losing two to
> three per month for a few months now.
>
> The initial symptom is the automated ssh process which gathers IP pool
> utilization to feed MRTG starts reporting that it cannot log in.  When I
> go to look at the problem, all users are gone and I can only log in as
> admin with no password.  They do not survive a reboot.  But so far, if I
> don't reboot them, they keep doing what they were supposed to do.  They
> just allow anyone to login with the default username and password.
>
> So far, I've not lost any RB450Gs, but being from the same family, I
> half expect them to start going.  I have many fewer of those.
>
> I'm not upset about losing these things after 3 to 5 years. I am curious
> if others are seeing the same thing happen on your networks.
>
> If others have left their RB493Gs on older firmware and are not seeing
> issues, maybe it's a 6.30+ bug on the RB493G?  Most of mine have been
> brought up to 6.30+ to work around some PPPoE + mangle/queue tree
> issues.  (The MSS mangle rules don't get inserted before rule 0 leading
> to PathMTU issues.)
>
> If it is happening to others, maybe I need to order 16 to 21
> replacements and do some preventative swaps.
>
> "RouterOS RB493G","6.34.6"
> "RouterOS RB493G","6.32.3"
> "RouterOS RB493G","6.34.4"
> "RouterOS RB493G","6.7"
> "RouterOS RB493G","6.32.3"
> "RouterOS RB493G","6.34.6"
> "RouterOS RB493G","6.34.6"
> "RouterOS RB493G","6.30.4"
> "RouterOS RB493G","6.34.4"
> "RouterOS RB493G","6.30.4"
> "RouterOS RB493G","6.32.3"
> "RouterOS RB493G","6.18"
> "RouterOS RB493G","6.18"
> "RouterOS RB493G","6.7"
> "RouterOS RB493G","6.18"
> "RouterOS RB493G","6.30.4"
>
> "RouterOS RB450G","6.7"
> "RouterOS RB450G","6.32.3"
> "RouterOS RB450G","6.32.3"
> "RouterOS RB450G","6.34.6"
> "RouterOS RB450G","6.34.6"
> "RouterOS RB450G","6.27"
> "RouterOS RB450G","6.34.3"
>
> --
> Scott Lambert    KC5MLE    Unix SysAdmin
> lamb...@lambertfam.org
> ___
> Mikrotik-users mailing list
> Mikrotik-users@wispa.org
> http://lists.wispa.org/mailman/listinfo/mikrotik-users
>