Re: [pfSense] 3 hard locks this week... any ideas?

2016-10-16 Thread WebDawg
On Thu, Sep 8, 2016 at 2:29 PM, Todd Russell  wrote:
> Final update on this issue. When I took it down, I pulled the drive and
> started a Level 2 SpinRite on it while I took out and reseated the RAM then
> ran memtest. I found no errors in either test, so I also took out the Intel
> 4 port gigabit card and reseated that, then put everything back together.
> It has been running for a week straight now with no hiccups of any kind, so
> either the SpinRite forced the drive to correct some read errors or
> removing and reseating the RAM got around some dust or oxidation on the
> contacts. It wouldn't be the first time reseating the RAM cleared otherwise
> unexplainable issues with a machine for me, so I will assume that was the
> case. I wish I'd had time to run the memtest before and after reseating the
> RAM but... AIN'T NOBODY GOT TIME FOR THAT!
>
> Thanks to all for the feedback last week.
>
>
> Peace,
> Todd Russell
> Director of IT and Webmaster
> Saint Joseph Abbey and Seminary College
> 985-867-2266
> 985-789-4319
>


https://en.wikipedia.org/wiki/SpinRite#Solid_state_drives

I mean, even if that card had not been inserted properly, you would have
had an issue.  You should have tested that RAM before the reseat, because
the same applies there.  Many of the comments here are just hearsay.
Hard locks are usually caused by bad hardware or an incompatibility, and in
that case you are lucky if you get some kernel messages or dumps when it
happens that can help you out.  I am glad that you solved your
problem, but it is bad to draw any conclusions that are not based on the
scientific method.
___
pfSense mailing list
https://lists.pfsense.org/mailman/listinfo/list
Support the project with Gold! https://pfsense.org/gold


Re: [pfSense] 3 hard locks this week... any ideas?

2016-10-16 Thread Volker Kuhlmann
On Fri 02 Sep 2016 13:33:35 NZST +1200, compdoc wrote:

> As for me, these days I install only SSDs in desktop systems that run
> 24/7, and also use them as boot drives for servers. Over the years I
> have had only one SSD fail, and it did show pending sectors in SMART.

That's not my observation with SSDs. Which SSD models do you use?
Or better, how do you select your SSDs? That'd be really good to know
from those doing well there.

Thanks,

Volker

-- 
Volker Kuhlmann is list0570 with the domain in header.
http://volker.top.geek.nz/  Please do not CC list postings to me.


Re: [pfSense] 3 hard locks this week... any ideas?

2016-10-16 Thread Volker Kuhlmann
On Fri 02 Sep 2016 10:14:59 NZST +1200, Todd Russell wrote:

> I will just run level 2 SpinRite on the SSD to force the drive to read
> every spot, which should trigger the error correction if that is happening.

Ehh, you use what for that? Toss SpinRite into the bit bucket as
suggested. Log into your pfSense (or any Unix!), obtain root
privileges, and run
  dd bs=16k if=/dev/yourdisk of=/dev/null

Use what you have!! Why install extra cr^H^H^Hstuff? dd *always* works
as expected. Change the buffer size as you see fit, and add an option to
prevent block buffering (if supported by BSD and if it works like
Linux).
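
A minimal sketch of the read pass described above, wrapped in a function so
the device name is explicit. read_scan is a hypothetical helper name, not
something shipped with pfSense, and /dev/ada0 in the usage note is only an
assumed example of a disk device. conv=noerror is the standard dd option
that keeps reading after a failed block instead of stopping.

```shell
#!/bin/sh
# read_scan DEVICE [BLOCKSIZE]   (hypothetical helper name)
# Read every block of DEVICE and discard the data. Nothing is written to
# DEVICE; the only goal is to make the drive read all of its sectors.
read_scan() {
    dev="$1"
    bs="${2:-16k}"    # same buffer size as the dd command above
    if [ ! -r "$dev" ]; then
        echo "read_scan: cannot read $dev" >&2
        return 1
    fi
    # conv=noerror: continue after a read error rather than aborting.
    dd if="$dev" of=/dev/null bs="$bs" conv=noerror
}
```

Run as root, for example: read_scan /dev/ada0 (device name assumed). dd
prints its block counts and any read errors on stderr when it finishes.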

> plenty experience with that scourge.  :/  I did use the diagnostics in the
> web gui to check the SMART info and it didn't say anything out of the
> ordinary, but I have seen at least 2 Samsung SSDs over the years lose data
> with no warning and no errors in SMART.

The SMART info is effectively a status collected over time. Sectors that go
bad without any detectable warning by necessity don't give SMART a chance.
Ditto disks that fail suddenly and catastrophically. SMART is not a
fix-all, but it is very, very useful in many cases.
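
For reading those SMART counters from a shell rather than the web GUI, here
is a rough sketch. It assumes smartctl from smartmontools is available and
that the drive reports its attributes in the usual ATA table layout, with
the raw value in the tenth column. smart_raw is a hypothetical helper name,
and attribute names such as Current_Pending_Sector vary by vendor, so treat
them as examples only.

```shell
#!/bin/sh
# smart_raw ATTRIBUTE_NAME   (hypothetical helper name)
# Reads `smartctl -A` output on stdin and prints the raw value (column 10)
# of the named attribute. Prints nothing if the attribute is not listed.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Example (device name assumed):
#   smartctl -A /dev/ada0 | smart_raw Current_Pending_Sector
```

An empty result only means the drive does not list that attribute under that
name; it says nothing about the drive's health.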

Volker

-- 
Volker Kuhlmann is list0570 with the domain in header.
http://volker.top.geek.nz/  Please do not CC list postings to me.


Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-08 Thread Espen Johansen
Compdoc:
Your SpinRite comments just show how dangerous some knowledge is without
proper understanding. SpinRite does indeed force SSDs to "fix" themselves,
because it reads extensively (which causes heat) and forces "half-working"
areas to be marked bad. Most SSDs have minor defects from day one, just like
most spinning drives have bad sectors marked when they arrive from the factory.
You can force the same result by reading all parts of an SSD
extensively. SpinRite does not by definition fix an SSD, but it does
make the firmware (software) in the drive detect read errors that might not
be relocated during normal operation. I have forced SSDs to fix themselves
since I got my first SSD more than 10 years ago, often with the help of
SpinRite.

-lsf


Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-08 Thread Todd Russell
Final update on this issue. When I took it down, I pulled the drive and
started a Level 2 SpinRite on it while I took out and reseated the RAM then
ran memtest. I found no errors in either test, so I also took out the Intel
4 port gigabit card and reseated that, then put everything back together.
It has been running for a week straight now with no hiccups of any kind, so
either the SpinRite forced the drive to correct some read errors or
removing and reseating the RAM got around some dust or oxidation on the
contacts. It wouldn't be the first time reseating the RAM cleared otherwise
unexplainable issues with a machine for me, so I will assume that was the
case. I wish I'd had time to run the memtest before and after reseating the
RAM but... AIN'T NOBODY GOT TIME FOR THAT!

Thanks to all for the feedback last week.


Peace,
Todd Russell
Director of IT and Webmaster
Saint Joseph Abbey and Seminary College
985-867-2266
985-789-4319

Please consider helping Saint Joseph Abbey and Seminary College recover
from the devastating flood waters that overtook our campus on March 11,
2016.
http://helptheabbey.com

---

http://saintjosephabbey.com

For IT Requests, please submit a ticket at:
https://docs.google.com/forms/d/1e3PCRvnEVNU5-rVFolf9zivA9-m41Nj07eDjjCtFwpI/viewform?usp=send_form#start=invite


Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread compdoc
>I'd suggest that before you slag programs, you not rely on old, outdated,
>biased information.

SpinRite 6 is a twelve-year-old program that seemed cool back in the day, but I
would never recommend it to anyone now.

Repairing computers for a living, I'm always on the lookout for useful tools. I
don’t find SpinRite useful.

I once watched SpinRite work on a failing HDD for a day and a half, and it did
nothing more than place additional wear on the drive. Does that make me biased?

Speaking of outdated... In 2013 Steve Gibson said he would finally update it,
but nothing so far?

Here's an interesting quote:

Gibson said that he could "see absolutely no possible benefit to running
SpinRite on a solid-state drive" and later "SpinRite is all about mechanics and
magnetics, neither of which exist, by design, in an SSD"

And for your information, SMART records events. Some of those events will
happen under load, since that’s the nature of mechanical drives.

However, a bad sector is a bad sector, and load or no load, that does not change.
Once they start to fail, you replace the HDD rather than try to repair it.

Modern drives automatically reallocate sectors, meaning bad sectors are
replaced with spares. Not even SpinRite can recover lost data from these spare
sectors that have never been used before.

As for me, these days I install only SSDs in desktop systems that run 24/7, and
also use them as boot drives for servers. Over the years I have had only one
SSD fail, and it did show pending sectors in SMART.


Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread WebDawg
On Thu, Sep 1, 2016 at 6:43 PM, Walter Parker  wrote:
> On Thu, Sep 1, 2016 at 3:06 PM, compdoc  wrote:
>
>> >>Coming back tonight to do memtest, SpinRite on the SSD, etc...,
>>
>> Spinrite on an ssd is a terrible idea. It's an ancient program thats even a
>> bad idea to use on hard drives.
>>
>> It doesn't even work on drives larger than 1TB, because it was written in a
>> time when drives were not that big. And there was no such thing as an SSD
>> back then. Toss spinrite in the trash.
>>
>> If you want to know if a drive is failing, you just have to ask it. Just
>> read the SMART info recorded in the drive.
>>
>> Memtest86+ on the other hand is a great idea, but you should let it run as
>> many passes as possible. One or two passes is fine for new equipment, but
>> with old ram that might be flakey, its best to run overnight or at least 4
>> or 5 passes.
>>
>> If the motherboard is 4 or 5 years old, you might check for swollen
>> capacitors, and many of the low cost power supplies go bad in a year or
>> two.
>>
>>
> I suggest you update your knowledge base on SpinRite. It has found a new
> life in helping SSDs fix themselves. FYI, the SMART info is often
> different depending on whether the drive is under load. SpinRite puts the
> drive under load, so you may not see errors on the drive unless you are
> running your own seek application. The size limit is 2TB, and the program
> will have a free update in the near future to support drives >2TB. Most
> recommendations are to use SpinRite in Level 2 mode (read only), but given
> that modern drives have wear leveling, even running it read-write will not
> kill a drive that does caching and basic wear leveling.
>
> I'd suggest that before you slag programs, you not rely on old, outdated,
> biased information. But that is just me...
>
>
> Walter
>
>
>
>
> --

I think I am guilty also, I did not even know it was still developed actively.

I am glad someone is around to reply back and let everyone know that
it still is.


Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread Walter Parker
On Thu, Sep 1, 2016 at 3:06 PM, compdoc  wrote:

> >>Coming back tonight to do memtest, SpinRite on the SSD, etc...,
>
> Spinrite on an ssd is a terrible idea. It's an ancient program thats even a
> bad idea to use on hard drives.
>
> It doesn't even work on drives larger than 1TB, because it was written in a
> time when drives were not that big. And there was no such thing as an SSD
> back then. Toss spinrite in the trash.
>
> If you want to know if a drive is failing, you just have to ask it. Just
> read the SMART info recorded in the drive.
>
> Memtest86+ on the other hand is a great idea, but you should let it run as
> many passes as possible. One or two passes is fine for new equipment, but
> with old ram that might be flakey, its best to run overnight or at least 4
> or 5 passes.
>
> If the motherboard is 4 or 5 years old, you might check for swollen
> capacitors, and many of the low cost power supplies go bad in a year or
> two.
>
>
I suggest you update your knowledge base on SpinRite. It has found a new
life in helping SSDs fix themselves. FYI, the SMART info is often
different depending on whether the drive is under load. SpinRite puts the
drive under load, so you may not see errors on the drive unless you are
running your own seek application. The size limit is 2TB, and the program
will have a free update in the near future to support drives >2TB. Most
recommendations are to use SpinRite in Level 2 mode (read only), but given
that modern drives have wear leveling, even running it read-write will not
kill a drive that does caching and basic wear leveling.

I'd suggest that before you slag programs, you not rely on old, outdated,
biased information. But that is just me...
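
One way to check whether the SMART info really differs once the drive has
been under load is to save the attribute table before and after a full read
pass and compare the raw values. The sketch below assumes the usual
`smartctl -A` table layout (attribute name in column 2, raw value in column
10). smart_diff and the file names are hypothetical, and /dev/ada0 is only
an assumed device name.

```shell
#!/bin/sh
# smart_diff BEFORE_FILE AFTER_FILE   (hypothetical helper name)
# Each file holds saved `smartctl -A` output. Prints one line for every
# attribute whose raw value (column 10) differs between the two snapshots.
smart_diff() {
    awk '
        # First file: remember the raw value of each attribute row.
        # Attribute rows are the ones that start with a numeric ID.
        FNR == NR { if ($1 ~ /^[0-9]+$/) before[$2] = $10; next }
        # Second file: report attributes whose raw value changed.
        $1 ~ /^[0-9]+$/ && ($2 in before) && before[$2] != $10 {
            print $2 ": " before[$2] " -> " $10
        }
    ' "$1" "$2"
}

# Example (device and file names assumed):
#   smartctl -A /dev/ada0 > before.txt
#   ... put the drive under load, e.g. a full read pass ...
#   smartctl -A /dev/ada0 > after.txt
#   smart_diff before.txt after.txt
```

Counters such as power-on hours change on their own, so a line in the output
is a prompt to look closer, not proof of a failing drive.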


Walter




-- 
The greatest dangers to liberty lurk in insidious encroachment by men of
zeal, well-meaning but without understanding.   -- Justice Louis D. Brandeis


Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread WebDawg
On Thu, Sep 1, 2016 at 4:26 PM, Todd Russell  wrote:
> 1 possible clue I didn't mention. Early in the week, I enabled ssh for the
> first time and it started generating ssh keys... but it never finished.
> Hours later I still couldn't ssh in and shrugged my shoulders and forgot
> about it. After the first hard lock reboot, the next time I logged into the
> web console, there were two alerts saying it had started generating ssh
> keys and that it had finished... those were both generated after the
> reboot. The third hard lock happened today while I was working on getting
> ssh in using the key for a user. It happened right at the time when the
> successful ssh should have occurred. Perhaps this suggests something with
> drive access or maybe memory?
>
> Peace,
> Todd Russell
> Director of IT and Webmaster
> Saint Joseph Abbey and Seminary College
> 985-867-2266
> 985-789-4319
>

That sounds like it could be something, but you would have to see if
there is something running in the background spiking the CPU or
something like that.

You could also check what happens on a login...


Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread Todd Russell
1 possible clue I didn't mention. Early in the week, I enabled ssh for the
first time and it started generating ssh keys... but it never finished.
Hours later I still couldn't ssh in and shrugged my shoulders and forgot
about it. After the first hard lock reboot, the next time I logged into the
web console, there were two alerts saying it had started generating ssh
keys and that it had finished... those were both generated after the
reboot. The third hard lock happened today while I was working on getting
ssh in using the key for a user. It happened right at the time when the
successful ssh should have occurred. Perhaps this suggests something with
drive access or maybe memory?

Peace,
Todd Russell
Director of IT and Webmaster
Saint Joseph Abbey and Seminary College
985-867-2266
985-789-4319



Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread Todd Russell
>
> If that supermicro atom board is not ecc then memory could be a
> culprit.  I agree though:  Spinrite on an SSD?
>

See above about SpinRite. I have actually seen Level 2 force a Samsung SSD
to do its error correction and silently recover data that was not being read
back before.


> How are you rebooting it?  Remotely?


Since it was hard-locked, I had to hit the reset button on the front of the
case.


> Are your nic cards good?


They have been running with no problems for maybe 2 years.


>   Is
> your networking equipment good?
>

Connected to Avaya switches and they haven't had any other problems.
Networking within the same VLAN was working fine while the pfsense was
hard-locked.


>
> Never had a hard lock, but I did have drives that would idle out and
> crash pfsense. It is a known issue with BSD and I had to disable idle
> on the drives with WDIDLE.  I replaced those with an SSD though...just
> to get rid of that problem.
>
> If there is nothing in the logs, it could be losing connectivity to
> the drives though..I could never catch the logs with the idle out
> issue because the drives would just drop out of the system.
>

I could see the logs right up to the point it locked, so it was saving
right up to the time of the lock, but with no error messages or anything
else.


>
> Did you have access to the main console when this happened?  Does it
> have a VGA monitor?
>

VGA monitor is connected and there was nothing on the screen aside from the
console menu, so whatever triggered the lock did it before any error
messages were flushed to stdout.

I have the install CD for 2.3.2 and a fresh Intel 710 SSD ready to do a
clean install in case I do not find any other hardware problems.




Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread Todd Russell
I will just run level 2 SpinRite on the SSD to force the drive to read
every spot, which should trigger the error correction if that is happening.
I was planning to check the caps when I get here later tonight as I have
plenty experience with that scourge.  :/  I did use the diagnostics in the
web gui to check the SMART info and it didn't say anything out of the
ordinary, but I have seen at least 2 Samsung SSDs over the years lose data
with no warning and no errors in SMART.

I burned a copy of the latest install disc and I may do a clean install and
reload my config if I can't find anything with the hardware.

Peace,
Todd Russell
Director of IT and Webmaster
Saint Joseph Abbey and Seminary College
985-867-2266
985-789-4319



Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread WebDawg
On Thu, Sep 1, 2016 at 3:53 PM, Todd Russell  wrote:
> Everything had been fine for ages. Had a hard lock Tuesday before lunch...
> couldn't ping it, no response at physical kb, had to hard reboot it.
>
> Came back late that night to apply 2.3.2 update. Had another hard lock
> today a little after noon. Was looking into it and getting set up to ssh in
> from home so I could plan to reboot every night until after Labor Day trip
> when I would look further into it. Then got another hard lock while trying
> to ssh in around 3:30.
>
> Coming back tonight to do memtest, SpinRite on the SSD, etc..., but I was
> wondering if anyone has any ideas of anything that might cause hard locks
> aside from hardware problems? If this was linux, I would blame it on
> systemd, but I don't know if FreeBSD would ever hard lock outside of
> hardware issues.
>
> The hardware is a SuperMicro Atom board I bought from iXSystems installed
> to a Samsung 850 Pro with 8GB ECC RAM.
>
> I know this isn't much to go on, and I am not expecting help with
> troubleshooting, but there was nothing in system logs or dmesg that looked
> out of place after the first 2. Mostly I am curious if others have ever
> seen hard locks happen in FreeBSD and what caused them in their experience.
> Thanks in advance for any help.
>


If that Supermicro Atom board is not ECC, then memory could be a
culprit.  I agree though:  SpinRite on an SSD?

How are you rebooting it?  Remotely?  Are your NICs good?  Is
your networking equipment good?

I have never had a hard lock, but I did have drives that would idle out and
crash pfSense. It is a known issue with BSD, and I had to disable idle
on the drives with WDIDLE.  I replaced those with an SSD though, just
to get rid of that problem.

If there is nothing in the logs, it could be losing connectivity to
the drives though. I could never catch the logs with the idle-out
issue because the drives would just drop out of the system.

Did you have access to the main console when this happened?  Does it
have a VGA monitor?


Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread Moshe Katz
I have seen those symptoms on three different machines over the years, and
all of them were hardware failures - RAM on one of them, power supply on
another, and an old consumer-grade PCI network card on the third. (Most of
the pfSense machines I support are running low-end salvaged hardware, so in
my case it was not unexpected.)

The first two happened with no visible symptoms on the screen, but the
network card failure showed errors on a monitor plugged into the machine
though not in dmesg or any logs.

Moshe



Re: [pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread compdoc
>>Coming back tonight to do memtest, SpinRite on the SSD, etc...,

SpinRite on an SSD is a terrible idea. It's an ancient program that's a
bad idea to use even on hard drives.

It doesn't even work on drives larger than 1TB, because it was written at a
time when drives were not that big. And there was no such thing as an SSD
back then. Toss SpinRite in the trash.

If you want to know if a drive is failing, you just have to ask it. Just
read the SMART info recorded in the drive.

Memtest86+, on the other hand, is a great idea, but you should let it run as
many passes as possible. One or two passes is fine for new equipment, but
with old RAM that might be flaky, it's best to run overnight or at least 4
or 5 passes.

If the motherboard is 4 or 5 years old, you might check for swollen
capacitors, and many of the low-cost power supplies go bad in a year or two.

A bad PSU will have swollen caps and burned components inside, but it can be
risky opening it if you aren't a technician.





[pfSense] 3 hard locks this week... any ideas?

2016-09-01 Thread Todd Russell
Everything had been fine for ages. Had a hard lock Tuesday before lunch...
couldn't ping it, no response at physical kb, had to hard reboot it.

Came back late that night to apply 2.3.2 update. Had another hard lock
today a little after noon. Was looking into it and getting set up to ssh in
from home so I could plan to reboot every night until after Labor Day trip
when I would look further into it. Then got another hard lock while trying
to ssh in around 3:30.

Coming back tonight to do memtest, SpinRite on the SSD, etc..., but I was
wondering if anyone has any ideas of anything that might cause hard locks
aside from hardware problems? If this was linux, I would blame it on
systemd, but I don't know if FreeBSD would ever hard lock outside of
hardware issues.

The hardware is a Supermicro Atom board with 8GB ECC RAM that I bought from
iXsystems, with pfSense installed to a Samsung 850 Pro.

I know this isn't much to go on, and I am not expecting help with
troubleshooting, but there was nothing in system logs or dmesg that looked
out of place after the first 2. Mostly I am curious if others have ever
seen hard locks happen in FreeBSD and what caused them in their experience.
Thanks in advance for any help.

Peace,
Todd Russell
Director of IT and Webmaster
Saint Joseph Abbey and Seminary College
985-867-2266
985-789-4319

Please consider helping Saint Joseph Abbey and Seminary College recover
from the devastating flood waters that overtook our campus on March 11,
2016.
http://helptheabbey.com

---

http://saintjosephabbey.com

For IT Requests, please submit a ticket at:
https://docs.google.com/forms/d/1e3PCRvnEVNU5-rVFolf9zivA9-m41Nj07eDjjCtFwpI/viewform?usp=send_form#start=invite