Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-02-09 Thread Les Mikesell
On Mon, Jan 19, 2015 at 4:53 PM, Charles Polisher cpol...@surewest.net wrote: On Jan 07, 2015 at 01:47:53PM -0600, Les Mikesell wrote: I see a bunch of entries like: ioatdma :00:08.0: Channel halted, chanerr = 2 ioatdma :00:08.0: Channel halted, chanerr = 0 in the logs and one of

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-19 Thread Charles Polisher
On Jan 07, 2015 at 01:47:53PM -0600, Les Mikesell wrote: I see a bunch of entries like: ioatdma :00:08.0: Channel halted, chanerr = 2 ioatdma :00:08.0: Channel halted, chanerr = 0 in the logs and one of these: hrtimer: interrupt took 258633 ns Not sure what those mean. We do

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Les Mikesell
On Wed, Jan 7, 2015 at 12:10 AM, Keith Keller kkel...@wombat.san-francisco.ca.us wrote: On 2015-01-07, Gordon Messmer gordon.mess...@gmail.com wrote: Of course, the other possibility is simply that you've formatted your own filesystems, and they have a maximum mount count or a check interval.

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Les Mikesell
On Wed, Jan 7, 2015 at 1:37 PM, Gary Greene ggre...@minervanetworks.com wrote: Problem is, Gordon, the layer I’m talking about is _below_ the logical layer that filesystems live at, in the block layer, at the mercy of drivers, and firmware that the kernel has zero control over. While in a

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread John R Pierce
On 1/7/2015 11:30 AM, Gary Greene wrote: During the reboot, most cards’ drivers, on init, will invalidate the cache on the card to ensure dirty pages of data don’t get flushed to disk, to prevent scribbling junk data to the platters. From what I recall, this is true of both the megaraid and

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread John R Pierce
On 1/7/2015 12:15 PM, m.r...@5-cent.us wrote: Actually, the WD Reds and similar are just fine. Those are specifically sold for use in small NAS (RAID) environments, so yeah, they are configured 'correctly'.

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Gordon Messmer
On 01/07/2015 05:53 AM, Les Mikesell wrote: Yes - the unattended fsck fails. In that case, there should be logs indicating the cause of the error when it was detected by the kernel. There's probably something wrong with your controller or other hardware. Personally, I'd prefer for the

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Les Mikesell
On Wed, Jan 7, 2015 at 3:30 PM, John R Pierce pie...@hogranch.com wrote: Right... but only cost 133% (about) more than consumer drives, as opposed to the 300% that the server/enterprise grade drives' cost. well, those $$$ drives are likely SAS rather than SATA, and that has other

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Gary Greene
On Jan 7, 2015, at 12:08 PM, John R Pierce pie...@hogranch.com wrote: On 1/7/2015 11:30 AM, Gary Greene wrote: During the reboot, most cards’ drivers, on init, will invalidate the cache on the card to ensure dirty pages of data don’t get flushed to disk, to prevent scribbling junk data to

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread John R Pierce
On 1/7/2015 12:50 PM, m.r...@5-cent.us wrote: Right... but only cost 133% (about) more than consumer drives, as opposed to the 300% that the server/enterprise grade drives' cost. well, those $$$ drives are likely SAS rather than SATA, and that has other advantages... 10k or 15k RPM gives you

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Les Mikesell
On Wed, Jan 7, 2015 at 10:43 AM, Valeri Galtsev galt...@kicp.uchicago.edu wrote: Not junk - these are mostly IBM 3550/3650 boxes - pretty much top of the line in their day (before the M2/3/4 versions). They have Adaptec RAID controllers. I never had Adaptec in _my_ list of good RAID

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Les Mikesell
On Wed, Jan 7, 2015 at 10:15 AM, m.r...@5-cent.us wrote: Yes - the unattended fsck fails. Personally, I'd prefer for the default run to use '-y' in the first place. It's not like I'm more likely than fsck to know how to fix it and it is very inconvenient on remote machines. The recent

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Valeri Galtsev
On Wed, January 7, 2015 10:54 am, Les Mikesell wrote: On Wed, Jan 7, 2015 at 10:43 AM, Valeri Galtsev galt...@kicp.uchicago.edu wrote: Not junk - these are mostly IBM 3550/3650 boxes - pretty much top of the line in their day (before the M2/3/4 versions). They have Adaptec RAID controllers,

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Steve Clark
On 01/07/2015 08:53 AM, Les Mikesell wrote: On Wed, Jan 7, 2015 at 12:10 AM, Keith Keller kkel...@wombat.san-francisco.ca.us wrote: On 2015-01-07, Gordon Messmer gordon.mess...@gmail.com wrote: Of course, the other possibility is simply that you've formatted your own filesystems, and they have

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Les Mikesell
On Wed, Jan 7, 2015 at 9:52 AM, Gordon Messmer gordon.mess...@gmail.com wrote: Every regular file's directory entry on your system is a hard link. There's nothing particular about links (files) that make a filesystem fragile. Agreed, although when there are millions, the fsck fixing it is

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Valeri Galtsev
On Wed, January 7, 2015 10:33 am, Les Mikesell wrote: On Wed, Jan 7, 2015 at 9:52 AM, Gordon Messmer gordon.mess...@gmail.com wrote: Every regular file's directory entry on your system is a hard link. There's nothing particular about links (files) that make a filesystem fragile. Agreed,

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Gary Greene
On Jan 6, 2015, at 5:50 PM, Les Mikesell lesmikes...@gmail.com wrote: On Tue, Jan 6, 2015 at 6:37 PM, Gary Greene ggre...@minervanetworks.com wrote: Almost every controller and drive out there now lies about what is and isn’t flushed to disk, making it nigh on impossible for the

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-07 Thread Gary Greene
On Jan 6, 2015, at 9:23 PM, Gordon Messmer gordon.mess...@gmail.com wrote: On 01/06/2015 04:37 PM, Gary Greene wrote: This has been discussed to death on various lists, including the LKML... Almost every controller and drive out there now lies about what is and isn’t flushed to disk,

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-06 Thread Fran Garcia
On Tue, Jan 6, 2015 at 6:12 PM, Les Mikesell wrote: I've had a few systems with a lot of RAM and very busy filesystems come up with filesystem errors that took a manual 'fsck -y' after what should have been a clean reboot. This is particularly annoying on remote systems where I have to talk

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-06 Thread Gary Greene
On Jan 6, 2015, at 4:28 PM, Fran Garcia franchu.gar...@gmail.com wrote: On Tue, Jan 6, 2015 at 6:12 PM, Les Mikesell wrote: I've had a few systems with a lot of RAM and very busy filesystems come up with filesystem errors that took a manual 'fsck -y' after what should have been a clean

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-06 Thread Les Mikesell
On Tue, Jan 6, 2015 at 6:37 PM, Gary Greene ggre...@minervanetworks.com wrote: Almost every controller and drive out there now lies about what is and isn’t flushed to disk, making it nigh on impossible for the Kernel to reliably know 100% of the time that the data HAS been flushed to disk.

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-06 Thread Gordon Messmer
On 01/06/2015 04:37 PM, Gary Greene wrote: This has been discussed to death on various lists, including the LKML... Almost every controller and drive out there now lies about what is and isn’t flushed to disk, making it nigh on impossible for the Kernel to reliably know 100% of the time that

Re: [CentOS] reboot - is there a timeout on filesystem flush?

2015-01-06 Thread Keith Keller
On 2015-01-07, Gordon Messmer gordon.mess...@gmail.com wrote: Of course, the other possibility is simply that you've formatted your own filesystems, and they have a maximum mount count or a check interval. If Les is having to run fsck manually, as he wrote in his OP, then this is unlikely

[CentOS] reboot - is there a timeout on filesystem flush?

2015-01-06 Thread Les Mikesell
I've had a few systems with a lot of RAM and very busy filesystems come up with filesystem errors that took a manual 'fsck -y' after what should have been a clean reboot. This is particularly annoying on remote systems where I have to talk someone else through the recovery. Is there some time