Re: OT: Processor recommendation for RAID6

2021-04-07 Thread Roger Heflin
I ran some tests on a 4-socket Intel box (Gold 6152, I think) with the files in tmpfs, and with the files interleaved 4-way (I think) got roughly the same speeds you got on your Intels with the defaults. I also tested on my 6-core Ryzen 4500U and got almost the same speed (slightly slower) as on your large

Re: OT: Processor recommendation for RAID6

2021-04-02 Thread Roger Heflin
On Fri, Apr 2, 2021 at 4:13 AM Paul Menzel wrote: > > Dear Linux folks, > > > > Are these values a good benchmark for comparing processors? > > After two years, yes they are. I created 16 10 GB files in `/dev/shm`, > set them up as loop devices, and created a RAID6. For resync speed it > makes
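
For reference, a minimal sketch of that benchmark setup; the file names, loop-device handling, and array name are illustrative, not taken from the original post:

  # create 16 10 GB files in tmpfs and attach each one to a free loop device
  devs=()
  for i in $(seq 0 15); do
      dd if=/dev/zero of=/dev/shm/disk$i bs=1M count=10240
      devs+=("$(losetup -f --show /dev/shm/disk$i)")
  done
  # build a RAID6 array from the loop devices; resync speed shows in /proc/mdstat
  mdadm --create /dev/md0 --level=6 --raid-devices=16 "${devs[@]}"
  cat /proc/mdstat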

Re: Consistent block device references for root= cmdline

2020-06-10 Thread Roger Heflin
No idea if this would still work, but back before label/uuid and lvm in the initrd I had a statically linked "C" program that ran inside the initrd; it searched the likely places a boot device could be (mounted them and looked for a file to confirm it was the right device, then unmounted it), and when it
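
A rough shell equivalent of what that linuxrc did might look like the following; the device list and the marker file are hypothetical, since the original C code is not shown:

  # probe likely boot devices, mount each read-only, and check for a marker file
  for dev in /dev/hda1 /dev/hdb1 /dev/sda1 /dev/sdb1; do
      if mount -o ro "$dev" /mnt 2>/dev/null; then
          if [ -f /mnt/etc/fstab ]; then    # marker file is an assumption
              echo "root looks like $dev"
              umount /mnt
              break
          fi
          umount /mnt
      fi
  done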

Re: Question on MSI support in PCI and PCI-E devices

2015-03-04 Thread Roger Heflin
I know from some data I have seen that between Intel Sandy Bridge and Intel Ivy Bridge the same motherboards stopped delivering INTx reliably (an interrupt was lost under load roughly once every 30 days, and the driver and firmware have no method to recover from the failure). We had to transition to using MSI on some PCI
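
To see whether a given device ended up on MSI or legacy INTx, something like this works (the PCI address is a placeholder):

  # the MSI capability line shows Enable+ when MSI is active on the device
  lspci -vv -s 01:00.0 | grep 'MSI:'
  # MSI interrupts are also visible system-wide
  grep -i msi /proc/interrupts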

Re: read performance is too low compared to write - /dev/sda1

2014-11-14 Thread Roger Heflin
What kind of underlying disk is it? On Fri, Nov 14, 2014 at 7:36 AM, Jagan Teki wrote: > On 14 November 2014 18:50, Roger Heflin wrote: >> If you are robocopying small files you will hit other limits. >> >> Best I have seen with small files is around 30 files/second,

Re: read performance is too low compared to write - /dev/sda1

2014-11-14 Thread Roger Heflin
If you are robocopying small files you will hit other limits. The best I have seen with small files is around 30 files/second, and that involves multiple copies going on. Remember that with small files there are several reads and writes that need to be done to complete a create of a small file, and each
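
As a rough back-of-the-envelope check (the per-I/O cost and I/O count here are assumptions, not figures from the thread), about 4 synchronous I/Os per file create at roughly 8 ms of seek each lands near the quoted rate:

  echo $((1000 / (4 * 8)))    # ~31 files/second, in line with the ~30 above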

Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)

2014-08-01 Thread Roger Heflin
Doesn't NFS have an intr flag to allow kill -9 to work? Whenever I have had that set, it has appeared to work after about 30 seconds or so... without it, kill -9 does not work when the NFS server is missing. On Fri, Aug 1, 2014 at 8:21 PM, Jeff Layton wrote: > On Fri, 1 Aug 2014 07:50:53
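
A sketch of the mount options being discussed (server and export are placeholders; note that on kernels since 2.6.25 'intr' is accepted but ignored, and SIGKILL always interrupts):

  mount -t nfs -o hard,intr server:/export /mnt/nfs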

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Roger Heflin
Pretty much any SMART commands... I was running something that got all of the SMART stats once per hour per disk... and this made it crash about once per week; if you were pushing the disks hard, it appeared to make it even more likely to crash under the SMART cmds. Removing the commands took things up to

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Roger Heflin
I had a 9230... on older kernels it worked "ok" so long as you did not do any SMART commands; I removed it and went to something that works. Marvell appears to be hit and miss, with some cards/chips working right and some not... Do enough SMART cmds and the entire board (all 4 ports) locked up
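
The sort of periodic SMART polling described here would look roughly like the following (smartctl is from smartmontools; the device name is a placeholder):

  # the attribute and health queries a monitor typically issues on each pass
  smartctl -A /dev/sda    # dump the vendor attribute table
  smartctl -H /dev/sda    # overall health self-assessment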

Re: NFS V4 calls for a NFS v3 mount

2014-04-06 Thread Roger Heflin
Nowhere in the mount command did you tell it that this was an NFS version 3 only mount; the mount name itself means nothing to mount, so it tried NFS version 4 first and then NFS version 3. Note this in the man page for nfs: nfsvers=n The NFS protocol version number used to contact the server's
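
A sketch of forcing v3, per the nfsvers option quoted above (server and paths are placeholders):

  mount -t nfs -o nfsvers=3 server:/export /mnt
  # newer util-linux also accepts the shorter form: -o vers=3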

Re: possible viri in tarballs?

2014-02-05 Thread Roger Heflin
Gene, How big is the file you have? Here is what I have, and this is from several different kernels.

wc gadget_multi.txt
150 830 5482 gadget_multi.txt
cksum gadget_multi.txt
3973522114 5482 gadget_multi.txt
ls -l gadget_multi.txt
-rw-rw-r-- 1 root root 5482 Dec 20 09:51 gadget_multi.txt

Re: Probably silly Q about bootable partitions

2013-12-31 Thread Roger Heflin
Rescue-boot it, change the /boot mount line in /etc/fstab to add noauto (like noauto,defaults... or whatever else you already have) and change the last column to 0 to disable fsck on it. It should boot then, and you will have the machine fully up where you can do better debugging. ie, mount /boot may
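
The suggested /etc/fstab change would look roughly like this (device label and filesystem type are illustrative):

  # before: LABEL=/boot  /boot  ext3  defaults         1 2
  # after:  LABEL=/boot  /boot  ext3  noauto,defaults  0 0
  # the final 0 disables fsck on /boot; run 'mount /boot' by hand to see the real error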

Re: Disk schedulers

2008-02-15 Thread Roger Heflin
Lukas Hejtmanek wrote: On Fri, Feb 15, 2008 at 03:42:58PM +0100, Jan Engelhardt wrote: Also consider - DMA (e.g. only UDMA2 selected) - aging disk it's not the case. hdparm reports udma5 is used, if it is reliable with libata. The disk is 3 months old, kernel does not report any errors. And

Re: Hangs and reboots under high loads, oops with DEBUG_SHIRQ

2007-07-31 Thread Roger Heflin
Attila Nagy wrote: On 2007.07.30. 18:19, Alan Cox wrote: MCE: [153103.918654] HARDWARE ERROR [153103.918655] CPU 1: Machine Check Exception: 5 Bank 0: b2401400 [153104.066037] RIP !INEXACT! 10:802569e6 {mwait_idle+0x46/0x60} [153104.145699] TSC 1167e915e93ce

Re: Apparent Deadlock with nfsd/jfs on 2.6.21.1 under bonnie.

2007-05-29 Thread Roger Heflin
Dave Kleikamp wrote: On Tue, 2007-05-29 at 12:16 -0500, Roger Heflin wrote: Dave, There appears to be another, different but similar lockup. The MTBF has risen from 1-2 hours without that patch to >100 hours, so I am fairly sure the patch did correct the original lockup, or at the very

BUG: sleeping function called from invalid context at mm/mempool.c:210

2007-05-18 Thread Roger Heflin
I am getting this bug under heavy IO/NFS on 2.6.21.1. BUG: sleeping function called from invalid context at mm/mempool.c:210 So far I have gotten the error, I believe, 3 times. Roger

Re: Apparent Deadlock with nfsd/jfs on 2.6.21.1 under bonnie.

2007-05-17 Thread Roger Heflin
Dave Kleikamp wrote: I don't have an answer to an ext3 deadlock, but this looks like a jfs problem that was recently fixed in linux-2.6.22-rc1. I had intended to send it to the stable kernel after it was picked up in mainline, but hadn't gotten to it yet. The patch is here:

Re: [NFS] Kernel BUG at mm/slab.c:2380 on NFS server in nfsd (2.6.21.1)

2007-05-16 Thread Roger Heflin
J. Bruce Fields wrote: On Wed, May 16, 2007 at 08:55:19AM -0500, Roger Heflin wrote: Running bonnie over nfs on a RHEL4.4 client against a 2.6.21.1 server got me this crash after about 4 hours of running on the server: This was running lvm -> ext3 -> nfs nfsclient (RHEL4.4). Yipes

Kernel BUG at mm/slab.c:2380 on NFS server in nfsd (2.6.21.1)

2007-05-16 Thread Roger Heflin
Running bonnie over nfs on a RHEL4.4 client against a 2.6.21.1 server got me this crash after about 4 hours of running on the server: This was running lvm -> ext3 -> nfs nfsclient (RHEL4.4). Ideas? Roger May 15 21:10:31 vault1 kernel: [ cut here

Re: Apparent Deadlock with nfsd/jfs on 2.6.21.1 under bonnie.

2007-05-15 Thread Roger Heflin
Dave Kleikamp wrote: Sorry if I'm missing anyone on the reply, but my mail feed is messed up and I'm replying from the gmane archive. On Tue, 15 May 2007 09:08:25 -0500, Roger Heflin wrote: Hello, Running 2.6.21.1 (FC6 Dist), with a RHEL client (client appears to not be having issues) I am

Re: Nvidia MCP55 Machine reboots on ixgb driver load

2007-01-24 Thread Roger Heflin
Auke Kok wrote: [added netdev to CC] Roger Heflin wrote: I have a machine (actually 2 machines) that, upon loading the Intel 10GbE driver (ixgb), reboots. I am using a RHAS 4.4-based distribution with vanilla 2.6.19.2 (the RHAS 4.4.03 kernel also reboots on the ixgb load). I don't

Re: Nvidia MCP55 Machine reboots on ixgb driver load

2007-01-24 Thread Roger Heflin
Jeff V. Merkey wrote: I have seen something similar with the ixgb. Make certain there are **NO** other adapters sharing the PCI bus with the ixgb. There are some serious hardware compatibility issues with the ixgb when mixing it with other cards on the same PCI-X bus, and I have seen power

RE: kernel 2.6.13 buffer strangeness

2005-09-09 Thread Roger Heflin
I saw it mentioned before that the kernel only allows a certain percentage of total memory to be dirty; I thought the number was around 40%. I have seen machines with large amounts of RAM hit the 40%, which then put the writing application into disk wait until certain amounts of things are
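
The knobs in question are the dirty-memory sysctls; a sketch of inspecting and lowering them (the value 10 is just an example):

  sysctl vm.dirty_ratio               # hard cap: writers block above this % of RAM
  sysctl vm.dirty_background_ratio    # background writeback kicks in at this %
  sysctl -w vm.dirty_ratio=10         # lower the cap on a large-memory box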

RE: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Roger Heflin
With the Seagate SATAs I worked with before, I had to actually remove them from the blacklist; this was a couple of months ago with the native SATA Seagate disks. With the drive in the blacklist the drive worked right under light conditions, but under a dd read from the boot Seagate the entire

RE: NCQ support NVidia NForce4 (CK804) SATAII

2005-08-11 Thread Roger Heflin
For high-end stuff, Serverworks is supposed to have some AMD stuff soon (this is a rumor I heard). From what Allen said, the implication to me is that something in the current NVIDIA SATA NCQ chipset is *not* fully under NVIDIA's control, ie they got some piece of technology from someone else and

RE: MCE problem on dual Opteron

2005-08-04 Thread Roger Heflin
If this does not happen immediately at boot up (before the machine has finished all init stuff), it is generally a hardware problem. In my experience with new machines, 75% of the time it will be the CPU itself, and the other 25% it will be a serious memory error. The machines I have dealt with are

ECC Support in Linux

2005-08-01 Thread Roger Heflin
I have had a fair amount of trouble with the limited support for ECC reporting on higher-end dual- and quad-CPU servers, as the reporting is pretty weak. On the Opterons I can tell which CPU gets errors, but mcelog does not isolate things down to the DIMM level properly. Is there a way to do
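
Where an EDAC driver for the memory controller is loaded, per-DIMM (csrow/channel) counters are exposed in sysfs; a sketch of reading them:

  # correctable-error counts broken down by csrow and channel
  grep . /sys/devices/system/edac/mc/mc*/csrow*/ch?_ce_count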

RE: accessing CD fs from initrd

2005-07-25 Thread Roger Heflin
/dev/cdrom is a link to the proper device; if that link is not on the initrd, /dev/cdrom won't work. I previously had some statically linked linuxrc C code (I don't have the code anymore - it was a work-for-hire) that scanned the various locations the cd could be (/dev/hd[abcd...]) and

RE: HELP: NFS mount hangs when attempting to copy file

2005-07-25 Thread Roger Heflin
KDE and GNOME are well above MTU; they don't know anything about MTU, and neither does NFS. If those hang it up, you have a network configuration problem and should probably fix it, as a number of other things will show the problem also. Routers almost always have hard-coded MTU limits, and they
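
One quick way to test for a path-MTU problem of the kind described (the host name is a placeholder):

  # send a don't-fragment probe sized for a 1500-byte MTU (1472 + 28 header bytes)
  ping -M do -s 1472 server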

RE: Memory Management

2005-07-22 Thread Roger Heflin
I have seen RH 3.0 crash on 32GB systems because it had too much memory tied up in write cache. It required Update 2 (this was a while ago) and a change of a parameter in /proc to prevent the crash; it was because an overaggressive write-caching change RH implemented in the kernel resulted in
