Re: OT: Processor recommendation for RAID6

2021-04-07 Thread Roger Heflin
I ran some tests on a 4-socket Intel box with the files in tmpfs (Gold
6152, I think), and with the files interleaved 4-way (I think) I got
roughly the same speeds you got on your Intels with defaults.

I also tested on my 6-core/4500U Ryzen and got almost the same
speed (slightly slower) as on your large Ryzen boxes with many numa
nodes, so it has to be effectively only using a single numa node and a
single cpu.

I did test my 4500U Ryzen machine with fewer cores enabled: 1 core
got 18M, 2 cores got 23M, and 3 got 32M, so it did not appear to
scale past 3 cores.
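For anyone wanting to repeat the fewer-cores experiment, one way to do it without a reboot is the sysfs CPU hotplug interface (a sketch; needs root, the cpu numbers assume a 6-core part, and how the original test actually limited cores isn't stated):

```shell
# Take cpu3 and up offline, leaving 3 cores online (cpu0 has no
# "online" file and always stays up).
for cpu in /sys/devices/system/cpu/cpu[3-9]/online; do
    echo 0 > "$cpu"
done
grep -c '^processor' /proc/cpuinfo   # confirm how many cores remain

# ... rerun the resync benchmark here, then bring the cores back:
for cpu in /sys/devices/system/cpu/cpu*/online; do
    echo 1 > "$cpu"
done
```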

I also tested on an ancient A8-5600K and it was almost the same speed
as the Ryzen.

From the calls there must be a lot of memory reading going on.   And I
got the same speed using shm, using tmpfs, using tmpfs+hugepages, and
using files on a disk that should have been in the file cache.


Re: OT: Processor recommendation for RAID6

2021-04-02 Thread Roger Heflin
On Fri, Apr 2, 2021 at 4:13 AM Paul Menzel  wrote:
>
> Dear Linux folks,
>
>

> > Are these values a good benchmark for comparing processors?
>
> After two years, yes they are. I created 16 10 GB files in `/dev/shm`,
> set them up as loop devices, and created a RAID6. For resync speed it
> makes a difference.
>
> 2 x AMD EPYC 7601 32-Core Processor:34671K/sec
> 2 x Intel Xeon Gold 6248 CPU @ 2.50GHz: 87533K/sec
>
> So, the current state of affairs seems to be that AVX512 instructions
> do help for software RAID, if you want fast rebuild/resync times.
> Getting, for example, a four-core/eight-thread Intel Xeon Gold 5222
> might be useful.
>
> Now, the question remains whether AMD processors could make up for it
> with higher performance or better optimized code, or if AVX512
> instructions are a must,
>
> […]
>
>
> Kind regards,
>
> Paul
>
>
> PS: Here are the commands on the AMD EPYC system:
>
> ```
> $ for i in $(seq 1 16); do truncate -s 10G /dev/shm/vdisk$i.img; done
> $ for i in /dev/shm/v*.img; do sudo losetup --find --show $i; done
> /dev/loop0
> /dev/loop1
> /dev/loop2
> /dev/loop3
> /dev/loop4
> /dev/loop5
> /dev/loop6
> /dev/loop7
> /dev/loop8
> /dev/loop9
> /dev/loop10
> /dev/loop11
> /dev/loop12
> /dev/loop13
> /dev/loop14
> /dev/loop15
> $ sudo mdadm --create /dev/md1 --level=6 --raid-devices=16
> /dev/loop{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
> mdadm: Defaulting to version 1.2 metadata
> mdadm: array /dev/md1 started.
> $ more /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
> [multipath]
> md1 : active raid6 loop15[15] loop14[14] loop13[13] loop12[12]
> loop11[11] loop10[10] loop9[9] loop8[8] loop7[7] loop6[6] loop5[5]
> loop4[4] loop3[3] loop2[2] loop1[1] loop0[0]
>146671616 blocks super 1.2 level 6, 512k chunk, algorithm 2
> [16/16] []
>[>]  resync =  3.9% (416880/10476544)
> finish=5.6min speed=29777K/sec
>
> unused devices: 
> $ more /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
> [multipath]
> md1 : active raid6 loop15[15] loop14[14] loop13[13] loop12[12]
> loop11[11] loop10[10] loop9[9] loop8[8] loop7[7] loop6[6] loop5[5]
> loop4[4] loop3[3] loop2[2] loop1[1] loop0[0]
>146671616 blocks super 1.2 level 6, 512k chunk, algorithm 2
> [16/16] []
>[>]  resync =  4.1% (439872/10476544)
> finish=5.3min speed=31419K/sec
> $ sudo mdadm -S /dev/md1
> mdadm: stopped /dev/md1
> $ sudo losetup -D
> $ sudo rm /dev/shm/vdisk*.img


I think you are testing something else.  Your speeds are way below
what the raw processor can do. You are probably testing memory
speed/numa architecture differences between the two.

On the intel arch there are 2 numa nodes total with 4 channels each,
so the system has 8 usable channels of bandwidth, but an allocation on
a single numa node will only have 4 channels usable (ddr4-2933).

On the epyc there are 8 numa nodes with 2 channels each (ddr4-2666),
so any single memory allocation will have only 2 channels available,
and if the accesses go across the numa interconnect they will be slower.

So (4*2933)/(2*2666) = 2.20, and 2.20 * 34671 = 76276 (fairly close to your results).
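That back-of-envelope ratio can be checked quickly in the shell (channel counts and DDR4 speeds taken from the paragraphs above):

```shell
# Per-numa-node memory bandwidth ratio, Intel node vs EPYC node.
intel=$((4 * 2933))    # 4 channels of DDR4-2933 on one Intel numa node
epyc=$((2 * 2666))     # 2 channels of DDR4-2666 on one EPYC numa node
ratio=$(awk "BEGIN { printf \"%.2f\", $intel / $epyc }")
echo "ratio: $ratio"
# Scale the measured EPYC resync speed by the ratio to predict Intel.
awk "BEGIN { printf \"predicted: %.0f K/sec\n\", $ratio * 34671 }"
```

Running it prints a ratio of 2.20 and a predicted speed of about 76276 K/sec, in the same ballpark as the measured 87533 K/sec.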

How the allocation for memory works depends a lot on how much ram you
actually have per numa node and how much for the whole machine.  But
any single block for any single device should be on a single numa node
almost all of the time.

You might want to drop the cache before the test, run numactl
--hardware to see how much memory is free per numa node, then rerun
the test, and at the end of the test (before the stop) run numactl
--hardware again to see how it was spread across the numa nodes.  Even
if it spreads across multiple numa nodes, in the epyc case that may
well mean several numa nodes where the main raid processes are running
against remote numa nodes, while because intel only has 2 nodes there
is a decent chance that it is only running on 1 most of the time (so
no remote memory).  I have also seen in benchmarks I have run on 2P
and 4P intel machines that interleaving a single-thread job on a 2P is
faster than running on a single numa node's memory (with the process
pinned to a single cpu on one of the numa nodes, memory interleaved
over both), but on a 4P/4-numa-node machine interleaving slows it down
significantly.  And in the default case any single write/read of a
block is likely on only a single numa node, so that specific
read/write is constrained by a single numa node's bandwidth, giving an
advantage to fewer, faster, bigger numa nodes and less remote memory.
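That measurement workflow might look roughly like this (a sketch; the cache drop needs root, and numactl is assumed to be installed):

```shell
# Drop the page cache before the test so earlier runs don't skew placement.
sync
echo 3 > /proc/sys/vm/drop_caches

numactl --hardware | grep 'free:'    # free memory per numa node, before

# ... create the tmpfs files, build the array, let the resync run ...

numactl --hardware | grep 'free:'    # after: the nodes whose free memory
                                     # dropped are where the pages landed
```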

Outside of rebooting and forcing the entire machine to interleave I am
not sure how to get shm to interleave.   It might be a good enough
test to just force the epyc to interleave and see if the benchmark
result changes in any way.  If the result does change, repeat on the
intel.  Overall, for the most part the raid would not be able to use
very many cpus anyway, so a bigger machine with more 
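One way to force the interleave test without rebooting might be to allocate the tmpfs pages under an interleave policy: tmpfs pages get their node placement on first touch, so writing the backing files under numactl should spread them. A sketch (file names follow the commands quoted above; writing with dd instead of a sparse truncate is what actually touches the pages):

```shell
# Allocate the tmpfs backing files interleaved across all numa nodes.
for i in $(seq 1 16); do
    numactl --interleave=all \
        dd if=/dev/zero of=/dev/shm/vdisk$i.img bs=1M count=10240 status=none
done
# then set up the loop devices and mdadm --create as before

# Alternatively, tmpfs has a mount option for an interleave policy:
#   mount -o remount,mpol=interleave /dev/shm
```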

Re: Consistent block device references for root= cmdline

2020-06-10 Thread Roger Heflin
No idea if this would still work, but back before label/uuid and lvm
in the initrd I had a statically linked C program that ran inside the
initrd. It searched the likely places a boot device could be (mounted
them and looked for a file to confirm it was the right device, then
unmounted it), and when it found the right one it echoed its
major/minor numbers into /proc/sys/kernel/real-root-dev, and that was
used for root= without it being on the command line.  Assuming you
could get something similar started by systemd and/or udev inside the
initrd it might still work.
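That probe-and-set loop might look roughly like this in shell (a sketch under assumptions: the device globs and the marker file name are made up, and old kernels took real-root-dev as the decimal value major*256+minor):

```shell
# Probe likely boot devices, confirm by marker file, then hand the
# major/minor to the kernel via the old real-root-dev mechanism.
for dev in /dev/sd?[0-9] /dev/mmcblk?p?; do
    [ -b "$dev" ] || continue
    mount -o ro "$dev" /mnt 2>/dev/null || continue
    if [ -e /mnt/etc/my-root-marker ]; then    # marker file is an assumption
        maj=$((0x$(stat -c %t "$dev")))        # major number (stat gives hex)
        min=$((0x$(stat -c %T "$dev")))        # minor number
        umount /mnt
        echo $((maj * 256 + min)) > /proc/sys/kernel/real-root-dev
        break
    fi
    umount /mnt
done
```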

On Wed, Jun 10, 2020 at 11:51 AM Ulf Hansson  wrote:
>
> On Wed, 10 Jun 2020 at 15:15, Matthias Schiffer
>  wrote:
> >
> > Hello all,
> >
> > there have been numerous attempts to make the numbering of mmcblk
> > devices consistent, mostly by using aliases from the DTS ([1], [2],
> > [3]), but all have been (rightfully) rejected. Unless I have overlooked
> > a more recent development, no attempts for a different solution were
> > made.
>
> Regarding the aliases attempts, I think those have failed mainly
> for two reasons.
>
> 1. Arguments stating that LABELs/UUIDs are viable alternatives. This
> isn't the case, which I think was also concluded from the several
> earlier discussions.
> 2. Patches that tried adding support for mmc aliases were not
> correctly coded. More precisely, what needs to be addressed is that
> the mmc core also preserves the same ids for the host class as for
> the block device: mmc[n] must correspond to mmcblk[n].
>
> >
> > As far as I can tell, the core of the issue seems to be the following:
> >
> > The existing solutions like LABELs and UUIDs are viable alternatives in
> > many cases, but in particular on embedded systems, this is not quite
> > sufficient: In addition to the problem that more knowledge about the
> > system to boot is required in the bootloader, this approach fails
> > completely when the same firmware image exists on multiple devices, for
> > example on an eMMC and an SD card - not an entirely uncommon situation
> > during the development of embedded systems.
> >
> > With udev, I can refer to a specific partition using a path like
> > /dev/disk/by-path/platform-2194000.usdhc-part2. In [4] it was proposed
> > to add a way to refer to a device path/phandle from the kernel command
> > line. Has there been any progress on this proposal?
>
> Lots of time during the years I have been approached, both publicly
> and offlist, about whether it would be possible to add support for
> "consistent" mmcblk devices. To me, I am fine with the aliases
> approach, as long as it gets implemented correctly.
>
> >
> > Kind regards,
> > Matthias
> >
> >
> > [1] https://patchwork.kernel.org/patch/8685711/
> > [2] https://lore.kernel.org/patchwork/cover/674381/
> > [3] https://www.spinics.net/lists/linux-mmc/msg26586.html
> > [4] https://www.spinics.net/lists/linux-mmc/msg26708.html
> >
>
> Kind regards
> Uffe


Re: Question on MSI support in PCI and PCI-E devices

2015-03-04 Thread Roger Heflin
We verified that the exact same device worked with the previous cpu in
the same mb/bios combination and the same os/kernel combination; the
only identified change for us was an Ivy Bridge vs a Sandy Bridge in
the same mb/bios/board-firmware.

And in this case only one device driver/pci board was using the given
interrupt.   The hardware vendor for the given pci board debugged a
firmware dump to determine what state the firmware was in, and it was
waiting for an INTx that never came.   Switching to MSI has
resulted in things working reliably.
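For reference, whether a given board ended up on MSI or legacy INTx can be checked from userspace (the PCI address 01:00.0 is only an example):

```shell
# "MSI: Enable+" in the capability listing means MSI is active for the device.
lspci -vv -s 01:00.0 | grep -i 'MSI:'

# MSI vectors show up as PCI-MSI entries in /proc/interrupts,
# legacy pins as IO-APIC entries.
grep -iE 'msi|io-apic' /proc/interrupts
```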

On Wed, Mar 4, 2015 at 11:04 AM, McKay, Luke  wrote:
> Legacy INTx is shared amongst multiple devices.  Since it is a level 
> sensitive simulation of the interrupt line, it only takes one device (or 
> driver) to forget to clear the interrupt, and then it's stuck and won't work 
> for any of the devices using it.
>
> If you're working with one particular device that seems to be causing these 
> sorts of problems then you can verify misbehaving hardware with a PCIe 
> analyzer.  With the analyzer you can verify that when the driver informs the 
> device that it has processed the interrupt that the device sends the 
> deassertion message for the INTx line.
>
> Or if that isn't available, simply verifying that interrupt being cleared by 
> the driver on the end device is taken correctly and then verifying the chain 
> of propagation that clears the interrupt status.  It can be verified through 
> any switch that is in the path, to the root port where the legacy PCI 
> interrupt controller that the interrupt is cleared, to the top level 
> interrupt controller.
>
> Regards,
> Luke
>
> --
> Luke McKay
> Senior Engineer
> Cobham AvComm
> T : +1 (316) 529 5585
>
> Please consider the environment before printing this email
>
>
>
> -Original Message-
> From: Roger Heflin [mailto:rogerhef...@gmail.com]
> Sent: Wednesday, March 04, 2015 10:31 AM
> To: McKay, Luke
> Cc: Andrey Utkin; Andrey Utkin; Stephen Hemminger; 
> kernel-ment...@selenic.com; linux-kernel@vger.kernel.org; kernelnewbies
> Subject: Re: Question on MSI support in PCI and PCI-E devices
>
> I know from some data I have seen that between the Intel Sandy Bridge and 
> Intel Ivy Bridge the same motherboards stopped delivering INTx reliably (int 
> lost under load around 1x every 30 days; driver and
> firmware have no method to recover from the failure).   We had to transition
> to using MSI on some PCI cards that had this issue. Our issue was duplicated 
> on a large number of different physical machines, so if it was a hardware 
> error it was a lot of different physical machines that had the defect.
>
> On Wed, Mar 4, 2015 at 10:03 AM, McKay, Luke  wrote:
>> I don't personally know of any PCI drivers that use polling instead of 
>> interrupts, since that would really mean the hardware is broken.
>>
>> Basically all you need to do is create a timer, and have its callback set 
>> to your driver routine that can check the device status registers to 
>> determine if there is work to be done.  The status register(s) would be the 
>> same indicators that should have generated an interrupt.
>>
>> Regards,
>> Luke
>>
>>
>> --
>> Luke McKay
>> Senior Engineer
>> Cobham AvComm
>> T : +1 (316) 529 5585
>>
>> Please consider the environment before printing this email
>>
>>
>>
>> -Original Message-
>> From: Andrey Utkin [mailto:andrey.ut...@corp.bluecherry.net]
>> Sent: Tuesday, March 03, 2015 8:29 AM
>> To: McKay, Luke
>> Cc: Andrey Utkin; Stephen Hemminger; kernel-ment...@selenic.com;
>> linux-kernel@vger.kernel.org; kernelnewbies
>> Subject: Re: Question on MSI support in PCI and PCI-E devices
>>
>> On Mon, Mar 2, 2015 at 4:02 PM, McKay, Luke  wrote:
>>> It doesn't appear that your device supports MSI.  If it did lspci -v should 
>>> list the MSI capability and whether or not it is enabled.
>>>
>>> i.e. Something like...
>>> Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>>
>>> Without a listing that shows the capability is present, there is nothing to 
>>> enable.
>>>
>>> Have you tried polling instead of using interrupts?  Definitely not ideal, 
>>> but it might help you to determine whether hardware is dropping/missing an 
>>> interrupt or whether the hardware is being completely hung up.
>>>
>>> Do you know if this missing interrupt is occurring in other systems as 
>>> well?  How about whether it happens with different boards in the same 
>>> system?  Answers to these questions would help to determine whether you 
>>> might have a defective 

Re: Question on MSI support in PCI and PCI-E devices

2015-03-04 Thread Roger Heflin
I know from some data I have seen that between the Intel Sandy Bridge
and Intel Ivy Bridge the same motherboards stopped delivering INTx
reliably (int lost under load around 1x every 30 days; driver and
firmware have no method to recover from the failure).   We had to
transition to using MSI on some PCI cards that had this issue. Our
issue was duplicated on a large number of different physical machines,
so if it was a hardware error it was a lot of different physical
machines that had the defect.

On Wed, Mar 4, 2015 at 10:03 AM, McKay, Luke  wrote:
> I don't personally know of any PCI drivers that use polling instead of 
> interrupts, since that would really mean the hardware is broken.
>
> Basically all you need to do is create a timer, and have its callback set to 
> your driver routine that can check the device status registers to determine 
> if there is work to be done.  The status register(s) would be the same 
> indicators that should have generated an interrupt.
>
> Regards,
> Luke
>
>
> --
> Luke McKay
> Senior Engineer
> Cobham AvComm
> T : +1 (316) 529 5585
>
> Please consider the environment before printing this email
>
>
>
> -Original Message-
> From: Andrey Utkin [mailto:andrey.ut...@corp.bluecherry.net]
> Sent: Tuesday, March 03, 2015 8:29 AM
> To: McKay, Luke
> Cc: Andrey Utkin; Stephen Hemminger; kernel-ment...@selenic.com; 
> linux-kernel@vger.kernel.org; kernelnewbies
> Subject: Re: Question on MSI support in PCI and PCI-E devices
>
> On Mon, Mar 2, 2015 at 4:02 PM, McKay, Luke  wrote:
>> It doesn't appear that your device supports MSI.  If it did lspci -v should 
>> list the MSI capability and whether or not it is enabled.
>>
>> i.e. Something like...
>> Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>
>> Without a listing that shows the capability is present, there is nothing to 
>> enable.
>>
>> Have you tried polling instead of using interrupts?  Definitely not ideal, 
>> but it might help you to determine whether hardware is dropping/missing an 
>> interrupt or whether the hardware is being completely hung up.
>>
>> Do you know if this missing interrupt is occurring in other systems as well? 
>>  How about whether it happens with different boards in the same system?  
>> Answers to these questions would help to determine whether you might have a 
>> defective board, or some sort of incompatibility with the system.
>
> We have just three setups reproducing this. We have no boards for replacement 
> experiments, unfortunately.
> Polling instead of using interrupts sounds interesting. Is there an example 
> of such usage in any other PCI device driver?
>
> --
> Bluecherry developer.
>
>
> Aeroflex is now a Cobham company
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: read performance is too low compared to write - /dev/sda1

2014-11-14 Thread Roger Heflin
What kind of underlying disk is it?

On Fri, Nov 14, 2014 at 7:36 AM, Jagan Teki  wrote:
> On 14 November 2014 18:50, Roger Heflin  wrote:
>> If you are robocopying small files you will hit other limits.
>>
>> The best I have seen with small files is around 30 files/second, and that
>> involves multiple copies going on.   Remember, with small files there
>> are several reads and writes that need to be done to complete the creation
>> of a small file, and each of these takes time.   30 files/second ~ 33ms
>> per file, not that bad considering that on a real spinning disk a
>> single read/write op is 5-10ms, and creating the file entry, copying
>> data and closing the file takes several operations (at least create
>> file entry, write small amount of data, update file entry
>> date/time/info). If the write in the middle is not a significant
>> amount of data, the 2 extra ops are what hurt.
>>
>
> But I tried 4GB and 1GB files; both got similar numbers.
>
>> On Fri, Nov 14, 2014 at 6:55 AM, Jagan Teki  wrote:
>>> Hi,
>>>
>>> I'm doing a performance testing on my bench ARM box.
>>>
>>> 1. dd test: I have validated read and write by mounting /dev/sda1
>>> with an ext4 filesystem, and was
>>> able to get good performance numbers where read is high
>>> compared to write
>>>
>>> 2.  robocopy test:
>>>  - mkfs.ext4 /dev/sda1
>>>  - mount /dev/sda1 /media/disk
>>>  - << configured samba >>
>>>  - Mapped the /media/disk on Windows
>>>  - logged in to the mapped drive in Windows
>>>  - did a robocopy test, where write got 84MBps and read 14MBps
>>>
>>> read performance is too slow compared to write in the robocopy case.
>>> Can anyone help me out with how to debug this further?
>
> thanks!
> --
> Jagan.


Re: read performance is too low compared to write - /dev/sda1

2014-11-14 Thread Roger Heflin
If you are robocopying small files you will hit other limits.

The best I have seen with small files is around 30 files/second, and
that involves multiple copies going on.   Remember, with small files
there are several reads and writes that need to be done to complete
the creation of a small file, and each of these takes time.   30
files/second ~ 33ms per file, not that bad considering that on a real
spinning disk a single read/write op is 5-10ms, and creating the file
entry, copying data and closing the file takes several operations (at
least create file entry, write small amount of data, update file entry
date/time/info). If the write in the middle is not a significant
amount of data, the 2 extra ops are what hurt.
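That per-file figure falls out of the operation count (the seek time and the three-operation count are the assumptions here):

```shell
# Rough small-file cost model from the paragraph above: three dependent
# disk operations (create entry, write data, update entry), each one
# seek-bound at roughly 10 ms on a spinning disk.
ops_per_file=3
seek_ms=10
ms_per_file=$((ops_per_file * seek_ms))
echo "per file: ${ms_per_file} ms"
awk "BEGIN { printf \"files/sec: %.0f\n\", 1000 / $ms_per_file }"
```

With those assumptions the model gives 30 ms per file, i.e. roughly 33 files/second, in line with the ~30 files/second observed.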

On Fri, Nov 14, 2014 at 6:55 AM, Jagan Teki  wrote:
> Hi,
>
> I'm doing a performance testing on my bench ARM box.
>
> 1. dd test: I have validated read and write by mounting /dev/sda1
> with an ext4 filesystem, and was
> able to get good performance numbers where read is high
> compared to write
>
> 2.  robocopy test:
>  - mkfs.ext4 /dev/sda1
>  - mount /dev/sda1 /media/disk
>  - << configured samba >>
>  - Mapped the /media/disk on Windows
>  - logged in to the mapped drive in Windows
>  - did a robocopy test, where write got 84MBps and read 14MBps
>
> read performance is too slow compared to write in the robocopy case.
> Can anyone help me out with how to debug this further?
>
> thanks!
> --
> Jagan.


Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)

2014-08-01 Thread Roger Heflin
Doesn't NFS have an "intr" flag to allow kill -9 to work?   Whenever I
have had that set, it has appeared to work after about 30 seconds or
so.  Without it, kill -9 does not work when the NFS server is
missing.
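For reference, nfs(5) documents "intr"/"nointr" as accepted but ignored
on kernels after 2.6.25; what actually bounds a dead-server hang is
"soft" together with the timeo/retrans settings. A hypothetical fstab
entry (server name and mount point are placeholders) might look like:

```
# /etc/fstab sketch -- "soft" lets RPCs fail after timeo*retrans rather
# than retrying forever; "intr" is still parsed but ignored since 2.6.25
deadserver:/export  /mnt/foo  nfs  soft,timeo=100,retrans=3  0  0
```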



On Fri, Aug 1, 2014 at 8:21 PM, Jeff Layton  wrote:
> On Fri, 1 Aug 2014 07:50:53 +1000
> NeilBrown  wrote:
>
>> On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear  
>> wrote:
>>
>> > -BEGIN PGP SIGNED MESSAGE-
>> > Hash: SHA1
>> >
>> > On 07/31/2014 01:42 PM, NeilBrown wrote:
>> > > On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear  
>> > > wrote:
>> > >
>> > >> So, this has been asked all over the interweb for years and years, but 
>> > >> the best answer I can find is to reboot the system or create a fake NFS 
>> > >> server
>> > >> somewhere with the same IP as the gone-away NFS server.
>> > >>
>> > >> The problem is:
>> > >>
>> > >> I have some mounts to an NFS server that no longer exists 
>> > >> (crashed/powered down).
>> > >>
>> > >> I have some processes stuck trying to write to files open on these 
>> > >> mounts.
>> > >>
>> > >> I want to kill the process and unmount.
>> > >>
>> > >> umount -l will make the mount go a way, sort of.  But process is still 
>> > >> hung. umount -f complains: umount2:  Device or resource busy 
>> > >> umount.nfs: /mnt/foo:
>> > >> device is busy
>> > >>
>> > >> kill -9 does not work on process.
>> > >
>> > > Kill -1 should work (since about 2.6.25 or so).
>> >
>> > That is -[ONE], right?  Assuming so, it did not work for me.
>>
>> No, it was "-9"  sorry, I really shouldn't be let out without my proof
>> reader.
>>
>> However the 'stack' is sufficient to see what is going on.
>>
>> The problem is that it is blocked inside the "VM" well away from NFS and
>> there is no way for NFS to say "give up and go home".
>>
>> I'd suggest that is a bug.   I cannot see any justification for fsync to not
>> be killable.
>> It wouldn't be too hard to create a patch to make it so.
>> It would be a little harder to examine all call paths and create a
>> convincing case that the patch was safe.
>> It might be herculean task to convince others that it was the right thing
>> to do so let's start with that one.
>>
>> Hi Linux-mm and fs-devel people.  What do people think of making "fsync" and
>> variants "KILLABLE" ??
>>
>> I probably only need a little bit of encouragement to write a patch
>>
>> Thanks,
>> NeilBrown
>>
>
>
> It would be good to fix this in some fashion once and for all, and the
> wait_on_page_writeback wait is a major source of pain for a lot of
> people.
>
> So to summarize...
>
> The problem in a nutshell is that Ben has some cached writes to the
> NFS server, but the server has gone away (presumably forever). The
> question is -- how do we communicate to the kernel that that server
> isn't coming back and that those dirty pages should be invalidated so
> that we can umount the filesystem?
>
> Allowing fsync/close to be killable sounds reasonable to me as at least
> a partial solution. Both close(2) and fsync(2) are allowed to return
> EINTR according to the POSIX spec. Allowing a kill -9 there seems
> like it should be fine, and maybe we ought to even consider letting it
> be susceptible to lesser signals.
>
> That still leaves some open questions though...
>
> Is that enough to fix it? You'd still have the dirty pages lingering
> around, right? Would a umount -f presumably work at that point?
>
>> >
>> > Kernel is 3.14.4+, with some of extra patches, but probably nothing that
>> > influences this particular behaviour.
>> >
>> > [root@lf1005-14010010 ~]# cat /proc/3805/stack
>> > [] sleep_on_page+0x9/0xd
>> > [] wait_on_page_bit+0x71/0x78
>> > [] filemap_fdatawait_range+0xa2/0x16d
>> > [] filemap_write_and_wait_range+0x3b/0x77
>> > [] nfs_file_fsync+0x37/0x83 [nfs]
>> > [] vfs_fsync_range+0x19/0x1b
>> > [] vfs_fsync+0x17/0x19
>> > [] nfs_file_flush+0x6b/0x6f [nfs]
>> > [] filp_close+0x3f/0x71
>> > [] __close_fd+0x80/0x98
>> > [] SyS_close+0x1c/0x3e
>> > [] system_call_fastpath+0x16/0x1b
>> > [] 0x
>> > [root@lf1005-14010010 ~]# kill -1 3805
>> > [root@lf1005-14010010 ~]# cat /proc/3805/stack
>> > [] sleep_on_page+0x9/0xd
>> > [] wait_on_page_bit+0x71/0x78
>> > [] filemap_fdatawait_range+0xa2/0x16d
>> > [] filemap_write_and_wait_range+0x3b/0x77
>> > [] nfs_file_fsync+0x37/0x83 [nfs]
>> > [] vfs_fsync_range+0x19/0x1b
>> > [] vfs_fsync+0x17/0x19
>> > [] nfs_file_flush+0x6b/0x6f [nfs]
>> > [] filp_close+0x3f/0x71
>> > [] __close_fd+0x80/0x98
>> > [] SyS_close+0x1c/0x3e
>> > [] system_call_fastpath+0x16/0x1b
>> > [] 0x
>> >
>> > Thanks,
>> > Ben
>> >
>> > > If it doesn't please report the kernel version and cat /proc/$PID/stack
>> > >
>> > > for some processes that cannot be killed.
>> > >
>> > > NeilBrown
>> > >
>> > >>
>> > >>
>> > >> Aside from bringing a fake NFS server back up on the same IP, is there 
>> > >> any other way to get these mounts unmounted and the processes killed 
>> > >> without
>> > >> rebooting?

Re: Bad DMA from Marvell 9230

2014-05-30 Thread Roger Heflin
Pretty much any SMART commands.  I was running something that got all
of the SMART stats once per hour per disk, and this made it crash about
once per week; pushing the disks hard at the same time appeared to make
a crash under the SMART commands even more likely.  Removing the
commands took things up to 2-3 months between crashes.

I suspect that if you ran a simple smartctl --all /dev/sdX a few times
a minute on a controller with the issue, it would almost certainly
crash in less than a day.  I did not figure out myself that the SMART
commands were crashing it; someone else's post indicated that they had
determined that, so I found what I had issuing SMART commands, removed
it, and things got much better.
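If SMART polling is wanted anyway, keeping it infrequent is the
workaround described above; a crontab sketch (the device name and log
path are placeholders):

```
# crontab fragment: poll each disk's SMART data once per hour, not
# every few seconds; smartctl comes from the smartmontools package
0 * * * *  /usr/sbin/smartctl --all /dev/sda >> /var/log/smart_sda.log 2>&1
```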

As for finding good vendors, I know others on the md-raid list have
given up on cheap controllers and found decent but more expensive ones.

I would expect LSI and Adaptec to care enough about their names to make
a decent-quality product.  There appears to be a 4-port (one 8087 port,
JBOD/non-RAID) Adaptec card, possibly based on some Marvell variant,
for about $130 US on Newegg; given that it is Adaptec, they may have
made the Marvell part actually work.  There are a number of 8-port
non-RAID cards at around $250-$300 that would probably work great if
you wanted to pay that much; these cards have two 8087 ports and need
8087-to-4xSATA breakout cables.  Given how nice it is to have a machine
that mostly just works without messing around with it, I would probably
pay the extra for stability.

The last time I looked at the 2-port/PCIe-x1 cards I found enough
indications of instability to expect that I would have to put several
hours (or more) of testing/crashing/RMA pain into figuring out which
ones worked.  I went so far as to cross out any motherboards with
non-AMD/non-Intel SATA ports, as I have been burned before by large
motherboard vendors doing a bad job of integrating others' (possibly
bad) SATA ports.  It is a sad state, but it has been this way for a
long time.


Re: Bad DMA from Marvell 9230

2014-05-30 Thread Roger Heflin
I had a 9230.  On older kernels it worked "ok" so long as you did not
issue any SMART commands; I removed it and went to something that
works.  Marvell appears to be hit and miss, with some cards/chips
working right and some not.

Issue enough SMART commands and the entire board (all 4 ports) locked
up and required a reboot.  I quit issuing SMART commands and stability
went way up, but it was still not 100% stable.

Supplier support "claimed" it to be a Linux AHCI bug, as they "claim"
that their board correctly supports AHCI, even though all other AHCI
boards work right in this exact same use case in the exact same
machine.

On Fri, May 30, 2014 at 8:58 AM, Jérôme Carretero  wrote:
> On Fri, 30 May 2014 20:37:58 +1000
> Benjamin Herrenschmidt  wrote:
>
>> We've switched to a 9235 instead which seems to work fine.
>
> Weird (I hadn't seen that you reported the 9235 working...), I have
> IOMMU problems with a 9235...
>
> What system are you running it on (when you say "power box", is it a
> beefy x86 computer or literally a PowerPC)?
> For me, AMD 990FX chipset, latest master linux.
> My board works fine* on another non-IOMMU system.
>
> --
> Jérôme
>
> * with issues with port multipliers
>
> Link to Benjamin's first message: https://lkml.org/lkml/2014/3/27/43




Re: NFS V4 calls for a NFS v3 mount

2014-04-06 Thread Roger Heflin
Nowhere in the mount command did you tell it that this was an NFS
version 3 only mount.  The mount point's name itself means nothing to
mount, so it tried NFS version 4 first and then NFS version 3.

Note this in the man page for nfs:

    nfsvers=n  The NFS protocol version number used to contact the
               server's NFS service.  If the server does not support
               the requested version, the mount request fails.  If
               this option is not specified, the client negotiates a
               suitable version with the server, trying version 4
               first, version 3 second, and version 2 last.
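A sketch of how to pin the version, reusing the server and mount point
from the quoted fstab below; nfsvers=3 prevents the v4-first
negotiation described above:

```
# /etc/fstab -- force v3 so mount never tries v4 first (sketch)
n22:/mnt/ramdisk  /mnt/nfsv3  nfs  nfsvers=3,auto,bg,soft  0  0
```

The same option works on the command line:
mount -t nfs -o nfsvers=3 n22:/mnt/ramdisk /mnt/nfsv3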

On Sun, Apr 6, 2014 at 12:27 PM, Toralf Förster  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> Probably a question better suited for a NFS noobs mailing list (is there any 
> around ?) ...
>
> While playing with kernel 3.13.x, wireshark and NFS I realized, that mounting 
> a NFS v3 share results in NFS V4 Calls - is this indented or a wireshark 
> dissector issue ?
>
>
> $ cat /etc/exports
> # /etc/exports: NFS file systems being exported.  See exports(5).
> /mnt/ramdisk
> 192.168.0.0/16(rw,fsid=0,insecure,no_subtree_check,async,no_root_squash)
>
>
> $ grep nfsv3 /etc/fstab
> n22:/mnt/ramdisk/mnt/nfsv3  nfs auto,bg,intr,soft
>
>
> $ ~/devel/wireshark/tshark -r /var/tmp/nfsv3.pcapng.gz
>   1 0.0127.0.0.1 -> 127.0.0.1DNS 73 Standard query 0x50bd  A 
> n22.fritz.box
>   2 0.000465000127.0.0.1 -> 127.0.0.1DNS 73 Standard query 0xa14f  
>  n22.fritz.box
>   3 0.006264000127.0.0.1 -> 127.0.0.1DNS 119 Standard query response 
> 0x50bd  A 192.168.178.21
>   4 0.007134000127.0.0.1 -> 127.0.0.1DNS 115 Standard query response 
> 0xa14f
>   5 0.017775000 192.168.178.21 -> 192.168.178.21 TCP 74 733 → nfs [SYN] Seq=0 
> Win=43690 Len=0
>   6 0.017791000 192.168.178.21 -> 192.168.178.21 TCP 74 nfs → 733 [SYN, ACK] 
> Seq=0 Ack=1 Win=65535 Len=0
>   7 0.017808000 192.168.178.21 -> 192.168.178.21 TCP 66 733 → nfs [ACK] Seq=1 
> Ack=1 Win=342 Len=0
>   8 0.022889000 192.168.178.21 -> 192.168.178.21 NFS 110 V4 NULL Call
>   9 0.022953000 192.168.178.21 -> 192.168.178.21 TCP 66 nfs → 733 [ACK] Seq=1 
> Ack=45 Win=1024 Len=0
>  10 0.023021000 192.168.178.21 -> 192.168.178.21 NFS 94 V4 NULL Reply (Call 
> In 8)
>  11 0.023049000 192.168.178.21 -> 192.168.178.21 TCP 66 733 → nfs [ACK] 
> Seq=45 Ack=29 Win=342 Len=0
>  12 0.030779000 192.168.178.21 -> 192.168.178.21 NFS 254 V4 Call SETCLIENTID
>  13 0.030892000 192.168.178.21 -> 192.168.178.21 NFS 130 V4 Reply (Call In 
> 12) SETCLIENTID
>  14 0.031247000 192.168.178.21 -> 192.168.178.21 NFS 166 V4 Call 
> SETCLIENTID_CONFIRM
>  15 0.031433000 192.168.178.21 -> 192.168.178.21 NFS 114 V4 Reply (Call In 
> 14) SETCLIENTID_CONFIRM
>  16 0.031455000 192.168.178.21 -> 192.168.178.21 TCP 74 945 → 48964 [SYN] 
> Seq=0 Win=43690 Len=0
>  17 0.031469000 192.168.178.21 -> 192.168.178.21 TCP 74 48964 → 945 [SYN, 
> ACK] Seq=0 Ack=1 Win=32768 Len=0
>  18 0.031482000 192.168.178.21 -> 192.168.178.21 TCP 66 945 → 48964 [ACK] 
> Seq=1 Ack=1 Win=342 Len=0
>  19 0.031506000 192.168.178.21 -> 192.168.178.21 NFS 134 V1 CB_NULL Call
>  20 0.031514000 192.168.178.21 -> 192.168.178.21 TCP 66 48964 → 945 [ACK] 
> Seq=1 Ack=69 Win=256 Len=0
>  21 0.031527000 192.168.178.21 -> 192.168.178.21 NFS 94 V1 CB_NULL Reply 
> (Call In 19)
>  22 0.031538000 192.168.178.21 -> 192.168.178.21 TCP 66 945 → 48964 [ACK] 
> Seq=69 Ack=29 Win=342 Len=0
>  23 0.060368000 192.168.178.21 -> 192.168.178.21 NFS 222 V4 Call PUTROOTFH | 
> GETATTR
>  24 0.060433000 192.168.178.21 -> 192.168.178.21 NFS 278 V4 Reply (Call In 
> 23) PUTROOTFH | GETATTR
>  25 0.06050 192.168.178.21 -> 192.168.178.21 NFS 226 V4 Call GETATTR FH: 
> 0x62d40c52
>  26 0.06055 192.168.178.21 -> 192.168.178.21 NFS 162 V4 Reply (Call In 
> 25) GETATTR
>  27 0.06059 192.168.178.21 -> 192.168.178.21 NFS 230 V4 Call GETATTR FH: 
> 0x62d40c52
>  28 0.060632000 192.168.178.21 -> 192.168.178.21 NFS 178 V4 Reply (Call In 
> 27) GETATTR
>  29 0.060674000 192.168.178.21 -> 192.168.178.21 NFS 226 V4 Call GETATTR FH: 
> 0x62d40c52
>  30 0.060714000 192.168.178.21 -> 192.168.178.21 NFS 162 V4 Reply (Call In 
> 29) GETATTR
>  31 0.060787000 192.168.178.21 -> 192.168.178.21 NFS 230 V4 Call GETATTR FH: 
> 0x62d40c52
>  32 0.060815000 192.168.178.21 -> 192.168.178.21 NFS 178 V4 Reply (Call In 
> 31) GETATTR
>  33 0.060857000 192.168.178.21 -> 192.168.178.21 NFS 226 V4 Call GETATTR FH: 
> 0x62d40c52
>  34 0.060885000 192.168.178.21 -> 192.168.178.21 NFS 142 V4 Reply (Call In 
> 33) GETATTR
>  35 0.061002000 192.168.178.21 -> 192.168.178.21 NFS 226 V4 Call GETATTR FH: 
> 0x62d40c52
>  36 0.061032000 192.168.178.21 -> 192.168.178.21 NFS 162 V4 Reply (Call In 
> 35) GETATTR
>  37 0.061074000 192.168.178.21 -> 192.168.178.21 NFS 230 V4 Call GETATTR FH: 
> 0x62d40c52
>  38 0.061101000 192.168.178.21 -> 192.168.178.21 NFS 258 V4 Reply (Call In 
> 37) GETATTR
>  39 0.061186000 192.168.178.21 -> 192.168.178.21 


Re: possible viri in tarballs?

2014-02-05 Thread Roger Heflin
Gene,

How big is the file you have?  Here is what I have, and this is
from several different kernels.

 wc gadget_multi.txt
 150  830 5482 gadget_multi.txt

cksum gadget_multi.txt
3973522114 5482 gadget_multi.txt

ls -l gadget_multi.txt
-rw-rw-r-- 1 root root 5482 Dec 20 09:51 gadget_multi.txt

If your size/cksum is different, something modified your file.
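The same check can be scripted (a sketch; the file and path here are
hypothetical stand-ins, not the real gadget_multi.txt):

```shell
# Record a baseline cksum line, recompute later, and compare.
printf 'hello\n' > /tmp/gadget_check.txt
baseline=$(cksum /tmp/gadget_check.txt)   # record once, keep somewhere safe
current=$(cksum /tmp/gadget_check.txt)    # recompute at verification time
if [ "$current" = "$baseline" ]; then
    result=unchanged
else
    result=MODIFIED
fi
echo "$result"
```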


On Wed, Feb 5, 2014 at 1:52 PM, Theodore Ts'o  wrote:
> On Wed, Feb 05, 2014 at 01:24:59PM -0500, Gene Heskett wrote:
>> >/home/gene/src/linux-3.2.40/Documentation/usb/gadget_multi.txt:
>> >MBL_400944.UNOFFICIAL FOUND
>>
>> You will see more history.
>>
>> So that file needs sanitized.  I was under the impression that a file with
>> the .txt extension was supposed to be pure ascii text, but its loaded to
>> the gills with some sort of markup crap.  And I have at least 20 copies of
>> it.
>
> Huh?   There are lines with
>
> * Overview
>
> ...
>
> ** Linux host drivers
>
> ...
>
> in that file, sure.  But I'd hardly call that "loaded to the gills
> with markup crap".
>
> If the file was had any amount of XML or XHTML2, that would be markup
> crap.  But some Twiki style ascii markup is hardly a problem -- it
> looks just fine when viewed in a text reader.
>
>  - Ted




Re: Probably silly Q about bootable partitions

2013-12-31 Thread Roger Heflin
Rescue boot it, change the /boot mount line in /etc/fstab to add
noauto (like noauto,defaults, or whatever else you already have) and
change the last column to 0 to disable fsck on it.

It should boot then, and with the machine fully up you can do
better debugging.

i.e. "mount /boot" may give you a useful error, or it may work; if it
works, that implies that some piece needed to start /boot (the
filesystem driver, maybe?) is missing from the initrd.

/boot is only needed at boot time (by the boot loader); it is not needed
when the OS is up except to install a new kernel.
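The suggested fstab tweak can be sketched against a scratch copy; the label and field values below are illustrative (borrowed from Gene's blkid output), not a command from the original mail:

```shell
# Work on a scratch copy of fstab so the sketch is safe to run anywhere.
cp_fstab=/tmp/fstab.demo
cat > "$cp_fstab" <<'EOF'
LABEL=ububoot  /boot  ext3  defaults  1 2
EOF
# Add noauto to the /boot line's options and set the fsck pass
# (the last field) to 0 so boot no longer waits on that filesystem.
sed -i -e '/[[:space:]]\/boot[[:space:]]/ s/defaults/noauto,defaults/' \
       -e '/[[:space:]]\/boot[[:space:]]/ s/[0-9][[:space:]]*$/0/' "$cp_fstab"
cat "$cp_fstab"
```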

On Tue, Dec 31, 2013 at 12:05 PM, Gene Heskett  wrote:
> Greetings;
>
> I can't build a bootable 3.12.6 kernel, it seems to die quite fast with a
> trace blaming binfmt-some-hex-number.  Or fail well into the boot waiting
> for / to come available.  But if I choose a shell at that failure, it
> isn't / that is not shown in a blkid report, it is /boot, named "ububoot"
> thats missing. "/", named uburoot, is fine.
>
> Here is blkid output booted to 3.12.0.
>
> gene@coyote:~/src/linux-3.12.6$ blkid
> /dev/sdc1: UUID="1321fc90-ba7a-4742-8176-f7b3a8284be5" TYPE="ext4"
> /dev/sdc2: LABEL="amandatapes-1-T" 
> UUID="b7657920-d9a2-4379-ae21-08a0651b65cc" SEC_TYPE="ext2" TYPE="ext3"
> /dev/sda1: LABEL="ububoot" UUID="f54ba7af-1545-43f3-a86e-bfc0017b4526" 
> SEC_TYPE="ext2" TYPE="ext3"
> /dev/sda2: LABEL="uburoot" UUID="ec677e9c-6be6-4311-b97b-3889d42ce6ef" 
> TYPE="ext4"
> /dev/sda3: UUID="edc2880e-257d-4521-8220-0df5b57dcae4" TYPE="swap"
> /dev/sdb1: UUID="80ab0463-d6fc-4f5b-af08-5aa43d55fdf8" SEC_TYPE="ext2" 
> TYPE="ext3"
> /dev/sdb5: UUID="b4841721-a040-48bc-80dc-e742164ad38a" TYPE="swap"
> /dev/sdd1: LABEL="home2" UUID="7601432d-7a30-42a3-80b5-57f08ae71f2a" 
> TYPE="ext4"
> /dev/sdd2: LABEL="opt2" UUID="748b01e1-ae7b-4b17-b8e9-c88429bcefbf" 
> TYPE="ext4"
>
> Duplicating this 3.12.0's settings under "filesystems" for 3.12.6 is
> apparently not the needed fix.
>
> Clues for the apparently clueless?
>
> Thanks all.
>
>
> Cheers, Gene
> --
> "There are four boxes to be used in defense of liberty:
>  soap, ballot, jury, and ammo. Please use in that order."
> -Ed Howdershelt (Author)
> Genes Web page 
>
> There is no substitute for good manners, except, perhaps, fast reflexes.
> A pen in the hand of this president is far more
> dangerous than 200 million guns in the hands of
>  law-abiding citizens.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Disk schedulers

2008-02-15 Thread Roger Heflin

Lukas Hejtmanek wrote:

On Fri, Feb 15, 2008 at 03:42:58PM +0100, Jan Engelhardt wrote:

Also consider
- DMA (e.g. only UDMA2 selected)
- aging disk


it's not the case.

hdparm reports udma5 is used, if it is reliable with libata.

The disk is 3 months old, kernel does not report any errors. And it has never 
been different.




A new current IDE/SATA disk should do around 60 MB/s; check the
min/max bit rate listed on the disk company's site, divide by 8, and
take maybe 80% of that.

You may also consider using the -l option on the scp command to limit
its total bandwidth usage.

This behavior has been around for at least 8 years (since 2.2): high
levels of writes significantly starve out reads, mainly because
thousands of writes can queue up ahead of a single read; when that read
finishes, there are thousands more writes for the next read to
get in line behind, and this continues until the
writes stop.
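The rule-of-thumb arithmetic above, plus the scp throttle, as a quick sketch (the 600 Mbit/s spec-sheet figure is an assumed example, not a number from the thread):

```shell
# Estimate realistic disk throughput: quoted bit rate / 8, then keep ~80%.
bits_per_sec=600                      # Mbit/s from the vendor spec (assumed)
bytes=$((bits_per_sec / 8))           # raw MB/s
realistic=$((bytes * 80 / 100))       # practical MB/s ceiling
echo "expect roughly ${realistic} MB/s"

# scp itself can be throttled with -l (limit is in Kbit/s), e.g.:
#   scp -l 80000 bigfile user@host:/dest    # cap scp at ~10 MB/s
```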

Roger
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Hangs and reboots under high loads, oops with DEBUG_SHIRQ

2007-07-31 Thread Roger Heflin

Attila Nagy wrote:

On 2007.07.30. 18:19, Alan Cox wrote:

> MCE:
 

[153103.918654] HARDWARE ERROR
[153103.918655] CPU 1: Machine Check Exception:5 Bank 
0: b2401400
[153104.066037] RIP !INEXACT! 10: 
{mwait_idle+0x46/0x60}

[153104.145699] TSC 1167e915e93ce
[153104.183554] This is not a software problem!
[153104.234724] Run through mcelog --ascii to decode and contact your 
hardware vendor



If you it through mcelog as it suggests it wil decode the meaning of the
MCE data and that should give you some idea. Generally speaking MCE
errors are real hardware errors but can certainly be caused by external
factors (power supply glitches, heat etc)
  
Sorry, of course I ran that through mcelog, but inadvertently attached 
the original version.


I've tried the machines with two types of power sources (different 
UPSes, line filtering, etc,
and the chassis have redundant PSes), monitoring the temperatures (seems 
to be OK,
the CPUs don't go over 30 °C even under load). I have the latest BIOS 
for the

motherboard.
But I will recheck everything.

BTW, here's the output from mcelog, I see this occasionally on all four 
machines:


HARDWARE ERROR
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 BANK 0 TSC 1167e915e93ce
MCG status:RIPV MCIP
MCi status:
Uncorrected error
Error enabled
Processor context corrupt
MCA: Internal Timer error
STATUS b2401400 MCGSTATUS 5
This is not a software problem!
Run through mcelog --ascii to decode and contact your hardware vendor

HARDWARE ERROR
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 BANK 5 TSC 1167e915e9ea8
MCG status:RIPV MCIP
MCi status:
Uncorrected error
Error enabled
Processor context corrupt
MCA: Internal Timer error
STATUS b200221024080400 MCGSTATUS 5
This is not a software problem!
Run through mcelog --ascii to decode and contact your hardware vendor

Thanks,

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Attila,

We had some issues with very similar boards; all of the problems
seemed to be around the PCI-X bus area of the machine. Setting the
PCI-X buses to 66 MHz in the BIOS made things stable (but slow).   Not using
the PCI-X bus also seemed to make things work.   We got MCEs and
other odd crashes under heavy IO loads.   I believe turning things
down to 100 MHz made things more stable, but things still crashed.

Supermicro reported being able to fix the issue by
setting PCI Configuration -> PCI-e I/O performance
to Coalesce 128B.

I am not exactly sure where to set it, as we did not try it:
we had already changed to a different motherboard that did not
have the issue.

If this works please tell me.

 Roger





-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Apparent Deadlock with nfsd/jfs on 2.6.21.1 under bonnie.

2007-05-29 Thread Roger Heflin

Dave Kleikamp wrote:

On Tue, 2007-05-29 at 12:16 -0500, Roger Heflin wrote:


Dave,

There appears to be another, different but similar lockup.
The MTBF has risen from 1-2 hours without that patch to >100 hours,
so I am fairly sure the patch did correct the original lockup, or
at the very least made it a lot less likely.

I hit the machine across NFS for 5 days before it deadlocked; before
the patch I could only get an hour or two (2-4 different tries).

Given that pdflush is in the "D" state, it does not appear to be an NFS issue.

Included is the sysrq-t.

This is with 2.6.21.1 + the JFSIO patch.


Is the system still in this state?  Can you cat /proc/fs/jfs/TxAnchor
(if CONFIG_JFS_DEBUG is defined) and /proc/fs/jfs/txstats (if
CONFIG_JFS_STATISTICS is defined)?

Thanks,
Shaggy


Yes, the machine is still in that state.

Apparently I don't have either of those configured.

Anything else that we can collect before I rebuild the kernel with
those options setup?

 Roger
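Before rebuilding, whether those CONFIG options are present can be checked against the kernel config; this is a sketch against a fabricated config fragment (on a real machine the file would typically be /boot/config-$(uname -r) or /proc/config.gz):

```shell
# Fabricated config fragment for the demo; not from the real machine.
cfg=/tmp/config.demo
cat > "$cfg" <<'EOF'
CONFIG_JFS_FS=m
# CONFIG_JFS_DEBUG is not set
# CONFIG_JFS_STATISTICS is not set
EOF
# An option is enabled only when it appears as CONFIG_FOO=y (or =m).
if grep -q '^CONFIG_JFS_DEBUG=y' "$cfg"; then
    echo "JFS debug enabled"
else
    echo "JFS debug not enabled: rebuild needed"
fi
```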

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


BUG: sleeping function called from invalid context at mm/mempool.c:210

2007-05-18 Thread Roger Heflin

I am getting this bug under heavy IO/NFS on 2.6.21.1.

BUG: sleeping function called from invalid context at mm/mempool.c:210

So far I have got the error I believe 3 times.

  Roger
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Apparent Deadlock with nfsd/jfs on 2.6.21.1 under bonnie.

2007-05-17 Thread Roger Heflin

Dave Kleikamp wrote:



I don't have an answer to an ext3 deadlock, but this looks like a jfs
problem that was recently fixed in linux-2.6.22-rc1.  I had intended to
send it to the stable kernel after it was picked up in mainline, but
hadn't gotten to it yet.

The patch is here:
http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=05ec9e26be1f668ccba4ca54d9a4966c6208c611



Dave,

That appears to have fixed the JFS hangup.

MTBF before was about 1 hour; under the same test I am now over 20 hours
and things still appear to be holding together.

Roger
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [NFS] Kernel BUG at mm/slab.c:2380 on NFS server in nfsd (2.6.21.1)

2007-05-16 Thread Roger Heflin

J. Bruce Fields wrote:

On Wed, May 16, 2007 at 08:55:19AM -0500, Roger Heflin wrote:

Running bonnie over nfs on a RHEL4.4 client against a 2.6.21.1 server
got me this crash after about 4 hours of running on the server:

This was running lvm -> ext3 -> nfs  nfsclient (RHEL4.4).


Yipes.  Has this happened only once, or do you have a way to reliably
reproduce it?


I have not reproduced it yet; I will update if I do. I suspect
that I will be able to reproduce it, but it took several hours
of running.



Is it a new problem?  (And, if so, what changed?)


New different tests.




May 15 21:10:31 vault1 kernel: [ cut here ]
May 15 21:10:31 vault1 kernel: kernel BUG at mm/slab.c:2380!


That's the check_spinlock_acquired() in cache_alloc_refill().  What
causes that to fail?


May 15 21:10:31 vault1 kernel: invalid opcode:  [1] SMP
May 15 21:10:31 vault1 kernel: CPU 0
May 15 21:10:31 vault1 kernel: Modules linked in: qla2xxx nfsd exportfs 
lockd nfs_acl sunrpc hidp l2cap bluetooth ipv6 cpufreq_ondemand jfs 
dm_mirror dm_multipath dm_mod video sbs i2c_ec dock button battery 
asus_acpi ac lp snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy 
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device sg snd_pcm_oss 
floppy snd_mixer_oss snd_pcm cfi_cmdset_0002 cfi_util mtdpart snd_timer 
jedec_probe cfi_probe gen_probe snd ck804xrom sata_nv mtdcore chipreg 
i2c_nforce2 soundcore map_funcs libata snd_page_alloc pcspkr i2c_core 
k8temp hwmon forcedeth ohci1394 ieee1394 parport_pc ide_cd parport cdrom 
serio_raw scsi_transport_fc shpchp megaraid_mbox sd_mod scsi_mod 
megaraid_mm ext3 jbd ehci_hcd ohci_hcd uhci_hcd

May 15 21:10:31 vault1 kernel: Pid: 4256, comm: nfsd Not tainted 2.6.21.1 #1
May 15 21:10:31 vault1 kernel: RIP: 0010:[] 
[] cache_alloc_refill+0xe6/0x1f3

May 15 21:10:31 vault1 kernel: RSP: 0018:81021dead6d0  EFLAGS: 00010002
May 15 21:10:31 vault1 kernel: RAX: 0001 RBX: 
81012211d960 RCX: 810120013040
May 15 21:10:31 vault1 kernel: RDX: 000e RSI: 
81013902 RDI: 810120013040


There wasn't a backtrace?

--b.



That was all I had before it took the machine out. It is on a serial
console at this time, so if it happens again I should get better data.

 Roger
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Kernel BUG at mm/slab.c:2380 on NFS server in nfsd (2.6.21.1)

2007-05-16 Thread Roger Heflin

Running bonnie over nfs on a RHEL4.4 client against a 2.6.21.1 server
got me this crash after about 4 hours of running on the server:

This was running lvm -> ext3 -> nfs  nfsclient (RHEL4.4).

Ideas?

  Roger

May 15 21:10:31 vault1 kernel: [ cut here ]
May 15 21:10:31 vault1 kernel: kernel BUG at mm/slab.c:2380!
May 15 21:10:31 vault1 kernel: invalid opcode:  [1] SMP
May 15 21:10:31 vault1 kernel: CPU 0
May 15 21:10:31 vault1 kernel: Modules linked in: qla2xxx nfsd exportfs 
lockd nfs_acl sunrpc hidp l2cap bluetooth ipv6 cpufreq_ondemand jfs 
dm_mirror dm_multipath dm_mod video sbs i2c_ec dock button battery 
asus_acpi ac lp snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy 
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device sg snd_pcm_oss 
floppy snd_mixer_oss snd_pcm cfi_cmdset_0002 cfi_util mtdpart snd_timer 
jedec_probe cfi_probe gen_probe snd ck804xrom sata_nv mtdcore chipreg 
i2c_nforce2 soundcore map_funcs libata snd_page_alloc pcspkr i2c_core 
k8temp hwmon forcedeth ohci1394 ieee1394 parport_pc ide_cd parport cdrom 
serio_raw scsi_transport_fc shpchp megaraid_mbox sd_mod scsi_mod 
megaraid_mm ext3 jbd ehci_hcd ohci_hcd uhci_hcd

May 15 21:10:31 vault1 kernel: Pid: 4256, comm: nfsd Not tainted 2.6.21.1 #1
May 15 21:10:31 vault1 kernel: RIP: 0010:[] 
[] cache_alloc_refill+0xe6/0x1f3

May 15 21:10:31 vault1 kernel: RSP: 0018:81021dead6d0  EFLAGS: 00010002
May 15 21:10:31 vault1 kernel: RAX: 0001 RBX: 
81012211d960 RCX: 810120013040
May 15 21:10:31 vault1 kernel: RDX: 000e RSI: 
81013902 RDI: 810120013040

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Apparent Deadlock with nfsd/jfs on 2.6.21.1 under bonnie.

2007-05-15 Thread Roger Heflin

Dave Kleikamp wrote:

Sorry if I'm missing anyone on the reply, but my mail feed is messed up
and I'm replying from the gmane archive.

On Tue, 15 May 2007 09:08:25 -0500, Roger Heflin wrote:


Hello,

Running 2.6.21.1 (FC6 Dist), with a RHEL client (client
appears to not be having issues) I am getting what I believe
is a deadlock on the server end.This is with JFS and
NFSD, I have not tested yet with a non-JFS filesystem,
though our customer indicated that they have duplicated it with
the ext3 filesystem.


I don't have an answer to an ext3 deadlock, but this looks like a jfs
problem that was recently fixed in linux-2.6.22-rc1.  I had intended to
send it to the stable kernel after it was picked up in mainline, but
hadn't gotten to it yet.

The patch is here:
http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=05ec9e26be1f668ccba4ca54d9a4966c6208c611



Ok.

My customer reported that he thought he had an ext3 hang; so far I have
not been able to duplicate it.

If ext3 survives until tomorrow, I will retest unpatched jfs, and then
patch it and test again.



The basic setup is:
fiber channel array -> qlogic fiber card -> /dev/sdx -> LVM stripe ->
jfs -> nfs.

Running bonnie on an NFS share has apparently produced a deadlock.   I
have run bonnie several times without having any issues, and I don't believe
this is a HW issue; we have a couple of other machines configured with
slightly different HW and are also able to duplicate this problem on
those machines.  There are no abnormal messages in dmesg or in the
messages file.

After having the apparent deadlock I started a dd of a on the deadlocked
filesystem and according to vmstat 1 that was actually working, I then
did a "mkdir junk" on the deadlocked filesystem and that apparently put
the cat into a permanent "D" state.   I will include the sysrq -t from
before the cat/mkdir and after the cat/mkdir.

I believe I can duplicate this again, and other than the processes going
into the "D" state everything else seems to work.   Other filesytems
appear to be functional, I can still login to the machine.

Right now the machine is in the deadlocked state, and I will wait for
any suggestions of more data to collect or other tests to try.


I haven't tried it on a locked-up system, but you may try waking up the
[jfsIO] kernel thread with a signal.  I'm not sure what signals may get
through, since the thread doesn't specifically act on a signal.



I will try on the next lockup.

   Roger
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Apparent Deadlock with nfsd/jfs on 2.6.21.1 under bonnie.

2007-05-15 Thread Roger Heflin

Dave Kleikamp wrote:

Sorry if I'm missing anyone on the reply, but my mail feed is messed up
and I'm replying from the gmane archive.

On Tue, 15 May 2007 09:08:25 -0500, Roger Heflin wrote:


Hello,

Running 2.6.21.1 (FC6 Dist), with a RHEL client (client
appears to not be having issues) I am getting what I believe
is a deadlock on the server end.This is with JFS and
NFSD, I have not tested yet with a non-JFS filesystem,
though our customer indicated that they have duplicated it with
the ext3 filesystem.


I don't have an answer to an ext3 deadlock, but this looks like a jfs
problem that was recently fixed in linux-2.6.22-rc1.  I had intended to
send it to the stable kernel after it was picked up in mainline, but
hadn't gotten to it yet.

The patch is here:
http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=05ec9e26be1f668ccba4ca54d9a4966c6208c611



Ok.

My customer reported that he thought he had an ext3 filesystem; so far I have
not been able to duplicate the ext3 hang.

If ext3 survives until tomorrow, I will retest unpatched jfs, and then
patch it and test again.



The basic setup is:
fiber channel array -> qlogic fiber card -> /dev/sdx -> LVM stripe ->
jfs -> nfs.

Running bonnie on an NFS share has apparently produced a deadlock.  I
have run bonnie several times without having any issues, so I don't believe
this is a HW issue; we have a couple of other machines configured with
slightly different HW and are also able to duplicate this problem on
those machines.  There are no abnormal messages in dmesg or in the
messages file.

After hitting the apparent deadlock I started a dd of a file on the deadlocked
filesystem, and according to "vmstat 1" that was actually working. I then
did a "mkdir junk" on the deadlocked filesystem, and that apparently put
the cat into a permanent "D" state.  I will include the sysrq -t output from
before the cat/mkdir and after the cat/mkdir.

I believe I can duplicate this again, and other than the processes going
into the "D" state everything else seems to work.  Other filesystems
appear to be functional, and I can still log in to the machine.

Right now the machine is in the deadlocked state, and I will wait for
any suggestions of more data to collect or other tests to try.
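The "D"-state processes described above can also be enumerated without sysrq; a minimal sketch reading /proc directly (field 3 of /proc/&lt;pid&gt;/stat is the task state; the parsing is deliberately naive and assumes command names without embedded spaces):

```shell
# Print pid and name of every task in uninterruptible sleep ("D"),
# the state the stuck cat/mkdir processes were reported in
for s in /proc/[0-9]*/stat; do
    # fields: pid (comm) state ...
    read -r pid comm state _ < "$s" 2>/dev/null || continue
    [ "$state" = "D" ] && echo "$pid $comm"
done
true  # an empty result simply means no task is currently stuck
```

Combined with the sysrq -t dump, this quickly shows which processes are piling up behind a filesystem lock.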


I haven't tried it on a locked-up system, but you may try waking up the
[jfsIO] kernel thread with a signal.  I'm not sure what signals may get
through, since the thread doesn't specifically act on a signal.



I will try on the next lockup.

   Roger
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Nvidia MCP55 Machine reboots on ixgb driver load

2007-01-24 Thread Roger Heflin

Auke Kok wrote:

[added netdev to CC]

Roger Heflin wrote:

I have a machine (actually 2 machines) that upon loading
the intel 10GBe driver (ixgb) the machine reboots, I am
using a RHAS4.4 based distribution with Vanilla 2.6.19.2
(the RHAS 4.4.03 kernel also reboots with the ixgb load),
I don't see any messages on the machine before it reboots, and
loading the driver with debug does not appear to produce
any extra messages.  The basic steps are that I load
the driver, the machine locks up, and then in a second
or two it starts to POST.


some suggestions immediately come to mind:

* have you tried the latest driver from http://e1000.sf.net/ ?


I have tried the ixgb-1.0.126 driver from Intel's web site; it does
the same thing, and it looks to be the same as the sf.net driver.



* what happens when you unplug the card and modprobe the ixgb module?


That loads just fine, and prints out the driver information, and
the copyright.



* have you tried capturing a trace with netconsole or serial console? 
probing the ixgb driver produces at least 1 syslog line that should show 
up. If nothing shows up on serial or netconsole, the issue may be way 
outside the ixgb driver.


I *think* I got the line listing the interrupt once or twice just
before it went away. I got the serial crossover working to a
laptop and got no extra kernel messages when the driver was loaded
and the machine rebooted; I did see the full kernel bootup, so
I know the serial console was working correctly.




I have tried the default ixgb driver in 2.6.19.2, and I
have tried the open source intel driver on RH4.4, both cause
the same reboot.I also tried the linux firmware
development kit, and booting fc6 causes the same reboot
upon the network driver load.


just for completeness, which driver versions were these?


The ixgb-1.0.126 driver from Intel's site.

The default driver on 4.4.03 does not support the CX4 board;
it loads just fine but does not find any cards that it
can drive.  I did confirm that it does not list the PCI ID
for the CX4 card.



and when you say "booting fc6" I assume you mean that fc6 boots OK until 
you modprobe the ixgb driver?


Yes, that is correct: the machine goes to full multiuser (I have
the .ko file moved elsewhere so automatic module loading does
not kill the machine until I choose to test it), and it has been
used for some IO tests with no issues until the ixgb driver is loaded.

Both machines have been heavily tested with high cpu applications
that verify their results to make sure memory is working correctly
under load.




I have tried pci=nomsi


try compiling your kernel without MSI support altogether. There have
been serious issues found with MSI on certain configurations, and in
your case this might be the cause. Although passing pci=nomsi should be
the same as compiling it out, it can't hurt to try.


Ok, I tried that.

I found out that breaks some other unrelated stuff, but loading the ixgb
driver still crashes the machine.
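Whether MSI actually got disabled can be verified from userspace; a hedged sketch (the MSI tag in /proc/interrupts is stable, while the lspci capability text varies between pciutils versions):

```shell
# Count interrupt lines registered as MSI; with pci=nomsi (or MSI
# compiled out of the kernel) this should print 0
grep -c 'MSI' /proc/interrupts || true

# Per-device view of the MSI capability state (requires pciutils)
lspci -vv 2>/dev/null | grep -i 'MSI: Enable' | head
```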




Also note that the [EMAIL PROTECTED] address is apparently
unused and returns an email telling you to use a web page,
and so far, after using the web page, I have not gotten any response
indicating that anything got to anyone.


I've taken action to get that straightened out. You're always welcome to 
mail netdev, the e1000 devel list or even lkml which we will all pick up 
on.



I can't think of a specific reason for the issue right now, other than 
attempting to get a serial/netconsole attached and trying without any 
msi support in the kernel. Please give that a try and let us know.




Nothing extra from the serial console, and still locks up with no
msi support.

 Roger


Re: Nvidia MCP55 Machine reboots on ixgb driver load

2007-01-24 Thread Roger Heflin

Jeff V. Merkey wrote:


I have seen something similar with the ixgb.  Make certain there are
**NO** other adapters sharing the PCI bus with the ixgb.  There are
some serious hardware compatibility issues when mixing the ixgb with
other cards on the same PCI-X bus, and I have seen power loading
problems, performance slowdowns, reboots, and other issues when an
ixgb is installed in a system.


It is on its own PCI-X bus, with no other cards in the PCI slots
that are on the same bus; the machine in question has
4 slots divided into 2 buses.

Roger




RE: kernel 2.6.13 buffer strangeness

2005-09-09 Thread Roger Heflin

I saw it mentioned before that the kernel only allows a certain
percentage of total memory to be dirty; I thought the number was
around 40%.  I have seen machines with large amounts of RAM hit
that 40%, put the writing application into disk wait until a
certain amount has been written out, take it out of disk wait, and
repeat when it again hits 40%.  Given your rate difference, that
would be close to 40% in 50 seconds.

And I think that you mean MB(yte) not Mb(it).

   Roger
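The throttling threshold discussed above is visible (and tunable) through procfs; a minimal sketch, assuming a 2.6+ kernel (the ~40% figure matches that era's dirty_ratio default):

```shell
# Percentage of memory that may be dirty before writers are blocked
cat /proc/sys/vm/dirty_ratio

# Background writeback kicks in earlier, at this percentage
cat /proc/sys/vm/dirty_background_ratio

# Current amount of dirty and in-flight writeback data
grep -E '^(Dirty|Writeback):' /proc/meminfo
```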

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Anthony Wesley
> Sent: Friday, September 09, 2005 4:11 AM
> To: linux-kernel@vger.kernel.org
> Subject: Re: kernel 2.6.13 buffer strangeness
> 
> Thanks David, but if you read my original post in full you'll 
> see that I've tried that, and while I can start the write out 
> sooner by lowering /proc/sys/vm/dirty_ratio , it makes no 
> difference to the results that I am getting. I still seem to 
> run out of steam after only 50 seconds where it should take 
> about 3 minutes.
> 
> regards, Anthony
> 
> --
> Anthony Wesley
> Director and IT/Network Consultant
> Smart Networks Pty Ltd
> Acquerra Pty Ltd
> 
> [EMAIL PROTECTED]
> Phone: (02) 62595404 or 0419409836
> 




RE: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Roger Heflin

With the Seagate SATAs I worked with before, I had to
actually remove them from the blacklist; this was a couple
of months ago, with the native SATA Seagate disks.

With the drive in the blacklist, the drive worked right
under light conditions, but under a dd read from the boot
Seagate the entire machine appeared to block on any IO
going to that disk.  It did not stop entirely (verified by
vmstat), but I could never get the expected 55-60 MiB/second
and was getting around 15 MiB/second, with enormous amounts
of interrupts.  After removing it from the blacklist,
I got the 55-60 MiB/second rate, the interrupts were
much more reasonable, and the response of the system
was actually usable.  When the lockup occurred, stopping
the dd resulted in everything unlocking and continuing
on; I duplicated this several times with the latest kernel
at the time.

   Roger 
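The throughput figures above came from plain dd reads; a hedged sketch of the measurement (DEV is a placeholder — point it at the disk under test, e.g. /dev/sda; /dev/zero is only a safe default for dry runs):

```shell
# Sequential read test: report the throughput summary line dd
# prints on stderr when the transfer finishes
DEV=${DEV:-/dev/zero}
dd if="$DEV" of=/dev/null bs=1M count=256 2>&1 | tail -n 1
```

Running `vmstat 1` alongside, as in the report, shows whether IO stalls or the interrupt rate explodes while the read is in flight.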

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Chris Boot
> Sent: Thursday, August 11, 2005 4:55 PM
> To: linux-kernel@vger.kernel.org
> Subject: SiI 3112A + Seagate HDs = still no go?
> 
> Hi all,
> 
> I just recently took the plunge and bought 4 250 GB Seagate 
> drives and a 2 port Silicon Image 3112A controller card for 
> the 2 drives my motherboard doesn't handle. No matter how 
> hard I try, I can't get the hard drives to work: they are 
> detected correctly and work reasonably well under _very_ 
> light load, but anything like building a RAID array is a bit 
> much and the whole controller seems to lock up.
> 
> I've tried adding the drive to the blacklist in the 
> sata_sil.c driver and I still have the same trouble: as you 
> can see the messages below relate to my patched kernel with 
> the blacklist fix. I've seen that this was discussed just 
> yesterday, but that seemed to give nothing:  
> http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html
> 
> Ready and willing to hack my kernel to pieces; this machine 
> is no use until I get all the drives working! Needless to say 
> the drives connected to the on-board VIA controller work 
> fine, as do the drives currently on the SiI controller if I 
> swap them around.
> 
> Any ideas?
> 
> TIA
> Chris
> 
> The following messages are sent to the log when everything goes mad:
> 
> ata1: command 0x35 timeout, stat 0xd8 host_stat 0x0
> ata1: status=0xd8 { Busy }
> SCSI error : <0 0 0 0> return code = 0x8002
> sda: Current: sense key=0xb
> ASC=0x47 ASCQ=0x0
> end_request: I/O error, dev sda, sector 2990370
> ATA: abnormal status 0xD8 on port E0802087
> ATA: abnormal status 0xD8 on port E0802087
> ATA: abnormal status 0xD8 on port E0802087 [ the above is 
> transcribed so may not be 100% accurate ]
> 
> Dmesg log during boot (and detection):
> 
> Aug 11 21:47:05 arcadia Linux version 2.6.12-gentoo-r6
> ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 
> 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #2 Thu 
> Aug 11 20:19:00 BST 2005 ...
> Aug 11 17:30:12 arcadia sata_sil version 0.9 Aug 11 17:30:12 
> arcadia ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, 
> low) -> IRQ 177 Aug 11 17:30:12 arcadia ata1: SATA max 
> UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 
> 177 Aug 11 17:30:12 arcadia ata2: SATA max UDMA/100 cmd 
> 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177 Aug 11 
> 17:30:12 arcadia ata1: dev 0 cfg 49:2f00 82:346b 83:7d01
> 84:4023 85:3469 86:3c01 87:4023 88:207f
> Aug 11 17:30:12 arcadia ata1: dev 0 ATA, max UDMA/133, 488397168
> sectors: lba48
> Aug 11 17:30:12 arcadia ata1(0): applying Seagate errata fix 
> Aug 11 17:30:12 arcadia ata1: dev 0 configured for UDMA/100 
> Aug 11 17:30:12 arcadia scsi0 : sata_sil Aug 11 17:30:12 
> arcadia ata2: dev 0 cfg 49:2f00 82:346b 83:7d01
> 84:4023 85:3469 86:3c01 87:4023 88:207f
> Aug 11 17:30:12 arcadia ata2: dev 0 ATA, max UDMA/133, 488397168
> sectors: lba48
> Aug 11 17:30:12 arcadia ata2(0): applying Seagate errata fix 
> Aug 11 17:30:12 arcadia ata2: dev 0 configured for UDMA/100 
> Aug 11 17:30:12 arcadia scsi1 : sata_sil
> Aug 11 17:30:12 arcadia Vendor: ATA   Model: ST3250823AS
> Rev: 3.03
> Aug 11 17:30:12 arcadia Type:   Direct-Access   
> ANSI SCSI revision: 05
> Aug 11 17:30:12 arcadia Vendor: ATA   Model: ST3250823AS
> Rev: 3.03
> Aug 11 17:30:12 arcadia Type:   Direct-Access   
> ANSI SCSI revision: 05
> 
> lspci:
> 
> :00:00.0 Host bridge: VIA Technologies, Inc. VT8377 
> [KT400/KT600 AGP] Host Bridge :00:01.0 PCI bridge: VIA 
> Technologies, Inc. VT8235 PCI Bridge :00:0a.0 Unknown 
> mass storage controller: Silicon Image, Inc. SiI
> 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02) 
> :00:0c.0 FireWire (IEEE 1394): Agere Systems (former Lucent
> Microelectronics) FW323 (rev 61)
> :00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA 
> VT6420 SATA RAID Controller (rev 80)
> :00:0f.1 IDE interface: VIA Technologies, Inc. 
> 

RE: NCQ support NVidia NForce4 (CK804) SATAII

2005-08-11 Thread Roger Heflin

For high-end stuff, Serverworks is supposed to have some
AMD stuff soon (this is a rumor I heard).

From what Allen said, the implication to me is that something
in the current NVIDIA SATA NCQ chipset is *not* fully under
NVIDIA's control, i.e. they got some piece of technology from
someone else and cannot disclose its details, which would be
why they could release a "clean" redesigned one.

  Roger

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Michael Thonke
> Sent: Thursday, August 11, 2005 10:44 AM
> To: Lee Revell
> Cc: [EMAIL PROTECTED]; Allen Martin; linux mailing-list
> Subject: Re: NCQ support NVidia NForce4 (CK804) SATAII
> 
> Lee Revell schrieb:
> 
> >On Thu, 2005-08-11 at 17:17 +0200, Michael Thonke wrote:
> >  
> >
> >>*frustrated*
> >>
> >>
> >
> >Hey I don't like it any more than you do.  But Nvidia is an 
> IP company 
> >and they act like one.  Most of us would probably do the exact same 
> >thing in their position, AKA whatever the lawyers tell them ;-)
> >
> >Lee
> >
> >
> >  
> >
> Yep, Lee, you are right.
> 
> Well, the lawyers are sometimes like cancer: nobody wants them, 
> nobody needs them, but they are still there. *hard irony* 
> Couldn't we all pull together to make the world better? To 
> make more hardware love Linux? And peace on earth *irony 
> off - taken from Miss Hardware Support election live from Germany*
> 
> But omitting a customer from some decisions could break their neck..
> 
> If we had such a company we might do something like 
> that, but hey.. saying my hardware is Linux-friendly is much 
> cooler :-) And some governments and schools are using AMD and 
> NForce chipsets.. this is a lost market.. are they blind?
> In my old school we had 200 + 100 computers with NForce2.. and 
> they wanted to move to a Linux OS.. but couldn't.. as far as I 
> could tell it got stuck on the state of driver support from NVidia 
> for Linux, so the problem was at the root (hardware). Now they 
> changed to Intel, and what happened? No NVidia anymore - not 
> worth it, NVidia would say, right?
> 
> Greets & Best regards
> Michael
> 
> --
> Michael Thonke
> IT-Systemintegrator /
> System- and Softwareanalyist
> 
> 
> 




RE: MCE problem on dual Opteron

2005-08-04 Thread Roger Heflin

If this does not happen immediately at boot-up (before the machine
finishes all its init stuff), it is generally a hardware problem.  In
my experience with new machines, 75% of the time it will be the CPU
itself, and the other 25% it will be a serious memory error.

The machines I have dealt with are dual-CPU with ECC, and we have
seen and eliminated a large number of these; upon testing they are
duplicatable on a per-machine basis, with large numbers of identical
machines *NOT* having the same issue while all running the same OS.

So in my experience this has always been hardware, except that after
an initial cold power-on (with the kernel I was using at the time) there
seems to be some weird initial state that causes these to happen: we
would get MCEs on around 10% of machines after an improper power cycle
(a 150 KW main breaker blew), and after a couple of reboots the machines 
that had the error would come up and be fine.

Fedora also sets 'quiet' so the messages won't scare any of the common
desktop-type people, so I would not count their use of 'nomce' as being 
important on a server-class machine.  From the MCEs that I have got,
I don't think you would for the most part actually get any (someone
correct me if I am wrong on this) if you did not have ECC RAM, as
all of the ones I have got (and I have dealt with a large sample of
dual-CPU servers- >500) have been related to ECC (except for the 
weird initial power-on one).
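
A machine-check status word can be partially decoded by hand while waiting on vendor diagnostics. A minimal sketch, assuming the standard x86 MCA MCi_STATUS layout (bit 63 valid, 62 overflow, 61 uncorrected, 60 error-reporting enabled, low 16 bits the MCA error code); the 64-bit value below is hypothetical, since the panic in the quoted report prints only a truncated status:

```shell
#!/bin/sh
# Decode a few architectural fields of an x86 MCi_STATUS value.
# Layout assumed: bit 63 VAL, bit 62 OVER, bit 61 UC, bit 60 EN,
# bits 15:0 MCA error code.  The sample value is made up for
# illustration, not taken from the report above.
decode_mci_status() {
    s=$(( $1 ))
    printf 'VAL=%d OVER=%d UC=%d EN=%d MCACOD=0x%04x\n' \
        $(( (s >> 63) & 1 )) $(( (s >> 62) & 1 )) \
        $(( (s >> 61) & 1 )) $(( (s >> 60) & 1 )) $(( s & 0xffff ))
}

decode_mci_status 0xb40000000000083b
```

A valid, uncorrected, enabled error with a nonzero MCA error code is the pattern worth chasing down to a specific CPU or DIMM.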

  Roger



> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Martin Drab
> Sent: Thursday, August 04, 2005 8:55 AM
> To: Linux Kernel Mailing List
> Subject: MCE problem on dual Opteron
> 
> Hi,
> 
> I get the following problem with 2.6.13-rc5-git1 on a dual Opteron
> machine:
> 
> -
> ...
> [   847.745921] CPU 0: Machine Check Exception:   
>  7 Bank 3: b400083b
> [   847.746066] RIP 10:<802c04ee> {pci_conf1_read+0xbe/0x110}
> [   847.746149] TSC 189fe311d3f ADDR fdfc000cfe
> [   847.746218] Kernel panic - not syncing: Uncorrected machine check
> -
> 
> This appears during bootup and it hangs. So my question is: 
> Is this a HW problem or is it some kernel (MCE ?) bug? If it 
> is a HW problem is it possible to determine what's wrong somehow?
> 
> The above mentioned output I get also from 2.6.13-rc4-git4 
> and 2.6.12.3. 
> When I run the original FC4 kernel 2.6.11-1.1369_FC4smp I get 
> the same followed by the following call trace:
> 
> -
> Call Trace: <#MC> <80139195>{panic+133} 
> <80115e1f>{print_mce+159} 
> <80115ed9>{mce_panic+137} 
> <801165b4>{do_machine_check+852}
> <802e8f5e>{pci_conf1_read+190}
> <802e8f5e>{pci_conf1_read+190}
> <8010fe7f>{machine_check+127}
> <801f2c60>{selinux_d_instantiate+0}
> <802e8f5e>{pci_conf1_read+190} <EOE> 
> <80541f97>{pci_direct_init+119} 
> <8010c232>{init+482} <8010f76b>{child_rip+8} 
> <8010c050>{init+0} <8010f763>{child_rip+0}
> 
> 
> Interestingly, FC4 automatically sets the 'nomce' 
> option on the kernel command line by default (which leads me 
> to think that it may actually be a bug in the kernel). And when 
> 'nomce' is used, the system boots and runs quite normally.
> 
> Only recently with 2.6.12.3 (which the box has been running for 
> the past few months), from time to time (so far it has happened 
> 3 times in about a month) the box completely stops responding to 
> the outside world (no network, display turns off (no signal), USB 
> keyboard and mouse both go dead; however, the computer isn't 
> turned off because, for instance, the disks are still normally 
> flashing with the LEDs, but that may be due to the 
> intelligent LSI 1030 controller with its own independent 
> processor), so basically the box is dead to the outside world. 
> There's nothing unusual in the kernel logs. The only thing 
> that may be a result of that is that the IPMI server 
> management card registers the following 4 system events, 
> however I'm not very clever from that:
> 
> -
> 1)
>   SEL Entry Number = 5
>   SEL Record ID = 0050
>   SEL Record Type = 02 - System Event Record
>   Timestamp: 3.8.2005 02:31:59
>   Generator ID: 21 00
>   SEL Message Rev = 04
>   Sensor Type = 20 - OS Critical Stop
>   Sensor Number = 41 (unknown)
>   SEL Event Type = 6F - Sensor-specific, Assertion
>   SEL Event Data = A1 69 65
> 2)
>   SEL Entry Number = 6
>   SEL Record ID = 0060
>   SEL Record Type = 0F - OEM Defined
>   Timestamp:
>   Generator ID: 65 65
>   SEL Message Rev = 2C
>   Sensor Type = 20 - OS Critical Stop
>   Sensor Number = 6B - (unknown)
>   SEL Event Type = 69
>   SEL Event Data = 6C 6C 69
> 3)
>   SEL Entry Number = 7
>   SEL Record ID = 0070
>   SEL Record Type = 0F - OEM Defined
>   Timestamp:
>   Generator ID: 20 69
>   SEL Message Rev = 6E
>   Sensor Type = 74
>   Sensor Number = 65 - (unknown)
>   SEL Event Type = 72
>   SEL Event Data = 72 75 70
> 4)
>   SEL Entry Number = 8
>   SEL Record ID = 0080
>   SEL Record Type = 0F - OEM Defined
>   Timestamp:
>   Generator ID: 68 61
>   SEL Message 


ECC Support in Linux

2005-08-01 Thread Roger Heflin
 

I have had a fair amount of trouble with the limited support
for ECC reporting on higher-end dual- and quad-CPU servers, as
the reporting is pretty weak.

On the Opterons I can tell which CPU gets errors, but mcelog
does not isolate things down to the DIMM level properly; is
there a way to do this sort of thing?   I am talking about most
of the whitebox-type motherboards.

On the newer Intels I have not found any usable ECC support;
is there any in the kernels?

I can test a variety of hardware if someone needs it, and can
probably even come up with some test memory that will generate ecc
errors.
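
Where the EDAC (formerly bluesmoke) driver stack supports the chipset, per-chip-select error counts are exposed through sysfs, which gets closer to DIMM-level isolation than mcelog does. A sketch only, assuming the conventional EDAC sysfs layout and a loaded chipset driver; paths are not verified for every driver of that era:

```shell
#!/bin/sh
# Walk the EDAC sysfs tree and report corrected (CE) and
# uncorrected (UE) error counts per csrow -- roughly, per DIMM
# pair on the memory controllers EDAC knows about.
for row in /sys/devices/system/edac/mc/mc*/csrow*; do
    [ -d "$row" ] || continue
    echo "$row: CE=$(cat "$row/ce_count") UE=$(cat "$row/ue_count")"
done
```

A steadily climbing ce_count on one csrow is the usual sign of a single bad DIMM.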

 Roger   




RE: accessing CD fs from initrd

2005-07-25 Thread Roger Heflin

/dev/cdrom is a link to the proper device; if that link is not
present on the initrd, /dev/cdrom won't work.

I previously had some statically linked linuxrc C code (I don't
have the code anymore - it was work-for-hire) that scanned
the various locations where the CD could be (/dev/hd[abcd...])
and looked for specific files that it expected to be on the
CD; once it found them, it set up real-root-dev to be proper
for that device.

This worked rather nicely in situations where the location of
the CD drive was not the same from one machine to another,
and it was rather simple to write.
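
The original scanner was C, but the same logic is easy to sketch in shell. Everything here (function name, marker file, mount point) is illustrative, not the original work-for-hire code:

```shell
#!/bin/sh
# Hypothetical sketch of the scan logic described above: probe each
# candidate device, mount it read-only as iso9660, and check for a
# marker file that only our CD is expected to carry.
find_cd_root() {
    mnt=$1 marker=$2; shift 2
    for dev in "$@"; do
        # Not every candidate is present or readable; skip failures.
        mount -r -t iso9660 "$dev" "$mnt" 2>/dev/null || continue
        if [ -e "$mnt/$marker" ]; then
            umount "$mnt"
            echo "$dev"        # report the device carrying our CD
            return 0
        fi
        umount "$mnt"
    done
    return 1                   # no candidate carried the marker file
}

# From a linuxrc one might call, e.g.:
#   find_cd_root /cdrom OUR-RELEASE /dev/hda /dev/hdb /dev/hdc /dev/hdd
# and then write the found device's number to
# /proc/sys/kernel/real-root-dev before handing control back.
```

The marker-file check is what makes this robust against machines whose CD drive sits on a different IDE position.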

Roger 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of zvi Dubitzki
> Sent: Monday, July 25, 2005 2:28 AM
> To: linux-kernel@vger.kernel.org
> Subject: accessing CD fs from initrd
> 
> Hi there
> 
> I want to be CC-ed on a possible answer to the following question.
> I have not found yet an answer to the question in the Linux archives.
> 
> I need to access the CD filesystem (iso9660) from within the 
> Linux initrd, or right after that (make it the root fs).
> I need an example for that, since allocating enough ramdisk 
> space (ramdisk_size=90k on the kernel command line) + loading 
> the cdrom.o module in the initrd did not help to mount the CD 
> device (/dev/cdrom) from the initrd. Also I need to know how to 
> pivot between the initrd and the CD filesystem.
> 
> I am actually using Isolinux/syslinux, but can test on a 
> regular Linux.
> Any pointer to literature will also be welcome.
> 
> thanks
> 
> Zvi Dubitzki
> 
> _
> FREE pop-up blocking with the new MSN Toolbar - get it now! 
> http://toolbar.msn.click-url.com/go/onm00200415ave/direct/01/
> 



RE: HELP: NFS mount hangs when attempting to copy file

2005-07-25 Thread Roger Heflin
KDE and GNOME operate well above the MTU layer; they don't know
anything about the MTU, and neither does NFS. If those hang things
up, you have a network configuration problem and should probably
fix it, as a number of other things will show the problem as well.

Routers almost always have hard-coded MTU limits, and they are
almost never at the default of 1500, so everything needs to be
properly told what your network's MTU is, or some external
device needs to take care of it properly.
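
As a concrete sketch of the two workarounds discussed in this thread (server name, export path, interface, and all numbers are placeholders; pick an rsize/wsize and MTU that match your actual path):

```shell
# Mount NFS over TCP with small transfer sizes, sidestepping UDP
# fragmentation across the MTU-limited hop (values are examples,
# not tuned recommendations):
mount -t nfs -o tcp,rsize=1024,wsize=1024 server:/export /mnt/export

# Or tell the interface the real path MTU so everything above it
# (KDE/GNOME logins, NFS over UDP) fragments correctly; 1400 is
# just a placeholder for whatever the router actually passes:
ifconfig eth0 mtu 1400
```

TCP fixes the symptom because the stack retransmits and segments for you; lowering the MTU fixes the underlying path problem for every protocol at once.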

Roger

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Timothy Miller
> Sent: Saturday, July 23, 2005 9:52 PM
> To: Trond Myklebust
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: HELP: NFS mount hangs when attempting to copy file
> 
> On 7/23/05, Trond Myklebust <[EMAIL PROTECTED]> wrote:
> 
> > I beg to disagree. A lot of these VPN solutions are 
> unfriendly to MTU 
> > path discovery over UDP. Sun uses TCP by default when mounting NFS 
> > partitions. Have you tried this on your Linux box?
> 
> I changed the protocol to TCP and changed rsize and wsize to 
> 1024.  I don't know which of those fixed it, but I'm going to 
> leave it for now.
> 
> As for MTU, yeah, the Watchguard box seems to have some 
> hard-coded limits, and for whatever reason KDE and GNOME 
> graphical logins do something that exceeds those limits, 
> completely independent of NFS, and hang up hard.
> 
> Thanks.




RE: Memory Management

2005-07-22 Thread Roger Heflin

I have seen RH3.0 crash on 32GB systems because it had too
much memory tied up in write cache.  It required Update 2 
(this was a while ago) and a change to a parameter in /proc
to prevent the crash; an overaggressive write-caching change
RH implemented in the kernel resulted in the crash, which was
an OOM-related crash.  To duplicate the bug, you booted the
machine and ran a dd to create a very large file filling the disk.

We did test and determine that it did not appear to have
the issue with less than 28GB of RAM. This was on an
Itanium machine, so I don't know whether it occurs on other
arches, or whether it occurs at the same memory limits on
other arches either.
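
On those 2.4-based Red Hat kernels the relevant knobs live under /proc/sys/vm. A hedged sketch only: the exact fields and safe values varied by errata kernel, and the numbers below are placeholders, not the parameter change we actually used.

```shell
# Inspect the current bdflush settings (nine fields on 2.4-era
# kernels: dirty-percentage thresholds, buffers per pass, etc.):
cat /proc/sys/vm/bdflush

# Make bdflush start writeback at a lower dirty percentage and do
# more work per wakeup, so write cache drains before it can starve
# the rest of the system (placeholder values, not tested advice):
echo "30 500 0 0 500 3000 30 20 0" > /proc/sys/vm/bdflush

# Red Hat's pagecache limits (min/borrow/max percent of RAM);
# capping max keeps a big streaming write from swallowing the box
# (again, illustrative values only):
echo "1 15 50" > /proc/sys/vm/pagecache
```

The point of both knobs is the same: force dirty pagecache back to disk earlier so a dd-style streaming write cannot pin most of RAM and trigger the OOM killer.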

Roger

 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Márcio Oliveira
> Sent: Friday, July 22, 2005 2:42 PM
> To: Neil Horman
> Cc: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org
> Subject: Re: Memory Management
> 
> Neil Horman wrote:
> 
> >On Fri, Jul 22, 2005 at 11:32:52AM -0300, Márcio Oliveira wrote:
> >  
> >
> >>Neil Horman wrote:
> >>
> >>
> >>
> >>>On Thu, Jul 21, 2005 at 10:40:54AM -0300, Márcio Oliveira wrote:
> >>>
> >>>
> >>>  
> >>>
> >http://people.redhat.com/nhorman/papers/rhel3_vm.pdf
> >I wrote this with norm awhile back.  It may help you out.
> >Regards
> >Neil
> >
> >
> >
> >
> >  
> >
> Neil,
> 
> Thanks.
> 
> How can Proc virtual memory parameters like 
> inactive_clean_percent, 
> overcommit_memory, overcommit_ratio and page_cache help 
> me to solve 
> / reduce Out Of Memory conditions on servers with 16GB 
> RAM and lots 
> of GB swap?
> 
>   
> 
> 
> 
> >>>I wouldn't touch memory overcommit if you are already 
> seeing out of 
> >>>memory issues.  If you are using lots of pagecache, I 
> would suggest 
> >>>increasing inactive_clean percent, reducing the 
> pagecahce.max value, 
> >>>and modifying the bdflush parameters in the above document 
> such that 
> >>>bdflush runs sooner, more often, and does more work per 
> iteration.  
> >>>This will help you move data in pagecache back to disk more 
> >>>aggressively so that memory will be available for other purposes, 
> >>>like heap allocations. Also if you're using a Red Hat 
> kernel and you 
> >>>have 16GB of ram in your system, you're a good candidate for the 
> >>>hugemem kernel.  Rather than a straightforward out of memory 
> >>>condition, you may be seeing a exhaustion of your kernels address 
> >>>space (check LowFree in /proc/meminfo).  In this event the hugemem 
> >>>kernel will help you in that it increases your Low Memory address 
> >>>space from 1GB to 4GB, preventing some OOM conditions.
> >>>
> >>>
> >>>
> >>>
> >>>  
> >>>
> Kernel does not free cached memory (~10-12GB of total RAM 
> - 16GB). Is 
> there some way to force the kernel to free cached memory?
> 
>   
> 
> 
> 
> >>>Cached memory is freed on demand.  Just because its listed 
> under the 
> >>>cached line
> >>>below doesn't mean it can't be freed and used for another 
> purpose.  
> >>>Implement
> >>>the tunings above, and your situation should improve.
> >>>
> >>>Regards
> >>>Neil
> >>>
> >>>
> >>>
> >>>  
> >>>
> /proc/meminfo:
> 
>    total:used:free:  shared: buffers:  cached:
> Mem:16603488256 1652632 801546240 
> 70651904 13194563584
> Swap:   17174257664 11771904 17162485760
> MemTotal: 16214344 kB
> MemFree: 78276 kB
> Buffers: 68996 kB
> Cached:   12874808 kB
> 
> Thanks to all.
> 
> Marcio.
>   
> 
> 
> 
> >>Neil,
> >>
> >>  Thanks for the answers!
> >>
> >>The following lines are the Out Of Memory log:
> >>
> >>Jul 20 13:45:44 server kernel: Out of Memory: Killed 
> process 23716 (oracle).
> >>Jul 20 13:45:44 server kernel: Fixed up OOM kill of mm-less task
> >>Jul 20 13:45:45 server su(pam_unix)[3848]: session closed 
> for user root
> >>Jul 20 13:45:48 server kernel: Mem-info:
> >>Jul 20 13:45:48 server kernel: Zone:DMA freepages:  1884 min: 0 
> >>low: 0 high: 0
> >>Jul 20 13:45:48 server kernel: Zone:Normal freepages:  1084 
> min:  1279 
> >>low:  4544 high:  6304
> >>Jul 20 13:45:48 server kernel: Zone:HighMem 
> freepages:386679 min:   255 
> >>low: 61952 high: 92928
> >>Jul 20 13:45:48 server kernel: Free pages:  389647 
> (386679 HighMem)
> >>Jul 20 13:45:48 server kernel: ( Active: 2259787/488777, 
> >>inactive_laundry: 244282, inactive_clean: 244366, free: 389647 )
> >>Jul 20 13:45:48 server kernel:   aa:0 ac:0 id:0 il:0 ic:0 fr:1884
> >>Jul 20 13:45:48 server kernel:   aa:1620 ac:1801 id:231 
> il:15 ic:0 fr:1085
> >>Jul 20 13:45:48 server kernel:   aa:1099230 ac:1157136 id:488536 
> >>il:244277 ic:244366 fr:386679
> >>Jul 
