3.10.10: WARNING: at kernel/smp.c:181 generic_smp_call_function_single_interrupt+0x11c/0x130()

2013-09-10 Thread Martin MOKREJŠ
Hi,
  I got this stack trace shortly after boot. My startup scripts disable the
two hyper-threaded cores, so that was possibly the trigger, at least
according to an earlier report from Fernando Soto on Jul 08 2013:
  https://lkml.org/lkml/2013/7/8/301

  I don't see his name in this patch series, which Google brought up for me:

Re: [PATCH v1 2/3] SMP: simpilify function generic_smp_call_function_single_interrupt()

Does the patch fix the issue? Are you aware of the issue? That is why
I CCed all of you. ;-) Could somebody please also add Fernando to Cc:? I cannot
find his email address in the spam-protected email archives. Surely you have
his original email in your Inbox or Trash. Thanks.
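
For reference, taking cores offline from a startup script normally means
writing 0 to the per-CPU sysfs control file /sys/devices/system/cpu/cpuN/online.
The actual startup script is not shown in this message, so the following is only
a minimal sketch of that step; the helper names set_cpu_online() and
offline_cpus() are hypothetical and chosen for illustration.

```python
from pathlib import Path

# Standard sysfs location of the per-CPU hotplug control files.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def set_cpu_online(cpu: int, online: bool, root: Path = SYSFS_CPU_ROOT) -> bool:
    """Write 1 or 0 to cpuN/online; return False if the control file is absent."""
    control = root / f"cpu{cpu}" / "online"
    if not control.exists():
        return False
    control.write_text("1" if online else "0")
    return True


def offline_cpus(cpus, root: Path = SYSFS_CPU_ROOT):
    """Take each listed CPU offline; return the CPUs that have no control file."""
    return [cpu for cpu in cpus if not set_cpu_online(cpu, False, root)]
```

Run as root, offline_cpus([2, 3]) would correspond to the "CPU 2 is now offline"
and "CPU 3 is now offline" messages in the trace; which logical CPUs are the
hyper-threaded siblings depends on the machine.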


My stacktrace from 3.10.10 is:

[   22.908990] ------------[ cut here ]------------
[   22.909011] WARNING: at kernel/smp.c:181 generic_smp_call_function_single_interrupt+0x11c/0x130()
[   22.909021] Modules linked in: iwldvm iwlwifi
[   22.909029] CPU: 2 PID: 17 Comm: migration/2 Not tainted 3.10.10-default-pciehp #2
[   22.909032] Hardware name: Dell Inc. Vostro 3550/, BIOS A11 08/03/2012
[   22.909044]  0009 88019fb03f08 817f9734 88019fb03f48
[   22.909054]  8109698b 8801979adfd8 88019fb13d80 0004
[   22.909064]  88019fb03f68 88019893bf01  88019fb03f58
[   22.909066] Call Trace:
[   22.909083]  [] dump_stack+0x19/0x1b
[   22.909093]  [] warn_slowpath_common+0x6b/0xa0
[   22.909102]  [] warn_slowpath_null+0x15/0x20
[   22.909111]  [] generic_smp_call_function_single_interrupt+0x11c/0x130
[   22.909120]  [] smp_call_function_single_interrupt+0x22/0x40
[   22.909131]  [] call_function_single_interrupt+0x6f/0x80
[   22.909146]  [] ? stop_machine_cpu_stop+0xb1/0xf0
[   22.909155]  [] ? cpu_stop_queue_work+0xa0/0xa0
[   22.909164]  [] cpu_stopper_thread+0x84/0x140
[   22.909172]  [] ? cpu_stop_should_run+0x2c/0x50
[   22.909181]  [] ? _raw_spin_unlock_irqrestore+0x3a/0x70
[   22.909188]  [] ? trace_hardirqs_on_caller+0x105/0x1d0
[   22.909193]  [] ? trace_hardirqs_on+0xd/0x10
[   22.909199]  [] smpboot_thread_fn+0xff/0x190
[   22.909203]  [] ? lg_global_unlock+0x70/0x70
[   22.909207]  [] kthread+0xe5/0xf0
[   22.909212]  [] ? flush_kthread_worker+0x150/0x150
[   22.909218]  [] ret_from_fork+0x7c/0xb0
[   22.909222]  [] ? flush_kthread_worker+0x150/0x150
[   22.909226] ---[ end trace a42128a3e17e6268 ]---
[   22.909261] smpboot: CPU 2 is now offline
[   23.044977] smpboot: CPU 3 is now offline


Martin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 3.10.9: kmemleak disables all CPUs except CPU0

2013-09-03 Thread Martin MOKREJŠ


Catalin Marinas wrote:
> On Mon, Sep 02, 2013 at 04:51:17PM +0100, Martin MOKREJŠ wrote:
>> Catalin Marinas wrote:
>>> On Mon, Sep 02, 2013 at 04:44:52PM +0100, Max Filippov wrote:
>>>> On Mon, Sep 2, 2013 at 7:31 PM, Catalin Marinas wrote:
>>>>> On 31 August 2013 14:35, Martin MOKREJŠ  wrote:
>>>>>>   never realized that my CPUs are gone if I compile into kernel kmemleak.
>>>>>> Is that really the aim?
>>>>>>
>>>>>> CONFIG_HAVE_DEBUG_KMEMLEAK=y
>>>>>> CONFIG_DEBUG_KMEMLEAK=y
>>>>>> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
>>>>>> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
>>>>>> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
>>>>>>
>>>>>> 1.  Why isn't there /sys/devices/system/cpu/cpu0/online file?
>>>>>> Does not matter if it contains 0 or 1. It just should exist.
>>>>>
>>>>> I can't really see how kmemleak would do this, maybe other config
>>>>> options that get set/cleared in the process of selecting kmemleak. Can
>>>>
>>>> Seems to be kmemcheck: from arch/x86/mm/kmemcheck/kmemcheck.c:
>>>>
>>>> int __init kmemcheck_init(void)
>>>> {
>>>> #ifdef CONFIG_SMP
>>>>         /*
>>>>          * Limit SMP to use a single CPU. We rely on the fact that this code
>>>>          * runs before SMP is set up.
>>>>          */
>>>>         if (setup_max_cpus > 1) {
>>>>                 printk(KERN_INFO
>>>>                         "kmemcheck: Limiting number of CPUs to 1.\n");
>>>>                 setup_max_cpus = 1;
>>>>         }
>>>> #endif
>>>
>>> Ah, ok, not my problem then ;)
>>
>> Fine, so would somebody please update the help text accessible in "menuconfig"
>> for this entry? It should be made clear that it has a huge performance impact
>> if enabled. And once compiled in, it is enabled by default.
> 
> Otherwise, if no-one volunteers, please feel free to send a patch ;)

No, I am not a kernel developer. Could someone else please update the help text?

Moreover, this does not explain why the /sys/devices/system/cpu/cpu0/online file
is missing while /sys/devices/system/cpu/cpu[1-3]/online exist at the same time.
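
To make the observation easy to reproduce on other machines, here is a small
sketch that reports, for every cpuN directory, whether an "online" control file
is present. The function name cpus_with_online_file() is hypothetical; only the
sysfs path comes from the report above.

```python
import re
from pathlib import Path


def cpus_with_online_file(root="/sys/devices/system/cpu"):
    """Map each CPU number to True/False: does cpuN/online exist?"""
    result = {}
    for entry in Path(root).iterdir():
        # Only cpu<digits> directories count; skip cpufreq, cpuidle, etc.
        match = re.fullmatch(r"cpu(\d+)", entry.name)
        if match and entry.is_dir():
            result[int(match.group(1))] = (entry / "online").exists()
    return dict(sorted(result.items()))
```

On the system described here, the result would be expected to show False for
CPU 0 and True for CPUs 1 to 3.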

Thank you,
Martin


Re: 3.10.9: kmemleak disables all CPUs except CPU0

2013-09-02 Thread Martin MOKREJŠ


Catalin Marinas wrote:
> On Mon, Sep 02, 2013 at 04:44:52PM +0100, Max Filippov wrote:
>> On Mon, Sep 2, 2013 at 7:31 PM, Catalin Marinas wrote:
>>> On 31 August 2013 14:35, Martin MOKREJŠ  wrote:
>>>>   never realized that my CPUs are gone if I compile into kernel kmemleak.
>>>> Is that really the aim?
>>>>
>>>> CONFIG_HAVE_DEBUG_KMEMLEAK=y
>>>> CONFIG_DEBUG_KMEMLEAK=y
>>>> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
>>>> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
>>>> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
>>>>
>>>> 1.  Why isn't there /sys/devices/system/cpu/cpu0/online file?
>>>> Does not matter if it contains 0 or 1. It just should exist.
>>>
>>> I can't really see how kmemleak would do this, maybe other config
>>> options that get set/cleared in the process of selecting kmemleak. Can
>>
>> Seems to be kmemcheck: from arch/x86/mm/kmemcheck/kmemcheck.c:
>>
>> int __init kmemcheck_init(void)
>> {
>> #ifdef CONFIG_SMP
>>         /*
>>          * Limit SMP to use a single CPU. We rely on the fact that this code
>>          * runs before SMP is set up.
>>          */
>>         if (setup_max_cpus > 1) {
>>                 printk(KERN_INFO
>>                         "kmemcheck: Limiting number of CPUs to 1.\n");
>>                 setup_max_cpus = 1;
>>         }
>> #endif
> 
> Ah, ok, not my problem then ;)


Fine, so would somebody please update the help text accessible in "menuconfig"
for this entry? It should be made clear that it has a huge performance impact
if enabled. And once compiled in, it is enabled by default.

Thank you
Martin


Re: 3.10.9: kmemleak disables all CPUs except CPU0

2013-09-02 Thread Martin MOKREJŠ


Catalin Marinas wrote:
> On 31 August 2013 14:35, Martin MOKREJŠ  wrote:
>>   never realized that my CPUs are gone if I compile into kernel kmemleak.
>> Is that really the aim?
>>
>> CONFIG_HAVE_DEBUG_KMEMLEAK=y
>> CONFIG_DEBUG_KMEMLEAK=y
>> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
>> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
>> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
>>
>> 1.  Why isn't there /sys/devices/system/cpu/cpu0/online file?
>> Does not matter if it contains 0 or 1. It just should exist.
> 
> I can't really see how kmemleak would do this, maybe other config
> options that get set/cleared in the process of selecting kmemleak. Can
> you do a diff between your config with /sys/... entries and the one
> without?

Hi,
  I tried but did not get around to reporting back. One of these changes re-enabled my CPUs.

@@ -3177,14 +3179,7 @@
 CONFIG_HAVE_ARCH_KGDB=y
 # CONFIG_KGDB is not set
 CONFIG_HAVE_ARCH_KMEMCHECK=y
-CONFIG_KMEMCHECK=y
-# CONFIG_KMEMCHECK_DISABLED_BY_DEFAULT is not set
-# CONFIG_KMEMCHECK_ENABLED_BY_DEFAULT is not set
-CONFIG_KMEMCHECK_ONESHOT_BY_DEFAULT=y
-CONFIG_KMEMCHECK_QUEUE_SIZE=64
-CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT=5
-CONFIG_KMEMCHECK_PARTIAL_OK=y
-# CONFIG_KMEMCHECK_BITOPS_OK is not set
+# CONFIG_KMEMCHECK is not set
 # CONFIG_TEST_STRING_HELPERS is not set
 # CONFIG_TEST_KSTRTOX is not set
 # CONFIG_STRICT_DEVMEM is not set


So, I believe CONFIG_KMEMCHECK is what disables the additional CPUs. I thought it
should be traceable from the dmesg output I sent to the list. Yes, I got the
subject line wrong because I had not realized the difference between kmemleak
and kmemcheck until now. :(
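
The comparison Catalin asked for can also be done per option rather than with
a textual diff. The following is a minimal sketch, assuming the usual .config
syntax ("CONFIG_X=value" and "# CONFIG_X is not set"); the function names
parse_config() and config_changes() are hypothetical. An option that is absent
is treated the same as one that is "not set".

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; options that are 'not set' map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def config_changes(old_text, new_text):
    """Return {option: (old_value, new_value)} for options whose value differs."""
    old, new = parse_config(old_text), parse_config(new_text)
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes
```

Applied to the two configurations above, this would list CONFIG_KMEMCHECK and
its dependent options as changed from a value to None.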

Martin


Emulating ECC RAM in kernel or mirroring RAM to exclude HW issues

2013-09-01 Thread Martin MOKREJŠ
Hi,
  I am trying to find out why some applications crash on my laptop.
I mostly use python and have configured it via configure --with-pydebug
so that it wraps allocated memory regions with 0xfb bytes. That helps to detect
when something has overwritten such a region. So far, it has twice reported a
0xfb to 0xfa transition at some logical position 5. I was told python cannot
print the physical hardware address, but let's assume this is a memory error
and one bit was flipped. However, sometimes other apps crash as well, and
I think I have tried hard enough running core dumps through gdb to find out
where they crashed; it does not lead to an answer.
  I ran memtest86+ for hours to find an error but it never found anything
wrong. From my experience, the errors appear when the CPU is loaded, and that
is not the case under memtest86+ started from a boot CD. I think another reason
why memtest86+ may not find the problematic bit is that it would have to
fill the whole RAM with e.g. 0xfb and then keep checking for hours whether those
values still read as 0xfb. It seems all write&read tests done by memtest86+ happen
too quickly after each other. I miss tests where the data is written into memory
and kept there for a long while (hours, days).
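
The long-dwell test described above can be sketched in user space as follows.
This is only an illustration under stated assumptions: the names fill(), scan()
and dwell_test() are hypothetical, and a user-space buffer is virtual memory
that the kernel may move or swap, so this does not pin or identify physical
addresses the way memtest86+ does.

```python
import time

PATTERN = 0xFB  # the guard byte mentioned above


def fill(size, pattern=PATTERN):
    """Allocate a buffer of `size` bytes, every byte set to the pattern."""
    return bytearray([pattern]) * size


def scan(buf, pattern=PATTERN):
    """Return (offset, found_byte, flipped_bits) for each byte that changed.

    flipped_bits is found_byte XOR pattern, so a single set bit in it
    means exactly one bit was flipped.
    """
    return [(i, b, b ^ pattern) for i, b in enumerate(buf) if b != pattern]


def dwell_test(size, dwell_seconds, rounds, pattern=PATTERN):
    """Fill once, then re-scan after each dwell period; stop at the first error."""
    buf = fill(size, pattern)
    for _ in range(rounds):
        time.sleep(dwell_seconds)
        errors = scan(buf, pattern)
        if errors:
            return errors
    return []
```

For the reported case, 0xfa XOR 0xfb is 0x01, i.e. only the lowest bit differs,
which is consistent with the single-bit-flip assumption.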

  Finally, I got the idea that the linux kernel could emulate ECC RAM and just keep
some checksums in another region of memory. This would help to find not only a
flipped memory bit but also other (larger) corrupted regions of memory. I don't need
speed (running apps under valgrind/DUMA is not fast either) and I don't need
memory hotplug. Let's say this is for diagnostic purposes. I don't mind if somebody
says I have to sacrifice 1/2 of my precious RAM to do software memory mirroring.
Even that would be a cool trick to get around this and see where the bug is hiding.
I somewhat speculate that it could be just a slightly overheated memory controller
after high CPU usage, or that the CPU or its cache gets upset and it has nothing
to do with RAM. When it is cold, it works. But first I need proof that the RAM is
not at fault.
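
To illustrate the checksum idea (detection only, not correction as real ECC
does), here is a user-space sketch that keeps one CRC32 per 4096-byte page of a
buffer in a separate list and later reports pages whose contents no longer
match. The names page_checksums() and corrupted_pages() are hypothetical, and a
kernel implementation would additionally have to update the checksum on every
legitimate write, which this sketch does not attempt.

```python
import zlib

PAGE_SIZE = 4096


def page_checksums(buf, page_size=PAGE_SIZE):
    """Compute one CRC32 per page of the buffer."""
    view = memoryview(buf)
    return [zlib.crc32(view[off:off + page_size])
            for off in range(0, len(view), page_size)]


def corrupted_pages(buf, checksums, page_size=PAGE_SIZE):
    """Return indices of pages whose current CRC32 differs from the stored one."""
    current = page_checksums(buf, page_size)
    return [i for i, (old, new) in enumerate(zip(checksums, current))
            if old != new]
```

A typical use would be to record page_checksums(buf) right after filling the
buffer and call corrupted_pages(buf, checksums) periodically afterwards.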

  I think somebody must have already thought about this, so I am just asking what
you think. Maybe this is already available in some linux source tree as a
proof-of-concept patch. ;) That would be great.

https://www.usenix.org/legacy/event/atc10/tech/full_papers/Li.pdf

Thank you,
Martin


Re: [PATCH linux-next] Prevent a coredump with a large vm_map_count from Oopsing

2013-08-31 Thread Martin MOKREJŠ
Dan Aloni wrote:
> On Sat, Aug 31, 2013 at 03:38:33PM +0200, Martin MOKREJŠ wrote:
>> Hi Dan,
>>   thank you for your work on my issue. I would like to test it on 3.10.9 where
>> I faced the problem initially.
> 
> Sure, see the attached patch for 3.10.9.

Thanks, it works for my case. You can add my Reported-by: and Tested-by:. ;-)


Re: [PATCH linux-next] Prevent a coredump with a large vm_map_count from Oopsing

2013-08-31 Thread Martin MOKREJŠ
Hi Dan,
  thank you for your work on my issue. I would like to test it on 3.10.9 where
I faced the problem initially.

linux-3.10.9 # patch -p1 < ../patches/vm_map_count.patch
patching file fs/binfmt_elf.c
Hunk #1 succeeded at 1415 (offset -14 lines).
Hunk #2 succeeded at 1430 (offset -14 lines).
Hunk #3 succeeded at 1487 (offset -14 lines).
Hunk #4 succeeded at 1609 (offset -14 lines).
Hunk #5 succeeded at 1689 (offset -14 lines).
Hunk #6 FAILED at 1737.
Hunk #7 succeeded at 1810 (offset -14 lines).
Hunk #8 succeeded at 1854 (offset -14 lines).
Hunk #9 succeeded at 1902 (offset -14 lines).
Hunk #10 succeeded at 1970 (offset -14 lines).
Hunk #11 FAILED at 2068.
2 out of 11 hunks FAILED -- saving rejects to file fs/binfmt_elf.c.rej
#


Thank you.

Dan Aloni wrote:
> A high setting of max_map_count, and a process core-dumping with
> a large enough vm_map_count could result in an NT_FILE note not
> being written, and the kernel crashing immediately later because
> it has assumed otherwise.
> 
> Reproduction of the bug described here:
> 
> https://lkml.org/lkml/2013/8/30/50
> 
> Issue originating in 2aa362c49 (from Oct 4, 2012).
> 
> This patch makes that section optional in that case.
> fill_files_note() should signify the error, and also let the info
> struct in elf_core_dump() be zero-initialized so that we can check
> for the optionally written note.
> 
> Cc'ed original signers.
> 
> Cc'ed Al Viro because it trivially relies on his linux-next
> tree changes.
> 
> Signed-off-by: Dan Aloni 
> Cc: Al Viro 
> Cc: Denys Vlasenko 
> Cc: Andrew Morton 
> Cc: Linus Torvalds 
> ---
>  fs/binfmt_elf.c | 33 +
>  1 file changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index dc82279..e1a323a 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -1429,7 +1429,7 @@ static void fill_siginfo_note(struct memelfnote *note, user_siginfo_t *csigdata,
>   *   long file_ofs
>   * followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL...
>   */
> -static void fill_files_note(struct memelfnote *note)
> +static int fill_files_note(struct memelfnote *note)
>  {
>   struct vm_area_struct *vma;
>   unsigned count, size, names_ofs, remaining, n;
> @@ -1444,11 +1444,11 @@ static void fill_files_note(struct memelfnote *note)
>   names_ofs = (2 + 3 * count) * sizeof(data[0]);
>   alloc:
>   if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
> - goto err;
> + return -E2BIG;
>   size = round_up(size, PAGE_SIZE);
>   data = vmalloc(size);
>   if (!data)
> - goto err;
> + return -ENOMEM;
>  
>   start_end_ofs = data + 2;
>   name_base = name_curpos = ((char *)data) + names_ofs;
> @@ -1501,7 +1501,7 @@ static void fill_files_note(struct memelfnote *note)
>  
>   size = name_curpos - (char *)data;
>   fill_note(note, "CORE", NT_FILE, size, data);
> - err: ;
> + return 0;
>  }
>  
>  #ifdef CORE_DUMP_USE_REGSET
> @@ -1623,6 +1623,7 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
>   struct elf_prpsinfo *psinfo;
>   struct core_thread *ct;
>   unsigned int i;
> + int ret;
>  
>   info->size = 0;
>   info->thread = NULL;
> @@ -1702,8 +1703,9 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
>   fill_auxv_note(&info->auxv, current->mm);
>   info->size += notesize(&info->auxv);
>  
> - fill_files_note(&info->files);
> - info->size += notesize(&info->files);
> + ret = fill_files_note(&info->files);
> + if (!ret)
> + info->size += notesize(&info->files);
>  
>   return 1;
>  }
> @@ -1735,7 +1737,7 @@ static int write_note_info(struct elf_note_info *info,
>   return 0;
>   if (first && !writenote(&info->auxv, cprm))
>   return 0;
> - if (first && !writenote(&info->files, cprm))
> + if (first && info->files.data && !writenote(&info->files, cprm))
>   return 0;
>  
>   for (i = 1; i < info->thread_notes; ++i)
> @@ -1822,6 +1824,7 @@ static int elf_dump_thread_status(long signr, struct elf_thread_status *t)
>  
>  struct elf_note_info {
>   struct memelfnote *notes;
> + struct memelfnote *notes_files;
>   struct elf_prstatus *prstatus;  /* NT_PRSTATUS */
>   struct elf_prpsinfo *psinfo;/* NT_PRPSINFO */
>   struct list_head thread_list;
> @@ -1865,6 +1868,7 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
> siginfo_t *siginfo, struct pt_regs *regs)
>  {
>   struct list_head *t;
> + int ret;
>  
>   if (!elf_note_info_init(info))
>   return 0;
> @@ -1912,9 +1916,13 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
>  
>   fill_siginfo_note(info->notes + 2, &info->csigdata, siginfo);
>   fill_auxv_note(info->notes + 3, current->mm);
> - fill_files_note(info->not

Re: 3.10.9: EXT4-fs (sdb1): delayed block allocation failed for inode 163315715 at logical offset 1 with max blocks 2 with error -5

2013-08-30 Thread Martin MOKREJŠ
Theodore Ts'o wrote:
> Your SATA disk had enough errors that the ATA link was completely
> reset, and the device was detached and then reattached.  As far as
> kernel is concerned, it's a new device.

Later on I rebooted and ran smartctl:

# smartctl --test=long /dev/sdb 

Now, two days later, I have:

# smartctl -a /dev/sdb 
smartctl 6.0 2012-10-10 r3643 [x86_64-linux-3.10.9-default-pciehp] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate SV35
Device Model:     ST3000VX000-1CU166
Serial Number:    Z1F1YB3K
LU WWN Device Id: 5 000c50 04f5930de
Firmware Version: CV22
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Fri Aug 30 17:32:13 2013 MEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (   89) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 326) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x10b9) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   108   099   006    Pre-fail  Always       -       19762904
  3 Spin_Up_Time            0x0003   092   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       57
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   066   060   030    Pre-fail  Always       -       4287008
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3770
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       49
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       8590065666
189 High_Fly_Writes         0x003a   001   001   000    Old_age   Always       -       464
190 Airflow_Temperature_Cel 0x0022   056   052   045    Old_age   Always       -       44 (Min/Max 25/45)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       41
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       57
194 Temperature_Celsius     0x0022   044   048   000    Old_age   Always       -       44 (0 16 0 0 0)
197 Current_Pend

Re: 3.10.9: Oops at elf_core_dump()

2013-08-29 Thread Martin MOKREJŠ
So it happened again:

$ export LD_PRELOAD=/usr/lib64/libduma.so.0.0.0
$ python  memory-corruption-test.py
DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington 
Copyright (C) 2002-2008 Hayati Ayguen , Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens 

DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington 
Copyright (C) 2002-2008 Hayati Ayguen , Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens 

Finished one record
Finished one record
Finished one record
Finished one record
Finished one record
[cut]


DUMA Aborting: mprotect() failed: Cannot allocate memory.
Check README section 'MEMORY USAGE AND EXECUTION SPEED'
  if your (Linux) system may limit the number of different page mappings per process
Fatal Python error: Illegal instruction

Current thread 0x7fc9c4803740:
  File "/usr/lib64/python2.7/site-packages/Bio/Blast/NCBIXML.py", line 106 in endElement
  File "/mnt/1TB/var/tmp/portage/dev-lang/python-2.7.5-r2/work/Python-2.7.5/Modules/pyexpat.c", line 618 in EndElement
  File "/usr/lib64/python2.7/site-packages/Bio/Blast/NCBIXML.py", line 654 in parse
  File "memory-corruption-test.py", line 55 in doparse
  File "memory-corruption-test.py", line 104 in main
  File "memory-corruption-test.py", line 109 in <module>



The stack trace is a little different, but ... I think I need to find out which
resource limit to raise so that DUMA can keep running and watching the python binary.
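
The DUMA message points at the per-process limit on memory mappings, which on
Linux is the vm.max_map_count sysctl; each line of /proc/PID/maps is one
mapping. A minimal sketch for comparing the two follows. The function name
map_usage() is hypothetical, and whether this limit is really the one DUMA hit
here is an assumption, not something the logs prove.

```python
def map_usage(maps_path="/proc/self/maps",
              limit_path="/proc/sys/vm/max_map_count"):
    """Return (mappings_in_use, max_map_count) for the given process."""
    with open(maps_path) as maps:
        used = sum(1 for _ in maps)  # one line per mapping
    with open(limit_path) as limit_file:
        limit = int(limit_file.read().strip())
    return used, limit
```

Pointing maps_path at /proc/PID/maps of the python process under DUMA shortly
before the abort would show how close it gets to the limit.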


[112567.987073] BUG: unable to handle kernel NULL pointer dereference at (null)
[112567.987684] IP: [] strlen+0x2/0x20
[112567.988282] PGD 28be2c067 PUD 3a7744067 PMD 0 
[112567.988879] Oops:  [#2] SMP 
[112567.989468] Modules linked in: iwldvm iwlwifi
[112567.990057] CPU: 0 PID: 8822 Comm: python2.7 Tainted: G  D  3.10.9-default-pciehp #8
[112567.990655] Hardware name: Dell Inc. Vostro 3550/, BIOS A11 08/03/2012
[112567.991249] task: 8803b5eb0fd0 ti: 8803b36d4000 task.ti: 8803b36d4000
[112567.991845] RIP: 0010:[]  [] strlen+0x2/0x20
[112567.992443] RSP: 0018:8803b36d59f0  EFLAGS: 00010246
[112567.993039] RAX:  RBX: 0003 RCX: 8803b5eb0fd0
[112567.993643] RDX: 016e3610 RSI:  RDI: 
[112567.994249] RBP: 8803b36d5a08 R08: fffa R09: 
[112567.994854] R10:  R11:  R12: 8803b36d5b00
[112567.995459] R13: 7000 R14: 0004 R15: 
[112567.996060] FS:  7fc9c4803740() GS:88041d80() knlGS:
[112567.996664] CS:  0010 DS:  ES:  CR0: 80050033
[112567.997268] CR2:  CR3: 0003a3b84000 CR4: 000407f0
[112567.997884] DR0:  DR1:  DR2: 
[112567.998495] DR3:  DR6: 4ff0 DR7: 0400
[112567.999100] Stack:
[112567.999698]  811bb0a3 8803b36d5e60 03d8 8803b36d5c08
[112568.000315]  811bbcbd 811bb913 0002 
[112568.000931]  8803b5eb0fd0 8803b36d 0246 000f4242
[112568.001548] Call Trace:
[112568.002156]  [] ? notesize.isra.11+0x13/0x30
[112568.002774]  [] elf_core_dump+0xbfd/0x1570
[112568.003392]  [] ? elf_core_dump+0x853/0x1570
[112568.004012]  [] ? unshare_files+0x29/0xa0
[112568.004629]  [] do_coredump+0xafc/0xff0
[112568.005247]  [] ? __sigqueue_free+0x38/0x40
[112568.005865]  [] get_signal_to_deliver+0x1c1/0x5c0
[112568.006488]  [] ? pid_vnr+0x30/0x30
[112568.007108]  [] do_signal+0x53/0x8e0
[112568.007725]  [] do_notify_resume+0x5f/0x70
[112568.008342]  [] ? trace_hardirqs_on_thunk+0x3a/0x3f
[112568.008965]  [] int_signal+0x12/0x17
[112568.009583] Code: 48 89 e5 f6 82 40 c6 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 40 c6 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 31 c0 <80> 3f 00 55 48 89 e5 74 11 48 89 f8 66 90 48 83 c0 01 80 38 00 
[112568.011042] RIP  [] strlen+0x2/0x20
[112568.011748]  RSP 
[112568.012445] CR2: 00000000
[112568.013155] ---[ end trace 9d67aee555e92d76 ]---


Martin MOKREJŠ wrote:
> Got it for the first time. Actually, I am doing something really unusual
> (http://bugs.python.org/issue18843).
> 
> I am looking for the reason why I suffer memory corruption in python applications.
> So I installed DUMA from http://duma.sourceforge.net and tried to recompile and
> reinstall the failing python under it. In a previous attempt it exited, and per
> the README instructions I increased the vm.max_map_count value.
> 
> 
> # export LD_PRELOAD=/usr/lib64/libduma.so.0.0.0
> # sysctl -w vm.max_map_count=100
> # emerge dev-lang/python:2.7 
> DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
> Copyright (C) 2006 Michael Eddington 
> Copyright (C) 2002-2008 Hayati Ayguen , Procitec GmbH
> Co

Re: 3.10.9: Oops at elf_core_dump()

2013-08-29 Thread Martin MOKREJŠ
Got it for the first time. Actually, I am doing something really unusual
(http://bugs.python.org/issue18843).

I am looking for the reason why I suffer memory corruption in python applications.
So I installed DUMA from http://duma.sourceforge.net and tried to recompile and
reinstall the failing python under it. In a previous attempt it exited, and per
the README instructions I increased the vm.max_map_count value.


# export LD_PRELOAD=/usr/lib64/libduma.so.0.0.0
# sysctl -w vm.max_map_count=100
# emerge dev-lang/python:2.7 
DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington 
Copyright (C) 2002-2008 Hayati Ayguen , Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens 


 * IMPORTANT: 11 news items need reading for repository 'gentoo'.
 * Use eselect news to read news items.


 * IMPORTANT: config file '5 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington 
Copyright (C) 2002-2008 Hayati Ayguen , Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens 

' needs updating.
 * See the CONFIGURATION FILES section of the emerge
 * man page to learn how to update config files.
Calculating dependencies |
DUMA Aborting: mprotect() failed: Cannot allocate memory.
Check README section 'MEMORY USAGE AND EXECUTION SPEED'
  if your (Linux) system may limit the number of different page mappings per process


[and it crashed, no ctrl+c working]



Sorry, I do not know what more to say. I just crashed the kernel, but apart from
the Oops it works so far. The core file size is zero.
Martin


Greg KH wrote:
> On Thu, Aug 29, 2013 at 11:46:18PM +0200, Martin MOKREJŠ wrote:
>> Hi,
>>   I just got this stack trace. Not sure whom to send it to; poking through the
>> MAINTAINERS file and looking for ELF gave me nothing. ;-)
>>
>> [105670.434336] BUG: unable to handle kernel NULL pointer dereference at (null)
>> [105670.434366] IP: [] strlen+0x2/0x20
>> [105670.434385] PGD 18c8e5067 PUD 2b547e067 PMD 0 
>> [105670.434401] Oops:  [#1] SMP 
>> [105670.434413] Modules linked in: iwldvm iwlwifi
>> [105670.434432] CPU: 0 PID: 7497 Comm: emerge Not tainted 3.10.9-default-pciehp #8
> 
> Is this reproducible?
> 
> thanks,
> 
> greg k-h
> 

-- 
Martin Mokrejs, Ph.D.
Bioinformatics
Donovalska 1658
149 00 Prague
Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


3.10.9: Oops at elf_core_dump()

2013-08-29 Thread Martin MOKREJŠ
Hi,
  I just got this stack trace. Not sure whom to send it to; poking through the
MAINTAINERS file and looking for ELF gave me nothing. ;-)

[105670.434336] BUG: unable to handle kernel NULL pointer dereference at (null)
[105670.434366] IP: [] strlen+0x2/0x20
[105670.434385] PGD 18c8e5067 PUD 2b547e067 PMD 0 
[105670.434401] Oops:  [#1] SMP 
[105670.434413] Modules linked in: iwldvm iwlwifi
[105670.434432] CPU: 0 PID: 7497 Comm: emerge Not tainted 3.10.9-default-pciehp #8
[105670.434451] Hardware name: Dell Inc. Vostro 3550/, BIOS A11 08/03/2012
[105670.434468] task: 88037df42f70 ti: 88018683c000 task.ti: 88018683c000
[105670.434487] RIP: 0010:[]  [] strlen+0x2/0x20
[105670.434509] RSP: 0018:88018683d9f0  EFLAGS: 00010246
[105670.434523] RAX:  RBX: 0003 RCX: 88037df42f70
[105670.434542] RDX: 016e3610 RSI:  RDI: 
[105670.434560] RBP: 88018683da08 R08:  R09: 
[105670.434579] R10:  R11: 0001 R12: 88018683db00
[105670.434598] R13: 7000 R14: 0004 R15: 
[105670.434617] FS:  7f89b0989740() GS:88041d80() knlGS:
[105670.434637] CS:  0010 DS:  ES:  CR0: 80050033
[105670.434652] CR2:  CR3: 0002b4c06000 CR4: 000407f0
[105670.434671] DR0:  DR1:  DR2: 
[105670.434690] DR3:  DR6: 0ff0 DR7: 0400
[105670.434708] Stack:
[105670.434715]  811bb0a3 0004 03d8 88018683dc08
[105670.434738]  811bbcbd 811bb913  88018683db28
[105670.434762]  88037df42f70 88018683 0246 000f4242
[105670.434785] Call Trace:
[105670.434795]  [] ? notesize.isra.11+0x13/0x30
[105670.434812]  [] elf_core_dump+0xbfd/0x1570
[105670.434828]  [] ? elf_core_dump+0x853/0x1570
[105670.434845]  [] ? do_coredump+0xe25/0xff0
[105670.434861]  [] ? trace_hardirqs_on+0xd/0x10
[105670.434878]  [] ? __sb_start_write+0xdf/0x1b0
[105670.434894]  [] ? do_coredump+0xe25/0xff0
[105670.434911]  [] ? unshare_files+0x29/0xa0
[105670.434926]  [] do_coredump+0xafc/0xff0
[105670.434943]  [] ? __sigqueue_free+0x38/0x40
[105670.434960]  [] get_signal_to_deliver+0x1c1/0x5c0
[105670.434977]  [] ? do_send_sig_info+0x61/0x90
[105670.434994]  [] do_signal+0x53/0x8e0
[105670.435008]  [] ? kill_pgrp+0x60/0x60
[105670.435025]  [] ? finish_task_switch+0x7e/0xe0
[105670.435043]  [] ? sysret_signal+0x5/0x47
[105670.435058]  [] do_notify_resume+0x5f/0x70
[105670.435074]  [] ? trace_hardirqs_on_thunk+0x3a/0x3f
[105670.435092]  [] int_signal+0x12/0x17
[105670.435106] Code: 48 89 e5 f6 82 40 c6 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 40 c6 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 31 c0 <80> 3f 00 55 48 89 e5 74 11 48 89 f8 66 90 48 83 c0 01 80 38 00 
[105670.435238] RIP  [] strlen+0x2/0x20
[105670.435254]  RSP 
[105670.435843] CR2: 
[105670.439699] ---[ end trace 9d67aee555e92d75 ]---



Martin


Re: 3.10.9: EXT4-fs (sdb1): delayed block allocation failed for inode 163315715 at logical offset 1 with max blocks 2 with error -5

2013-08-28 Thread Martin MOKREJŠ
Hugh,
  it looks like you are the right person to ask, per https://lkml.org/lkml/2012/8/23/9

What can I do now with my system? Could it be that, for an unknown reason
(a misinterpreted ACPI condition?), /dev/sdb was stopped by SandyBridge, and an
hour later, when the shell redirects of valgrind's STDOUT and STDERR to a file
had filled up some kernel buffers (because they could not be written to
/mnt/external), the ext4 driver choked?

# cat /proc/sys/vm/laptop_mode
0
#

I have app-laptop/laptop-mode-tools-1.63-r2 installed on Gentoo Linux.

Thank you,
Martin

Martin MOKREJŠ wrote:
> Hi,
>   I have been running two instances of valgrind on some application on a 3.10.9
> kernel with a patch aiming to fix a BOS descriptor memleak (see linux-usb Subject
> "[RFC v2] usbcore: compare and release one bos descriptor in usb_reset_and_verify_device()",
> but I hope it is unrelated). I enabled some extra kernel sanity checks in the
> Kernel hacking section (I am looking for an answer why something overwrites the
> memory of my python-based application). Hence the valgrind and the attempts to
> fortify the kernel a bit more (see attached diff since last known good .config).
> 
>   Below I show the moment early in the morning when I connected the external
> SATA drive, and that much later the kernel suddenly lost the ability to
> read/write the filesystem. I somewhat suspect that laptop-mode-tools (although
> configured to ignore mouse/keyboard and usb-storage devices) somehow triggered
> it. However, I would still like to see evidence that something happened at the
> SATA level.
> 
> 
> Aug 28 04:41:05 vostro kernel: [  248.268202] ata6: exception Emask 0x10 SAct 
> 0x0 SErr 0x400 action 0xe frozen
> Aug 28 04:41:05 vostro kernel: [  248.268205] ata6: irq_stat 0x0040, 
> connection status changed
> Aug 28 04:41:05 vostro kernel: [  248.268207] ata6: SError: { DevExch }
> Aug 28 04:41:05 vostro kernel: [  248.268212] ata6: hard resetting link
> Aug 28 04:41:06 vostro kernel: [  249.009819] ata6: SATA link up 3.0 Gbps 
> (SStatus 123 SControl 300)
> Aug 28 04:41:06 vostro kernel: [  249.010951] ata6.00: ATA-8: 
> ST3000VX000-1CU166, CV22, max UDMA/133
> Aug 28 04:41:06 vostro kernel: [  249.010963] ata6.00: 5860533168 sectors, 
> multi 0: LBA48 NCQ (depth 31/32), AA
> Aug 28 04:41:06 vostro kernel: [  249.012058] ata6.00: configured for UDMA/133
> Aug 28 04:41:06 vostro kernel: [  249.029823] ata6: EH complete
> Aug 28 04:41:06 vostro kernel: [  249.030376] scsi 5:0:0:0: Direct-Access 
> ATA  ST3000VX000-1CU1 CV22 PQ: 0 ANSI: 5
> Aug 28 04:41:06 vostro kernel: [  249.032304] sd 5:0:0:0: [sdb] 5860533168 
> 512-byte logical blocks: (3.00 TB/2.72 TiB)
> Aug 28 04:41:06 vostro kernel: [  249.032306] sd 5:0:0:0: [sdb] 4096-byte 
> physical blocks
> Aug 28 04:41:06 vostro kernel: [  249.032623] sd 5:0:0:0: [sdb] Write Protect 
> is off
> Aug 28 04:41:06 vostro kernel: [  249.032625] sd 5:0:0:0: [sdb] Mode Sense: 
> 00 3a 00 00
> Aug 28 04:41:06 vostro kernel: [  249.032755] sd 5:0:0:0: [sdb] Write cache: 
> enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 28 04:41:06 vostro kernel: [  249.033850] sd 5:0:0:0: Attached scsi 
> generic sg2 type 0
> Aug 28 04:41:06 vostro kernel: [  249.089816]  sdb: sdb1
> Aug 28 04:41:06 vostro kernel: [  249.091295] sd 5:0:0:0: [sdb] Attached SCSI 
> disk
> Aug 28 04:42:51 vostro kernel: [  354.033592] EXT4-fs (sdb1): mounted 
> filesystem with ordered data mode. Opts: (null)
> Aug 28 09:46:39 vostro kernel: [18604.328975] ata6: exception Emask 0x10 SAct 
> 0x0 SErr 0x409 action 0xe frozen
> Aug 28 09:46:39 vostro kernel: [18604.328985] ata6: irq_stat 0x00400040, 
> connection status changed
> Aug 28 09:46:39 vostro kernel: [18604.328992] ata6: SError: { PHYRdyChg 10B8B 
> DevExch }
> Aug 28 09:46:39 vostro kernel: [18604.329007] ata6: hard resetting link
> Aug 28 09:46:39 vostro logger: ACPI event unhandled: ac_adapter AC 0080 
> 
> Aug 28 09:46:39 vostro kernel: [18605.089011] ata6: SATA link down (SStatus 0 
> SControl 300)
> Aug 28 09:46:40 vostro laptop-mode: Laptop mode
> Aug 28 09:46:40 vostro laptop-mode: enabled, active
> Aug 28 09:46:41 vostro logger: ACPI event unhandled: battery BAT0 0080 
> 0001
> Aug 28 09:46:41 vostro kernel: [18607.033618] ata6: hard resetting link
> Aug 28 09:46:42 vostro kernel: [18607.336433] EXT4-fs (sdb1): re-mounted. 
> Opts: commit=600
> Aug 28 09:46:42 vostro kernel: [18607.381792] ata6: SATA link down (SStatus 0 
> SControl 300)
> Aug 28 09:46:42 vostro kernel: [18607.381802] ata6: limiting SATA link speed 
> to 1.5 Gbps
> Aug 28 09:46:44 vostro logger: Device 2-1.2 is blacklisted, skipping auto 
> suspend.
> Aug 28 09:46:44 vostro logger: Device 2-1.2:1.0 is blacklis

e1000 in 2.6.21.2 and even older, like 2.6.13: eth0 does not exist but eth1 does

2007-05-24 Thread Martin MOKREJŠ

Hi,
 today I had to reinstall a machine because its xfs filesystem was broken:
under heavy load I got a kernel panic complaining that some internal kernel
structures were broken, so the filesystem was unmounted. Sorry, I had no
time to take a snapshot. So I recreated the filesystem and copied the
installation from another cluster node. I edited /etc/conf.d/hostname and
/etc/conf.d/net to set the correct IP address and tried to boot.
  Mysteriously, after booting back up my e1000 NIC was not detected
(ASUS P4C800E-Deluxe motherboard). At first I thought it was a kernel problem
and recompiled the fresh 2.6.21.2, which I had built just before I brought
the template machine down, but even building e1000 as a module did not help.
I tried the previously working 2.6.19.1 kernel binary and still: the driver got
loaded, dmesg(1) showed the card, its IRQ managed through ACPI and its MAC
address, but the Gentoo init.d scripts still complained that no eth0 device
exists. So this cannot be a bug newly introduced in the 2.6.19.1 - 2.6.21.2
range.

I even tried a 2.6.13 kernel image that had been lying on the bootable
filesystem for a year or two; it did not give me eth0 either. I tried to
reload the setup defaults in the BIOS (version 1019), but that did not help
either.
 Let me emphasize that ifconfig(1) showed only the loopback device all the time.
Finally I found that /proc/net/dev has rows for "lo:" and "eth1".
Yes, the network card is recognized as eth1. Blindly running 'ifconfig eth1 $IP'
really succeeds and assigns the given IP address to my card; since then it is
shown by ifconfig(1), and dmesg(1) reported which link speed was negotiated.
Even before that, the link LED on the network switch showed that it had a link
to the network card. I cannot tell the status of the LED on the NIC itself;
the rear side is hard for me to access, sorry.
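
A minimal sketch, assuming Python 3 is available on the box, of how the
interface names the kernel really registered could be listed from
/proc/net/dev instead of relying on what the driver printed to dmesg. The
function name interface_names is made up for this illustration.

```python
def interface_names(proc_net_dev_text):
    """Return the interface names found in the text of /proc/net/dev."""
    names = []
    for line in proc_net_dev_text.splitlines():
        # Data rows look like "  eth1: 1234 ..."; the two header rows
        # contain no colon and are skipped.
        if ":" not in line:
            continue
        name = line.split(":", 1)[0].strip()
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    try:
        with open("/proc/net/dev") as handle:
            print(interface_names(handle.read()))
    except OSError as err:
        print("cannot read /proc/net/dev:", err)
```

On the machine described above this would be expected to list "lo" and
"eth1" but no "eth0".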

I think the wrong device name is sometimes logged to syslog, I do not know
why. It always shows eth0 after inserting the e1000 module. Removing the module
disables the IRQ for the device, and re-inserting the module again claims it is
eth0 ... but it is eth1, as I know now. The network card is integrated on the
motherboard. I have 12 identical boxes and have never seen such a problem.


Intel(R) PRO/1000 Network Driver - version 7.3.20-k2-NAPI
...some license stuff...
ACPI: PCI Interrupt :02:01.0[A] -> GSI 18 (level, low) -> IRQ 18
PCI: Setting latency Timer of device :02.01.0 to 64
e1000: :02:01.0: e1000_probe (PCI:33MHz:32-bit) 00:13:d4:51...
e1000:eth0: e1000_probe: Intel(R) PRO/1000 Network Connection


after removing the kernel module, I get

ACPI:PCI interrupt for device :02:01.0 disabled


after assigning the IP address to eth1 I get:

e1000:eth1: e1000_watchdog: NIC link is up 100Mbps Full Duplex, .


 A further note: in /proc/interrupts I have not seen the promised interrupt
line 18 as used by the e1000 module, although at that very moment it should
have been in use (according to dmesg(1)). In lspci I have not seen anything
strange, and I definitely saw the NIC device.
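
A small sketch of the /proc/interrupts check described above, again assuming
Python 3; the helper name irq_line is hypothetical. It returns the row for a
given numeric IRQ, or None when the kernel lists no such row.

```python
def irq_line(proc_interrupts_text, irq):
    """Return the /proc/interrupts row for a numeric IRQ, or None if absent."""
    wanted = str(irq)
    for line in proc_interrupts_text.splitlines():
        # Rows look like " 18:   678   IO-APIC-fasteoi   eth1".
        label, sep, _rest = line.partition(":")
        if sep and label.strip() == wanted:
            return line
    return None


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as handle:
            print(irq_line(handle.read(), 18))
    except OSError as err:
        print("cannot read /proc/interrupts:", err)
```

Note that an IRQ generally shows up there only while a handler is registered
for it, so the result may depend on whether the interface was up at the time
of the check.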


 The machine is now running on the network via that eth1, and I am ready to do
some more tests. Are there any debug switches for the e1000 module?

 I rebooted the machine maybe 20x, incl. cold starts, so I believe I can still
reproduce it.

Thanks for hints,
martin


2.6.19.1: kernel BUG at mm/slab.c:2911!

2007-01-30 Thread Martin MOKREJŠ
Hi,
  is this a known issue? Is it worth upgrading to 2.6.19.2, i.e. does it
contain the fix? Thank you for any help. It might be related to NFS. The
machine in question is an NFSv3 client (over UDP) and is used for
computations. The process which died is from the torque cluster management
package.
 Please Cc: me in replies. Thanks.
Martin


slab: Internal list corruption detected in cache 'size-32'(84), slabp 
f3caf080(-2). Hexdump:

000: 00 00 00 00 fe ff ff ff fe ff ff ff 3a 00 00 00
010: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
020: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
030: fe ff ff ff 06 00 00 00 fe ff ff ff fe ff ff ff
040: fe ff ff ff fe ff ff ff 19 00 00 00 fe ff ff ff
050: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
060: 1c 00 00 00 3d 00 00 00 32 00 00 00 fe ff ff ff
070: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
080: fe ff ff ff 3b 00 00 00 2b 00 00 00 fe ff ff ff
090: 31 00 00 00 fe ff ff ff fe ff ff ff fe ff ff ff
0a0: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
0b0: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
0c0: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
0d0: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
0e0: fe ff ff ff fe ff ff ff fe ff ff ff 71 f0 2c 5a
0f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
100: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5
110: 71 f0 2c 5a 6d ec 14 c0 71 f0 2c 5a 6b 6b 6b 6b
120: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
130: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 71 f0 2c 5a
140: 6d ec 14 c0 a5 c2 0f 17 70 fc ca f3 fc f2 ca f3
150: 00 00 00 00 00 c0 2d f5 01 00 00 00 ff ff ff ff
160: 00 00 5a 5a fe ff ff ff a5 c2 0f 17
[ cut here ]
kernel BUG at mm/slab.c:2911!
invalid opcode:  [#1]
DEBUG_PAGEALLOC
Modules linked in:
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010092   (2.6.19.1 #2)
EIP is at check_slabp+0xa0/0xb0
eax: 0001   ebx: 016c   ecx:    edx: f470de04
esi: f3caf080   edi: c20ff080   ebp: f5753db4   esp: f5753d94
ds: 007b   es: 007b   ss: 0068
Process pbs_mom (pid: 5741, ti=f5752000 task=f5b86ae0 task.ti=f5752000)
Stack: c041c935 0017 0054 f3caf080 fffe 000b f3caf000 c20fd854 
   f5753dd8 c014fd06 c2101ef8 0004 00d0 c20ff080 c20ff080 0246 
   00d0 f5753df4 c015015e c20fc540 c014f579  f3f32000 c20fc540 
Call Trace:
 [] cache_alloc_refill+0xb1/0x1a7
 [] kmem_cache_alloc+0x49/0x67
 [] alloc_slabmgmt+0x1a/0x47
 [] cache_grow+0xbf/0x12f
 [] cache_alloc_refill+0x16d/0x1a7
 [] kmem_cache_alloc+0x49/0x67
 [] get_empty_filp+0x4e/0xd8
 [] __path_lookup_intent_open+0x17/0x75
 [] path_lookup_open+0x21/0x27
 [] open_namei+0x76/0x4ae
 [] do_filp_open+0x26/0x3b
 [] do_sys_open+0x45/0xcd
 [] sys_open+0x1a/0x1c
 [] sysenter_past_esp+0x56/0x79
 [] 0xb7fd8410
 ===
Code: 3f 3f c0 e8 b2 d2 fc ff 0f b6 04 1e c7 04 24 16 95 41 c0 43 89 44 24 04 
e8 9d d2 fc ff eb c6 c7 04 24 35 c9 41 c0 e8 8f d2 fc ff <0f> 0b 5f 0b 1b 3a 3f 
c0 83 c4 14 5b 5e 5f 5d c3 55 89 e5 57 56 
EIP: [] check_slabp+0xa0/0xb0 SS:ESP 0068:f5753d94
 
$ uname -a
Linux phylo1 2.6.19.1 #2 Wed Dec 20 21:24:45 CET 2006 i686 Intel(R) Pentium(R) 
4 CPU 3.00GHz GNU/Linux
$ 


Regression: spurious 8259A interrupt: IRQ7 appears in 2.6.19-rc6-git10

2006-11-28 Thread Martin MOKREJŠ
Hi,
  I have just tested for fun the upcoming release candidate and have
found the following difference with a 'spurious 8259A interrupt:
IRQ7' message, possibly triggered by the

--- linux-2.6.19-rc5.txt  2006-11-28 19:23:54.145722821 +0100
+++ linux-2.6.19-rc6-git10.txt  2006-11-28 19:14:04.579597162 +0100
@@ -1,4 +1,4 @@
-Linux version 2.6.19-rc5 ([EMAIL PROTECTED]) (gcc version 4.1.1 (Gentoo
4.1.1-r1)) #1 Tue Nov 14 02:54:07 CET 2006
+Linux version 2.6.19-rc6-git10 ([EMAIL PROTECTED]) (gcc version 4.1.1
(Gentoo 4.1.1-r2)) #1 Tue Nov 28 17:56:27 CET 2006
 BIOS-provided physical RAM map:
  BIOS-e820:  - 0009fc00 (usable)
  BIOS-e820: 0009fc00 - 000a (reserved)
@@ -43,7 +43,7 @@
 Enabling fast FPU save and restore... done.
 Enabling unmasked SIMD FPU exception support... done.
 Initializing CPU#0
-CPU 0 irqstacks, hard=c147f000 soft=c147e000
+CPU 0 irqstacks, hard=c148 soft=c147f000
 PID hash table entries: 4096 (order: 12, 16384 bytes)
 Console: colour VGA+ 80x25
 Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo
Molnar
@@ -88,7 +88,8 @@
  hard-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
  soft-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
 hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
-soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
+soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |<7>spurious
8259A interrupt: IRQ7.
+
 hard-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
 soft-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
 hard-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
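
A comparison like the one above can also be scripted. The following is a
rough sketch with Python's difflib, not the command that produced the diff
above; the function name diff_boot_logs and the default file names are
placeholders, and stripping "[  12.345678]" style timestamps is an assumption
that only matters for kernels which print them.

```python
import difflib
import re

# Leading printk timestamp such as "[   22.908990] " (assumed format).
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def diff_boot_logs(old_text, new_text, old_name="old.txt", new_name="new.txt"):
    """Return unified-diff lines between two dmesg dumps, ignoring timestamps."""
    old_lines = [_TIMESTAMP.sub("", line) for line in old_text.splitlines()]
    new_lines = [_TIMESTAMP.sub("", line) for line in new_text.splitlines()]
    return list(
        difflib.unified_diff(old_lines, new_lines, old_name, new_name, lineterm="")
    )


if __name__ == "__main__":
    old = "Initializing CPU#0\nConsole: colour VGA+ 80x25\n"
    new = old + "spurious 8259A interrupt: IRQ7.\n"
    for line in diff_boot_logs(old, new):
        print(line)
```

Lines that differ only in their timestamp drop out of the result, which keeps
the diff focused on messages that are really new.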




Here is the relevant part of full dmesg output from 2.6.19-rc6-git10:

Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:8
... MAX_LOCK_DEPTH:  30
... MAX_LOCKDEP_KEYS:2048
... CLASSHASH_SIZE:   1024
... MAX_LOCKDEP_ENTRIES: 8192
... MAX_LOCKDEP_CHAINS:  8192
... CHAINHASH_SIZE:  4096
 memory used by lock dependency info: 904 kB
 per task-struct memory footprint: 1200 bytes

| Locking API testsuite:

                                 | spin |wlock |rlock |mutex | wsem | rsem |

--
                     A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-B-C-C-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                    double unlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                  initialize held:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 bad unlock order:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |

--
              recursive read-lock:             |  ok  |             |  ok  |
           recursive read-lock #2:             |  ok  |             |  ok  |
            mixed read-write-lock:             |  ok  |             |  ok  |
            mixed write-read-lock:             |  ok  |             |  ok  |

--
 hard-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
 soft-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
 hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
 soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
   sirq-safe-A => hirqs-on/12:  ok  |  ok  |  ok  |
   sirq-safe-A => hirqs-on/21:  ok  |  ok  |  ok  |
 hard-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
 soft-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
 hard-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
 soft-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |<7>spurious
8259A interrupt: IRQ7.

hard-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
soft-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
hard-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
soft-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
...
  hard-irq read-recursion/321:  ok  |
  soft-irq read-recursion/321:  ok  |
---
Good, all 218 testcases passed! |
-


Please Cc: me in replies.
Thanks.
Martin


XFS: possible recursive locking detected in 2.6.18 to 2.6.19-rc6-git10 but not 2.6.17.11

2006-11-28 Thread Martin MOKREJŠ
Hi,
  I have long had a bug report open against XFS at
http://bugzilla.kernel.org/show_bug.cgi?id=7287 and I still see the message
appear in my kernel output during bootup. I guess it comes from one of the
relatively new kernel self-testing features introduced recently. I
just wanted to let you know about it.


=
[ INFO: possible recursive locking detected ]
2.6.19-rc6-git10 #1
-
mount/3439 is trying to acquire lock:
 (&(&ip->i_lock)->mr_lock){}, at: [] xfs_ilock+0x4a/0x68

but task is already holding lock:
 (&(&ip->i_lock)->mr_lock){}, at: [] xfs_ilock+0x4a/0x68

other info that might help us debug this:
2 locks held by mount/3439:
 #0:  (&inode->i_mutex){--..}, at: [] mutex_lock+0x8/0xa
 #1:  (&(&ip->i_lock)->mr_lock){}, at: [] xfs_ilock+0x4a/0x68

stack backtrace:
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x19/0x1b
 [] __lock_acquire+0x106/0x94e
 [] lock_acquire+0x5c/0x79
 [] down_write+0x2b/0x44
 [] xfs_ilock+0x4a/0x68
 [] xfs_iget+0x2a0/0x5de
 [] xfs_trans_iget+0xd6/0x135
 [] xfs_ialloc+0xa7/0x41f
 [] xfs_dir_ialloc+0x6d/0x267
 [] xfs_create+0x2f4/0x5ae
 [] xfs_vn_mknod+0x127/0x242
 [] xfs_vn_create+0x12/0x14
 [] vfs_create+0x6a/0xb4
 [] open_namei+0x179/0x57a
 [] do_filp_open+0x26/0x3b
 [] do_sys_open+0x43/0xc7
 [] sys_open+0x1c/0x1e
 [] sysenter_past_esp+0x56/0x8d
 ===

I can provide more details upon request. Please Cc: me in replies.
Thanks.
Martin


PCI: IRQ 0 for device 0000:00:1f.3 doesn't match PIRQ mask - try pci=usepirqmask

2005-09-01 Thread Martin MOKREJŠ
Hi,
  what does this message really mean? I did what it suggests and the "IRQ 0"
message is then gone. Is this a problem in the kernel, or should I just use
pci=usepirqmask for my hardware when booting with acpi=off? Should I report it
somewhere else? Should I care at all?
I use the 2.6.13 kernel with the pcmcia patch from here:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=81d4af1340badcd2100c84fbd1bfd13156de41aa

--- acpioff.txt 2005-09-01 17:52:45.787890500 +0200
+++ useirqmask.txt  2005-09-01 17:58:21.294486250 +0200
@@ -16,22 +16,23 @@
 DMI 2.3 present.
 Allocating PCI resources starting at 4000 (gap: 4000:bfff)
 Built 1 zonelists
-Kernel command line: udev root=/dev/hda2 idebus=66 console=ttyS0,57600n8 console=tty0 pcmcia_core.pc_debug=9 pcmcia.pc_debug=9 sa11xx_core.pc_debug=9 acpi=off
+Kernel command line: udev root=/dev/hda2 idebus=66 console=ttyS0,57600n8 console=tty0 pcmcia_core.pc_debug=9 pcmcia.pc_debug=9 sa11xx_core.pc_debug=9 acpi=off pci=usepirqmask
 ide_setup: idebus=66
 Unknown boot option `sa11xx_core.pc_debug=9': ignoring
 Local APIC disabled by BIOS -- you can enable it with "lapic"
 mapped APIC to d000 (01803000)
 Initializing CPU#0
 CPU 0 irqstacks, hard=c05c1000 soft=c05c
 PID hash table entries: 4096 (order: 12, 65536 bytes)
-Detected 1800.362 MHz processor.
+Detected 1800.261 MHz processor.
 Using tsc for high-res timesource
 Console: colour VGA+ 80x25
 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
 Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
 Memory: 1033480k/1048548k available (3061k kernel code, 14316k reserved, 1585k data, 192k init, 131044k highmem)
 Checking if this processor honours the WP bit even in supervisor mode... Ok.
-Calibrating delay using timer specific routine.. 3606.04 BogoMIPS (lpj=7212086)
+Calibrating delay using timer specific routine.. 3606.00 BogoMIPS (lpj=7212016)
 Mount-cache hash table entries: 512
 CPU: After generic identify, caps: bfebf9ff    
  
 CPU: After vendor identify, caps: bfebf9ff     
 
@@ -175,7 +176,6 @@
 PCI: Scanning behind PCI bridge :00:1e.0, config 0a0200, pass 1
 PCI: Bus scan for :00 returning with max=0a
 PCI: Using IRQ router PIIX/ICH [8086/248c] at :00:1f.0
-PCI: IRQ 0 for device :00:1f.3 doesn't match PIRQ mask - try pci=usepirqmask
 PCI: Found IRQ 11 for device :00:1f.3
 PCI: Sharing IRQ 11 with :00:1f.5
 PCI: Sharing IRQ 11 with :00:1f.6


The HW is an ASUS L3C/S laptop. More details on the HW are in
http://bugzilla.kernel.org/show_bug.cgi?id=4889
Please cc: me in replies.
Martin


overlapping MTRR regions

2005-08-24 Thread Martin MOKREJŠ
Hi,
  I tested 2.6.13-rc7 on a nice server motherboard with 16GB of RAM ;)
http://www.msicomputer.com/product/p_spec.asp?model=E7520_Master-S2M&class=spd
and I see the following when ACPI is enabled (I haven't even tried
without it):

# cat /proc/mtrr 
reg00: base=0xd000 (3328MB), size= 256MB: uncachable, count=1
reg01: base=0xe000 (3584MB), size= 512MB: uncachable, count=1
reg02: base=0x (   0MB), size=16384MB: write-back, count=1
reg03: base=0x4 (16384MB), size= 512MB: write-back, count=1
reg04: base=0x42000 (16896MB), size= 256MB: write-back, count=1
reg05: base=0xcff8 (3327MB), size= 512KB: uncachable, count=1
#

Is that correct? Please cc: me in replies.
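
One way to sanity-check a listing like this is to test the ranges pairwise
for overlap. The sketch below is only an illustration: it uses the rounded MB
figures printed in parentheses, because the hex bases in the listing are
incomplete, so reg05 is placed at 3327 MB as printed. The names REGIONS,
overlaps and overlapping_pairs are made up for the example.

```python
from itertools import combinations

# (name, base in MB, size in MB, type), copied from the MB figures above.
REGIONS = [
    ("reg00", 3328, 256, "uncachable"),
    ("reg01", 3584, 512, "uncachable"),
    ("reg02", 0, 16384, "write-back"),
    ("reg03", 16384, 512, "write-back"),
    ("reg04", 16896, 256, "write-back"),
    ("reg05", 3327, 0.5, "uncachable"),  # base rounded as printed; size is 512KB
]


def overlaps(a, b):
    """True when the half-open ranges [base, base + size) of a and b intersect."""
    a_start, a_end = a[1], a[1] + a[2]
    b_start, b_end = b[1], b[1] + b[2]
    return a_start < b_end and b_start < a_end


def overlapping_pairs(regions):
    """Return the names of every pair of regions that overlap."""
    return [(a[0], b[0]) for a, b in combinations(regions, 2) if overlaps(a, b)]


if __name__ == "__main__":
    for first, second in overlapping_pairs(REGIONS):
        print(first, "overlaps", second)
```

With these figures only reg02 overlaps anything, namely the three uncachable
ranges. As far as Intel's documentation goes, the uncachable type takes
precedence where variable MTRRs overlap, so uncachable holes carved out of one
large write-back range are not necessarily an error by themselves.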

phylo ~ # lspci
:00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev 0c)
:00:00.1 Class ff00: Intel Corporation E7525/E7520 Error Reporting 
Registers (rev 0c)
:00:01.0 System peripheral: Intel Corporation E7520 DMA Controller (rev 0c)
:00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A 
(rev 0c)
:00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B (rev 
0c)
:00:05.0 PCI bridge: Intel Corporation E7520 PCI Express Port B1 (rev 0c)
:00:06.0 PCI bridge: Intel Corporation E7520 PCI Express Port C (rev 0c)
:00:07.0 PCI bridge: Intel Corporation E7520 PCI Express Port C1 (rev 0c)
:00:1c.0 PCI bridge: Intel Corporation 6300ESB 64-bit PCI-X Bridge (rev 02)
:00:1d.0 USB Controller: Intel Corporation 6300ESB USB Universal Host 
Controller (rev 02)
:00:1d.1 USB Controller: Intel Corporation 6300ESB USB Universal Host 
Controller (rev 02)
:00:1d.4 System peripheral: Intel Corporation 6300ESB Watchdog Timer (rev 
02)
:00:1d.5 PIC: Intel Corporation 6300ESB I/O Advanced Programmable Interrupt 
Controller (rev 02)
:00:1d.7 USB Controller: Intel Corporation 6300ESB USB2 Enhanced Host 
Controller (rev 02)
:00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 0a)
:00:1f.0 ISA bridge: Intel Corporation 6300ESB LPC Interface Controller 
(rev 02)
:00:1f.1 IDE interface: Intel Corporation 6300ESB PATA Storage Controller 
(rev 02)
:00:1f.2 IDE interface: Intel Corporation 6300ESB SATA Storage Controller 
(rev 02)
:00:1f.3 SMBus: Intel Corporation 6300ESB SMBus Controller (rev 02)
:01:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A 
(rev 09)
:01:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B 
(rev 09)
:08:01.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY 
[Radeon 7000/VE]
:08:02.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet 
Controller
:08:03.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet 
Controller
:09:01.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet 
Controller (rev 05)
phylo ~ # lspci -v
:00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev 0c)
Subsystem: Intel Corporation E7520 Memory Controller Hub
Flags: bus master, fast devsel, latency 0
Capabilities: [40] #09 [4105]

:00:00.1 Class ff00: Intel Corporation E7525/E7520 Error Reporting 
Registers (rev 0c)
Subsystem: Micro-Star International Co., Ltd.: Unknown device 3590
Flags: fast devsel

:00:01.0 System peripheral: Intel Corporation E7520 DMA Controller (rev 0c)
Subsystem: Micro-Star International Co., Ltd.: Unknown device 3594
Flags: fast devsel, IRQ 10
Memory at d000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [b0] Message Signalled Interrupts: 64bit- Queue=0/1 
Enable-

:00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A 
(rev 0c) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=01, subordinate=03, sec-latency=0
Capabilities: [50] Power Management version 2
Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 
Enable-
Capabilities: [64] #10 [0041]

:00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B (rev 
0c) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
Capabilities: [50] Power Management version 2
Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 
Enable-
Capabilities: [64] #10 [0041]

:00:05.0 PCI bridge: Intel Corporation E7520 PCI Express Port B1 (rev 0c) 
(prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
Capabilities: [50] Power Management version 2
Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 
Enable-
Capabilities: [64] #10 [0041]

:00:06.0 PCI bridge: Intel Corporation E7520 PCI Express Port C (rev 0c) 
(prog-if 00 [Normal decode])
 

Re: openafs is really faster on linux-2.4. than 2.6

2005-08-18 Thread Martin MOKREJŠ
Hi Con,
  thank you for the answers. It seems my main confusion was that the values in
the 'id' and 'wa' columns of the vmstat(1) output do not reflect the 2.4
kernel stats well. The timings show that the real time is more or less the
same, and we could only argue about whether the sys time is significantly
higher on the 2.6 kernel or not. So there is no problem with the kernel. ;)


on 2.4.31 (read from local xfs disk to another disk with ext2 fs):

procs ------------memory----------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 5  0      0 252932   4172 2637980    0    0 57856 47500 1548  1884  2 27 71  0
 1  1      0 136192   4172 2754716    0    0 58368 52804 1599  1885  2 24 74  0
 2  1      0  85588   4172 2805052    0    0 61696 51600 1636  1965  5 40 55  0
 1  1      0  85376   4172 2804764    0    0 59520 53828 1623  1848  3 45 52  0
 3  1      0  85772   4172 2805036    0    0 60288 50112 1609  1875  1 33 66  0
 1  1      0  85784   4172 2804364    0    0 58240 50300 1567  1835  2 43 55  0
 2  1      0  85468   4172 2805116    0    0 61824 47516 1666  2149  4 47 50  0
 1  1      0  85956   4172 2808144    0    0 58496 49708 1570  2087  3 33 64  0
 2  1      0  84792   4172 2809344    0    0 58112 51992 1585  2058  3 27 70  0
 2  1      0  86176   4172 2807836    0    0 58240 52836 1599  2029  1 29 70  0
 2  1      0  86164   4172 2803968    0    0 62080 53500 1664  2085  2 40 58  0
 2  1      0  85152   4172 2804940    0    0 57600 51320 1572  1837  3 37 60  0
 1  1      0  85840   4172 2804592    0    0 58752 52500 1603  1917  2 48 50  0
 1  1      0  85760   4128 2804836    0    0 58368 51100 1577  1853  3 35 62  0
 0  2      0  85456   4128 2804532    0    0  3712 45740  675   289  0  6 94  0
 1  1      0  84796   4128 2805896    0    0 30592 47528 1113  1047  0 27 73  0
 1  1      0  85708   4132 2808376    0    0 52356 43612 1422  1755  0 26 74  0
 1  1      0  85852   4132 2808212    0    0 58368 46252 1528  1880  2 35 63  0
 1  1      0  86120   4136 2807972    0    0 50052 46212 1411  1740  3 30 67  0
 1  1      0  82728   4136 2808276    0    0 57472 50500 1560  1866  4 29 67  0
 3  1      0  83316   4140 2807428    0    0 56324 44780 1490  1875  2 39 59  0
 2  1      0  82332   4140 2807960    0    0 59648 48724 1578  1905  0 34 66  0


time bash -c "dd if=~mmokrejs/video/kamcatka2004.dv 
of=/mnt/ext2/kamcatka2004.dv bs=1024 count=4096000; sync"
4096000+0 records in
4096000+0 records out

real    1m29.880s
user    0m1.600s
sys     0m12.220s

or in another attempt:

real    1m33.282s
user    0m1.430s
sys     0m12.180s


on 2.6.13-rc6-git9 (read from local xfs disk to another disk with ext2 fs):

procs ------------memory----------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 1 10      0  85452   7628 2754552    0    0 57984 52188  876  1647  2 36  0 62
 0 10      0  85452   7380 2756932    0    0 59520 46260  862  1620  1 41  0 58
 1  9      0  85080   7424 2756564    0    0 53636 46188  838  1515  1 34  0 65
 1  9      0  85328   7476 2755472    0    0 59400 51077  897  1737  1 39  0 60
 1 10      0  85204   7528 2754940    0    0 53508 51456  844  1513  2 33  0 65
 0 10      0  85452   7580 2753804    0    0 58368 50956  879  1682  2 33  0 65
 0 10      0  85204   7636 2753180    0    0 56196 51512  869  1597  2 33  0 65
 0 10      0  85204   7688 2752176    0    0 61696 45856  906  1728  1 36  0 63
 1  9    196  85080   7740 2753368    0    0 54276 51456  848  1572  1 40  0 59
 0 10    196  85328   7784 2752968    0    0 59008 30752  888  1658  1 40  0 59
 0 10    196  85328   7820 2753376    0    0 58368 23520  883  1659  1 33  0 66
 1 10    196  85452   7856 2753016    0    0 58756 48164  922  1646  1 37  0 62
 0 10    196  85328   7252 2754324    0    0 57600 49172  878  1609  1 37  0 62
 1  9    196  85204   7296 2754564    0    0 56708 43176  853  1598  2 36  0 62
 1 10    196  85204   7324 2754496    0    0 59392 47600  882  1674  0 41  0 59
 0 10    196  85328   7288 2754280    0    0 54660 46200  831  1531  1 34  0 65
 1  9    196  85452   7224 2753856    0    0 59008 45808  876  1657  2 37  0 61
 0 10    196  85080   7164 2753880    0    0 57476 46144  865  1619  2 34  0 64
 0 10    196  85204   7116 2753480    0    0 56968 46209  860  1658  1 32  0 67

# time bash -c "dd if=~mmokrejs/video/kamcatka2004.dv 
of=/mnt/ext2/kamcatka2004.dv bs=1024 count=4096000; sync"
4096000+0 records in
4096000+0 records out

real    1m31.067s
user    0m0.760s
sys     0m15.809s

and another trial:

real    1m31.266s
user    0m0.796s
sys     0m16.361s
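
To put the four runs side by side, the timings can be turned into effective
throughput. This is just arithmetic on the numbers quoted above (bs=1024,
count=4096000, and the real/sys values from time), sketched in Python 3; the
helper names seconds and throughput_mib_s are made up for the example.

```python
import re

# dd was run with bs=1024 count=4096000, i.e. exactly 4000 MiB per run.
TOTAL_BYTES = 1024 * 4096000

# (kernel, real, sys) as reported by `time` for each run above.
RUNS = [
    ("2.4.31", "1m29.880s", "0m12.220s"),
    ("2.4.31", "1m33.282s", "0m12.180s"),
    ("2.6.13-rc6-git9", "1m31.067s", "0m15.809s"),
    ("2.6.13-rc6-git9", "1m31.266s", "0m16.361s"),
]


def seconds(stamp):
    """Convert a bash `time` value such as '1m29.880s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError("unexpected time format: %r" % stamp)
    return int(match.group(1)) * 60 + float(match.group(2))


def throughput_mib_s(real):
    """Effective copy rate in MiB/s for one run, including the final sync."""
    return TOTAL_BYTES / (1024 * 1024) / seconds(real)


if __name__ == "__main__":
    for kernel, real, sys_time in RUNS:
        share = 100.0 * seconds(sys_time) / seconds(real)
        print("%-16s %6.1f MiB/s  sys/real %5.1f%%"
              % (kernel, throughput_mib_s(real), share))
```

Because sync is part of the timed command, the rate covers the time until the
data has been flushed, not only the time dd itself was running.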


>>The throughput is clearly lower on the 2.6 kernel, and in my eyes the
>>CPU is definitely blocked unnecessarily... Why is the CPU in the
>>wait state instead of idle (this is the problem on the 2.6 series,
>>but the CPU is free on the 2.4 series)? That's the main problem, I think,
>>at the moment.
> 
> 
> There is no wait state accounted for in 

Re: openafs is really faster on linux-2.4. than 2.6

2005-08-18 Thread Martin MOKREJŠ
But that is very short and does not affect the interpretation here.
The throughput is clearly lower on the 2.6 kernel, and in my eyes the
CPU is definitely blocked unnecessarily... Why is the CPU in the
wait state instead of idle (this is the problem on the 2.6 series,
but the CPU is free on the 2.4 series)? That's the main problem, I think,
at the moment.
M.

Con Kolivas wrote:
> On Thu, 18 Aug 2005 22:48, Martin MOKREJŠ wrote:
> 
>>I think the problem here is outside afs.
>>Just doing this dd test but writing data directly to the ext2
>>target gives same behaviour, i.e. on 2.4 kernel I see most of the
>>CPU idle but on 2.6 kernel all that CPU amount is shown as in
>>wait state. And the numbers from 2.4 kernel show higher throughput
>>compared to the 2.6 kernel (regardless of whether PREEMPT or no PREEMPT
>>was used).
> 
> 
> Don't forget to include sync time.


Re: openafs is really faster on linux-2.4. than 2.6

2005-08-18 Thread Martin MOKREJŠ
I think the problem here is outside afs.
Just doing this dd test but writing data directly to the ext2
target gives same behaviour, i.e. on 2.4 kernel I see most of the
CPU idle but on 2.6 kernel all that CPU amount is shown as in
wait state. And the numbers from 2.4 kernel show higher throughput
compared to the 2.6 kernel (regardless of whether PREEMPT or no PREEMPT
was used).
M.


2.6.13-rc6 ntfs oops

2005-08-15 Thread Martin MOKREJŠ
Hi,
  I was just copying some data from an ntfs partition to xfs and I got the
following. Does someone need more info? Briefly: no SMP but HIGHMEM 4GB, an
i686 P4 machine, 32bit.
Martin

NTFS driver 2.1.23 [Flags: R/W MODULE].
NTFS volume version 3.1.
Unable to handle kernel NULL pointer dereference at virtual address 0004
 printing eip:
c02da923
*pde = 
Oops:  [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in: ntfs usb_storage radeon drm reiserfs snd_rtctimer
snd_seq_virmidi snd_seq_midi snd_rawmidi snd_intel8x0 snd_ac97_codec
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss
snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd uhci_hcd ohci_hcd
ehci_hcd intel_agp agpgart
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010202   (2.6.13-rc6) 
EIP is at generic_make_request+0x17/0x1e7
eax:    ebx:    ecx: 0001   edx: 0001
esi: d0a399e0   edi: 0001   ebp: dd145cd0   esp: dd145c50
ds: 007b   es: 007b   ss: 0068
Process pdflush (pid: 3995, threadinfo=dd144000 task=d55e2b10)
Stack: dd145c6c c0143f05 e0974a6c c290b880  c01439d7 c290b880 dd145c94 
   c014588f f7735254 f7754ef8 0046 f7735000 00011200 c290b880 00011200 
   0246 dd145cac c0145c5d c013fc9a 00011200 00011210 c29a90cc dd145cb4 
Call Trace:
 [] show_stack+0x7a/0x90
 [] show_registers+0x156/0x1ce
 [] die+0xf4/0x17e
 [] do_page_fault+0x43a/0x619
 [] error_code+0x4f/0x54
 [] submit_bio+0x55/0xd5
 [] submit_bh+0xcf/0x118
 [] write_mft_record_nolock+0x274/0x616 [ntfs]
 [] ntfs_write_inode+0x235/0x43f [ntfs]
 [] write_inode+0x49/0x4b
 [] __sync_single_inode+0xfb/0x1ff
 [] __writeback_single_inode+0x30/0x149
 [] sync_sb_inodes+0x15c/0x2a1
 [] writeback_inodes+0xec/0x11c
 [] wb_kupdate+0xb4/0x123
 [] __pdflush+0xd6/0x1e1
 [] pdflush+0x1e/0x20
 [] kthread+0x8a/0xb7
 [] kernel_thread_helper+0x5/0xb
Code: 00 00 00 0f ab 47 0c 8b 5d f4 8b 75 f8 8b 7d fc 89 ec 5d c3 55 89 e5 57 
56 89 c6 53 83 ec 74 8b 48 1c 8b 40 08 89 45 90 
c1 e9 09 <8b> 40 04 8b 50 40 8b 40 3c 0f ac d0 09 85 c0 74 54 39 c1 8b 16 
 

-- 
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs


Re: Giving developers clue how many testers verified certain kernel version

2005-07-24 Thread Martin MOKREJŠ

Hi Adrian,
 I think you don't understand me. I do report bugs and always will.
The point was that developers could be "assured" there is probably
no problem when people do NOT report bugs in a piece of code,
because they would know that it _was_ tested by 1000 people on 357 different
hardware setups. And they could even check the .configs, lshw output etc.
Sure, people would report a problem, but if you do NOT hear of one then
either there is no problem, or nobody cared to report it, or nobody tested.
So you know just nothing and you had better wait some days or weeks, during
which the patch gets lost in the lkml archives if it does not happen to get
into -ac or -mm.

 And that is exactly why I proposed this. Then you will know that 1000
people really cared and used that and most probably then it is reasonable
to expect there is really no bug in the code.

 Take it the other way around. You may be reluctant to commit some
patch to the official tree. ;) The guy who wrote the patch says "It was tested,
please apply". ;-) If he says the patch has been lying in the -mm or -ac tree
for a while, like 2 months, you might be more inclined to commit it, right?
If you knew the patch was tested between -git5 and -git6 by 1000 people
within 5 days, you wouldn't wait either, right?
Martin

Adrian Bunk wrote:

On Sun, Jul 24, 2005 at 08:45:16PM +0200, Martin MOKREJŠ wrote:

well, the idea was to give you a clue how many people did NOT complain
because it either worked or they did not realize/care. The goal
was different. For example, I have 2 computers and both need current acpi
patch to work fine. I went to bugzilla and found nobody has filed such bugs
before - so I did and said it is already fixed in current acpi patch.
But you'd never know that I tested that successfully. And I don't believe
you want to get emails on lkml saying that I installed a patch and it did
not break anything. I hope you get the idea now. ;)



in your ACPI example there is a bug/problem (ACPI needs updating).

And ACPI is a good example where even 1000 success reports wouldn't help 
because a slightly different hardware or BIOS version might make the 
difference.


Usually "no bug report" indicates that something is OK.
And if you are unsure whether an unusual setup or hardware is actually 
tested, it's usually the best to ask on linux-kernel whether someone 
could test it.



Re: Giving developers clue how many testers verified certain kernel version

2005-07-24 Thread Martin MOKREJŠ

Hi Adrian,
 well, the idea was to give you a clue how many people did NOT complain
because it either worked or they did not realize/care. The goal
was different. For example, I have 2 computers and both need current acpi
patch to work fine. I went to bugzilla and found nobody has filed such bugs
before - so I did and said it is already fixed in current acpi patch.
But you'd never know that I tested that successfully. And I don't believe
you want to get emails on lkml saying that I installed a patch and it did
not break anything. I hope you get the idea now. ;)
Martin

Adrian Bunk wrote:

On Fri, Jul 22, 2005 at 03:34:09AM +0200, Martin MOKREJŠ wrote:



Hi,



Hi Martin,



I think the discussion going on here in another thread, about the lack
of positive information on how many testers successfully tested a certain
kernel version, can easily be solved with a real solution.

How about opening a separate "project" in bugzilla.kernel.org named
kernel-testers or whatever, where, whenever the cvs/svn/bk gatekeepers
release some kernel patch, they would open an empty "bugreport"
for that version, say for 2.6.13-rc3-git4.

Anybody willing to join the crew who cared to download the patch
and tested the kernel would post just a single comment/follow-up
to _that_ "bugreport" with either a "positive" rating or the URL
of his own bugreport with some new bug. When that bug gets closed,
it would be immediately obvious in the 2.6.13-rc3-git4 bug ticket,
as the bug would be struck through as closed.

Then we could easily just browse through and see that 2.6.13-rc2
was tested by 33 fellows, while 3 of them found a problem and 2 such
problems were closed since then.
...



most likely, only a small minority of the people downloading a patch would 
register at such a "project".


The important part of the work, the bug reports, can already today go to 
linux-kernel and/or the Bugzilla.


You'd spend efforts for such a "project" that would only produce some 
numbers of questionable value.




Martin



cu
Adrian



--
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs


Re: Giving developers clue how many testers verified certain kernel version

2005-07-21 Thread Martin MOKREJŠ

Hi,

Mark Nipper wrote:

I have a different idea along these lines but not using
bugzilla.  A nice system for tracking usage of certain components
might be made by having people register using a certain e-mail
address and then submitting their .config as they try out new
versions of kernels.


Nice idea, but I still think it is of interest what hardware
it was tested on. Maybe the 'dmesg' output would also help a bit, but
I still don't know how you'd find out that I have _this_ motherboard
instead of another.

Second, I'd sometimes submit 2 or even 3 tested hosts. But I am
willing to use only a single email, though. ;)

I think we'd need some sort of profile, the profile would contain
some HW info, like motherboard type, bios version etc. To extract
that from 'dmesg' would be a nightmare I think.

...


Just an idea.  It might require some minimum
recommendations to users willing to participate.  I know for
example that I statically compile all four I/O schedulers in all


Well, my case too. ;)

Martin


Giving developers clue how many testers verified certain kernel version

2005-07-21 Thread Martin MOKREJŠ

Hi,
 I think the discussion going on here in another thread, about the lack
of positive information on how many testers successfully tested a certain
kernel version, can easily be solved with a real solution.

 How about opening a separate "project" in bugzilla.kernel.org named
kernel-testers or whatever, where, whenever the cvs/svn/bk gatekeepers
release some kernel patch, they would open an empty "bugreport"
for that version, say for 2.6.13-rc3-git4.

 Anybody willing to join the crew who cared to download the patch
and tested the kernel would post just a single comment/follow-up
to _that_ "bugreport" with either a "positive" rating or the URL
of his own bugreport with some new bug. When that bug gets closed,
it would be immediately obvious in the 2.6.13-rc3-git4 bug ticket,
as the bug would be struck through as closed.

 Then we could easily just browse through and see that 2.6.13-rc2
was tested by 33 fellows, while 3 of them found a problem and 2 such
problems were closed since then.

 I know what would be really helpful: if the testers would report,
let's say, motherboard type, HIGHMEM/NO-HIGHMEM, ACPI/NO-ACPI,
SMP/NO-SMP and a few more hints, and if the database would keep those
having the same hardware + config as a single record. It could even just
watch a few lines in the .config file when uploaded.

 Well, I'm sure you got my point; maybe it would be easier to write
some tiny database from scratch instead of tweaking bugzilla to suit
this kind of solution.
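
 The "single record per hardware + config" idea could be sketched roughly
as follows in Python. Everything here is hypothetical: the TesterLog class,
the list of watched options and the report fields are assumptions made for
illustration, not an existing bugzilla feature.

```python
import re
from collections import defaultdict

# Hypothetical choice of .config lines worth watching.
WATCHED = ("CONFIG_SMP", "CONFIG_HIGHMEM", "CONFIG_ACPI", "CONFIG_PREEMPT")


def watched_options(config_text, watched=WATCHED):
    """Extract the watched options from the text of a .config file."""
    found = {}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if match and match.group(1) in watched:
            found[match.group(1)] = match.group(2)
            continue
        match = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if match and match.group(1) in watched:
            found[match.group(1)] = "n"
    # Options missing from the file are treated as disabled.
    return tuple((name, found.get(name, "n")) for name in watched)


class TesterLog:
    """Keeps one record per (kernel version, motherboard, watched options)."""

    def __init__(self):
        self.records = defaultdict(lambda: {"ok": 0, "bugs": []})

    def report(self, kernel, motherboard, config_text, bug_url=None):
        key = (kernel, motherboard, watched_options(config_text))
        record = self.records[key]
        if bug_url is None:
            record["ok"] += 1          # a "positive" rating
        else:
            record["bugs"].append(bug_url)

    def summary(self, kernel):
        """Return (distinct setups, positive reports, bug reports)."""
        setups = ok = bugs = 0
        for (version, _board, _opts), record in self.records.items():
            if version == kernel:
                setups += 1
                ok += record["ok"]
                bugs += len(record["bugs"])
        return setups, ok, bugs
```

 Two testers with the same motherboard and the same watched options end up
in one record, so summary() can show at a glance how many distinct setups
stand behind the positive reports.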
;-)
Martin


Oops: 2.6.13-rc2 at vma_prio_tree_remove

2005-07-13 Thread Martin MOKREJŠ

Hi,
 has anybody seen this? I have 2.6.13-rc2 kernel on intel P4 box.

Unable to handle kernel paging request at virtual address 00040034
printing eip:
c014937e
*pde = 
Oops:  [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in: radeon drm parport_pc lp parport snd_rtctimer 
snd_seq_virmidi snd_seq_midi snd_rawmidi snd_intel8x0 snd_ac97_codec 
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm 
snd_timer snd_page_alloc snd_mixer_oss snd uhci_hcd ohci_hcd ehci_hcd intel_agp 
agpgart
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010206   (2.6.13-rc2) 
EIP is at vma_prio_tree_remove+0x47/0x100

eax: dc7561fc   ebx: dc7561fc   ecx: f6b3df74   edx: f6b3df88
esi: 0004   edi: eb1b4f70   ebp: e7b1defc   esp: e7b1deec
ds: 007b   es: 007b   ss: 0068
Process make (pid: 11893, threadinfo=e7b1c000 task=f6643b10)
Stack: f6b3df88 dc7561fc dc7561fc eb1b4f70 e7b1df08 c014dcc2 f6b3df74 e7b1df1c 
  c014dd11 f1d74db4 dc7561fc dab68dd0 e7b1df48 c014fc6d   
  e7b1df38  c05a6618 0049 dab68dd0 dab68e0c f6643b10 e7b1df58 
Call Trace:

[] show_stack+0x7a/0x90
[] show_registers+0x156/0x1ce
[] die+0xf4/0x17e
[] do_page_fault+0x434/0x613
[] error_code+0x4f/0x54
[] __remove_shared_vm_struct+0x39/0x53
[] remove_vm_struct+0x35/0x95
[] exit_mmap+0x125/0x156
[] mmput+0x35/0xa0
[] exit_mm+0x88/0x109
[] do_exit+0xcd/0x42e
[] do_group_exit+0x3a/0xbe
[] sys_exit_group+0xf/0x11
[] sysenter_past_esp+0x54/0x75
Code: 48 30 85 c9 0f 85 8e 00 00 00 8d 40 28 8b 53 28 8b 48 04 89 4a 04 89 11 89 40 04 89 43 28 8b 5d f4 8b 75 f8 8b 7d fc 89 ec 5d c3 <39> 46 34 0f 85 a3 00 00 00 8b 53 30 85 d2 74 2f 8d 4e 28 8b 56 
<1>Fixing recursive fault but reboot is needed!

scheduling while atomic: make/0x0001/11893
[] dump_stack+0x17/0x19
[] schedule+0x58d/0x653
[] do_exit+0x3ae/0x42e
[] do_trap+0x0/0xb8
[] do_page_fault+0x434/0x613
[] error_code+0x4f/0x54
[] __remove_shared_vm_struct+0x39/0x53
[] remove_vm_struct+0x35/0x95
[] exit_mmap+0x125/0x156
[] mmput+0x35/0xa0
[] exit_mm+0x88/0x109
[] do_exit+0xcd/0x42e
[] do_group_exit+0x3a/0xbe
[] sys_exit_group+0xf/0x11
[] sysenter_past_esp+0x54/0x75


It happened to me just when compiling 2.6.13-rc3 kernel:

...
 LD [M]  drivers/char/drm/radeon.o
 LD [M]  drivers/char/drm/mga.o
 CC [M]  drivers/input/misc/uinput.o
Ooops!


Please cc: me in replies. Thanks.


Re: Two 2.6.13-rc1 kernel crashes

2005-07-05 Thread Martin MOKREJŠ

Hi,
 it seems it has helped. ;)
Thanks!

Jens Axboe wrote:

On Mon, Jul 04 2005, Martin Mokrejs wrote:


Hi,
 I use Gentoo linux with an XFS filesystem on the i686 architecture.
Recently it happened to me 3 times that the machine locked up,
although at least once sys-rq+b worked. Here is the log
from the remote console. I don't remember having such problems
with 2.6.12-rc6-git2, which was my previous testing kernel.
The problems appear under heavy load when I compile/install
some packages and maybe it's just a bad coincidence or not,
when I move my usb mouse in fvwm2 environment. The machine
locks.



You need this fix from Hugh.

--- 2.6.13-rc1/drivers/block/ll_rw_blk.c 2005-06-29 11:54:08.0 +0100
+++ linux/drivers/block/ll_rw_blk.c 2005-06-29 14:41:04.0 +0100
@@ -1917,10 +1917,9 @@ get_rq:
          * limit of requests, otherwise we could have thousands of requests
          * allocated with any setting of ->nr_requests
          */
-        if (rl->count[rw] >= (3 * q->nr_requests / 2)) {
-                spin_unlock_irq(q->queue_lock);
+        if (rl->count[rw] >= (3 * q->nr_requests / 2))
                 goto out;
-        }
+




Re: Two 2.6.13-rc1 kernel crashes

2005-07-04 Thread Martin MOKREJŠ

# grep CONFIG_4KSTACKS .config
# CONFIG_4KSTACKS is not set
#

.config is attached.
Thanks.
BTW: The .config should be almost the same as for the previous kernel.
I usually copy the old one into the new source tree and run "make oldconfig".

Zwane Mwaikambo wrote:

On Mon, 4 Jul 2005, Martin Mokrejs wrote:



Hi,
 I use Gentoo linux with an XFS filesystem on the i686 architecture.
Recently it happened to me 3 times that the machine locked up,
although at least once sys-rq+b worked. Here is the log
from the remote console. I don't remember having such problems
with 2.6.12-rc6-git2, which was my previous testing kernel.
The problems appear under heavy load when I compile/install
some packages and maybe it's just a bad coincidence or not,
when I move my usb mouse in fvwm2 environment. The machine
locks.
Any clues? Please Cc: me in replies.



Could you send your .config, and also test without CONFIG_4KSTACKS (if 
enabled)?


Thanks,
Zwane





--
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.13-rc1
# Mon Jul  4 01:05:29 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_SMP is not set
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y

#
# Firmware Drivers
#
CONFIG_EDD=y
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_HIGHPTE=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_HAVE_DEC_LOCK=y
CONFIG_REGPARM=y
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_PHYSICAL_START=0x10
# CONFIG_KEXEC is not set

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_AC=y
# CONFIG_ACPI_BATTERY is not set
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG

find: /usr/src/linux-2.4.30/include/asm: Too many levels of symbolic links

2005-04-07 Thread Martin MOKREJŠ
Hi,
 again I've hit some weird problem doing "make dep" for a 2.4 kernel:
I've extracted the linux-2.4.30.tar.gz file, copied .config from the
previous src-tree, ran `make oldconfig', did `make menuconfig',
and finally `make dep':
[cut]
make[2]: Leaving directory `/usr/src/linux-2.4.30/arch/i386/lib'
make[1]: Leaving directory `/usr/src/linux-2.4.30'
make update-modverfile
make[1]: Entering directory `/usr/src/linux-2.4.30'
/usr/src/linux-2.4.30/include/linux/modversions.h was not updated
make[1]: Leaving directory `/usr/src/linux-2.4.30'
scripts/mkdep -- `find /usr/src/linux-2.4.30/include/asm /usr/src/linux-2.4.30/include/linux /usr/src/linux-2.4.30/include/scsi /usr/src/linux-2.4.30/include/net /usr/src/linux-2.4.30/include/math-emu \( -name SCCS -o -name .svn \) -prune -o -follow -name \*.h ! -name modversions.h -print` > .hdepend
find: /usr/src/linux-2.4.30/include/asm: Too many levels of symbolic links
scripts/mkdep -- init/*.c > .depend
# 

Executing `find /usr/src/linux-2.4.30/include/asm 
/usr/src/linux-2.4.30/include/linux /usr/src/linux-2.4.30/include/scsi 
/usr/src/linux-2.4.30/include/net /usr/src/linux-2.4.30/include/math-emu \( 
-name SCCS -o -name .svn \) -prune -o -follow -name \*.h ! -name modversions.h 
-print` works just fine.
aquarius linux-2.4.30 # scripts/mkdep -- `find /usr/src/linux-2.4.30/include/asm /usr/src/linux-2.4.30/include/linux /usr/src/linux-2.4.30/include/scsi /usr/src/linux-2.4.30/include/net /usr/src/linux-2.4.30/include/math-emu \( -name SCCS -o -name .svn \) -prune -o -follow -name \*.h ! -name modversions.h -print`
find: /usr/src/linux-2.4.30/include/asm: Too many levels of symbolic links
mkdep: HPATH not set in environment.  Don't bypass the top level Makefile.
aquarius linux-2.4.30 # 

aquarius linux-2.4.30 # ls -la /usr/src/linux-2.4.30/include/asm
lrwxrwxrwx  1 root root 8 Apr  7 14:07 /usr/src/linux-2.4.30/include/asm -> asm-i386
aquarius linux-2.4.30 # ls -la /usr/src/linux-2.4.30/include/asm-i
asm-i386/ asm-ia64/ 
aquarius linux-2.4.30 # ls -la /usr/src/linux-2.4.30/include/asm-i386/
total 692
drwxr-xr-x   2 573 573  1741 Apr  4 03:42 .
drwxr-xr-x  28 573 573   397 Apr  7 14:07 ..
-rw-r--r--   1 573 573   764 Jun 16  1995 a.out.h
-rw-rw-r--   1 573 573  4974 Apr  4 03:42 acpi.h
-rw-r--r--   1 573 573  2528 Nov 17 12:54 apic.h
-rw-r--r--   1 573 573  9610 Aug 25  2003 apicdef.h
-rw-r--r--   1 573 573  5066 Nov 22  2001 atomic.h
-rw-r--r--   1 573 573  9568 Aug  8  2004 bitops.h
-rw-r--r--   1 573 573   409 Apr 16  1997 boot.h
[cut]
There are no symlinks under /usr/src/linux-2.4.30/include/asm-i386/
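
To double-check what a symlink such as include/asm resolves to, one link at
a time, a small Python sketch like the following may help. The helper name
follow_symlink and the limit of 40 hops are assumptions for illustration; the
sketch does not reproduce what find(1) does with -follow.

```python
import os


def follow_symlink(path, limit=40):
    """Return the list of paths visited while resolving `path` link by link.

    Raises OSError if a link points back to an already visited path or if
    the chain is longer than `limit` hops (both values are illustrative).
    """
    chain = [path]
    seen = {os.path.abspath(path)}
    current = path
    while os.path.islink(current):
        target = os.readlink(current)
        # A relative target is interpreted relative to the link's directory;
        # os.path.join() keeps an absolute target unchanged.
        current = os.path.normpath(
            os.path.join(os.path.dirname(current), target))
        chain.append(current)
        key = os.path.abspath(current)
        if key in seen or len(chain) > limit:
            raise OSError("symlink loop or chain too long: "
                          + " -> ".join(chain))
        seen.add(key)
    return chain
```

For the tree shown above, follow_symlink("/usr/src/linux-2.4.30/include/asm")
would be expected to return the link itself followed by the asm-i386
directory, provided the link is as simple as the ls output suggests.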

Any clues? :( Please Cc: me in replies.
--
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs


linux-2.4.30-rc1/include/asm: Too many levels of symbolic links

2005-03-23 Thread Martin MOKREJŠ
Hi,
 I recompiled current kernel to test something, and have realized everything
went well including "make modules_install", but no System.map was generated.
So I dug around and the "make dep" did not work well:
make[2]: Leaving directory `/usr/src/linux-2.4.30-rc1/arch/i386/lib'
make[1]: Leaving directory `/usr/src/linux-2.4.30-rc1'
make update-modverfile
make[1]: Entering directory `/usr/src/linux-2.4.30-rc1'
/usr/src/linux-2.4.30-rc1/include/linux/modversions.h was updated
make[1]: Leaving directory `/usr/src/linux-2.4.30-rc1'
scripts/mkdep -- `find /usr/src/linux-2.4.30-rc1/include/asm /usr/src/linux-2.4.30-rc1/include/linux /usr/src/linux-2.4.30-rc1/include/scsi /usr/src/linux-2.4.30-rc1/include/net /usr/src/linux-2.4.30-rc1/include/math-emu \( -name SCCS -o -name .svn \) -prune -o -follow -name \*.h ! -name modversions.h -print` > .hdepend
find: /usr/src/linux-2.4.30-rc1/include/asm: Too many levels of symbolic links
scripts/mkdep -- init/*.c > .depend
aquarius linux-2.4.30-rc1 # ls -la /usr/src/linux-2.4.30-rc1/include/asm
lrwxrwxrwx  1 root root 8 Mar 23 10:21 /usr/src/linux-2.4.30-rc1/include/asm -> asm-i386
aquarius linux-2.4.30-rc1 # ls -la /usr/src/linux-2.4.30-rc1/include/asm-i386/
total 692
drwxr-xr-x   2  573  573  1741 Mar 23 10:20 .
drwxr-xr-x  28  573  573   397 Mar 23 10:21 ..
-rw-r--r--   1  573  573   764 Jun 16  1995 a.out.h
-rw-rw-r--   1 root root  4974 Mar 23 10:20 acpi.h
-rw-r--r--   1  573  573  2528 Nov 17 12:54 apic.h
-rw-r--r--   1  573  573  9610 Aug 25  2003 apicdef.h
-rw-r--r--   1  573  573  5066 Nov 22  2001 atomic.h
-rw-r--r--   1  573  573  9568 Aug  8  2004 bitops.h
-rw-r--r--   1  573  573   409 Apr 16  1997 boot.h
-rw-r--r--   1  573  573  5739 Aug  3  2002 bugs.h
-rw-r--r--   1  573  573  1479 Jun 13  2003 byteorder.h
[cut]
aquarius linux-2.4.30-rc1 # find --version
GNU find version 4.2.18
Features enabled: D_TYPE O_NOFOLLOW(enabled) 
aquarius linux-2.4.30-rc1 # uname -a
Linux aquarius 2.6.11.5 #2 SMP Wed Mar 23 09:56:22 CET 2005 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GenuineIntel GNU/Linux
aquarius linux-2.4.30-rc1 # 

Executing the same command from bash session gives:
aquarius linux-2.4.30-rc1 # find /usr/src/linux-2.4.30-rc1/include/asm /usr/src/linux-2.4.30-rc1/include/linux /usr/src/linux-2.4.30-rc1/include/scsi /usr/src/linux-2.4.30-rc1/include/net /usr/src/linux-2.4.30-rc1/include/math-emu \( -name SCCS -o -name .svn \) -prune -o -follow -name \*.h ! -name modversions.h -print 
find: /usr/src/linux-2.4.30-rc1/include/asm: Too many levels of symbolic links
/usr/src/linux-2.4.30-rc1/include/linux/usb_gadget.h
/usr/src/linux-2.4.30-rc1/include/linux/usb_ch9.h
/usr/src/linux-2.4.30-rc1/include/linux/tty_ldisc.h
/usr/src/linux-2.4.30-rc1/include/linux/tty.h
/usr/src/linux-2.4.30-rc1/include/linux/sysctl.h
/usr/src/linux-2.4.30-rc1/include/linux/swap.h
/usr/src/linux-2.4.30-rc1/include/linux/stallion.h
/usr/src/linux-2.4.30-rc1/include/linux/spinlock.h
[cut]
aquarius linux-2.4.30-rc1 # ls -la /usr/src/linux-2.4.30-rc1/include
total 88
drwxr-xr-x  28  573  573   397 Mar 23 10:21 .
drwxr-xr-x  15  573  573   337 Mar 23 10:26 ..
drwxrwxr-x   3  573  573   530 Nov 17 12:54 acpi
lrwxrwxrwx   1 root root 8 Mar 23 10:21 asm -> asm-i386
drwxr-xr-x   2  573  573  1673 Feb 18  2004 asm-alpha
drwxr-xr-x  24  573  573  1871 Nov 28  2003 asm-arm
drwxr-xr-x   2  573  573  1304 Feb 18  2004 asm-cris
drwxr-xr-x   2  573  573   112 Jun 13  2003 asm-generic
drwxr-xr-x   2  573  573  1741 Mar 23 10:20 asm-i386
drwxr-xr-x   3  573  573  1646 Aug  8  2004 asm-ia64
drwxr-xr-x   2  573  573  4096 Aug  8  2004 asm-m68k
drwxr-xr-x  20  573  573  8192 Jan 19 15:10 asm-mips
drwxr-xr-x  13  573  573  4096 Jan 19 15:10 asm-mips64
drwxr-xr-x   2  573  573  1567 Nov 28  2003 asm-parisc
drwxr-xr-x   2  573  573  4096 Mar 23 10:20 asm-ppc
drwxrwxr-x   3  573  573  1744 Aug  8  2004 asm-ppc64
drwxr-xr-x   2  573  573  1418 Nov 17 12:54 asm-s390
drwxr-xr-x   2  573  573  1382 Nov 17 12:54 asm-s390x
drwxr-xr-x   2  573  573  4096 Feb 18  2004 asm-sh
drwxrwxr-x   2  573  573  1216 Aug  8  2004 asm-sh64
drwxr-xr-x   2  573  573  4096 Mar 23 10:20 asm-sparc
drwxr-xr-x   2  573  573  4096 Mar 23 10:20 asm-sparc64
drwxrwxr-x   2  573  573  1773 Mar 23 10:20 asm-x86_64
drwxr-xr-x  14  573  573 16384 Mar 23 10:26 linux
drwxr-xr-x   2  573  573   152 Nov 29  2002 math-emu
drwxr-xr-x   5  573  573   955 Mar 23 10:20 net
drwxr-xr-x   2  573  573   211 Nov 17 12:54 pcmcia
drwxr-xr-x   2  573  573    83 Nov 17 12:54 scsi
drwxr-xr-x   2  573  573   432 Feb 18  2004 video
aquarius linux-2.4.30-rc1 # 

This is a gentoo box. I cannot comment on the find(1) binary features (see 
above).
Any clues?
--
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs

Re: Unresolved symbols in /lib/modules/2.4.28-pre2/xfree-drm/via_drv.o

2005-03-18 Thread Martin MOKREJŠ
Marcelo Tosatti wrote:
On Wed, Mar 16, 2005 at 04:21:12PM +0100, Martin MOKREJŠ wrote:
Arjan van de Ven wrote:
On Wed, 2005-03-16 at 16:03 +0100, Martin MOKREJŠ wrote:

Hi,
does anyone still use 2.4 series kernel? ;)

# make dep; make bzImage; make modules
[cut]
# make modules_install
[cut]
cd /lib/modules/2.4.30-pre3-bk2; \
mkdir -p pcmcia; \
find kernel -path '*/pcmcia/*' -name '*.o' | xargs -i -r ln -sf ../{} 
pcmcia
if [ -r System.map ]; then /sbin/depmod -ae -F System.map  
2.4.30-pre3-bk2; fi
depmod: *** Unresolved symbols in 
/lib/modules/2.4.28-pre2/xfree-drm/via_drv.o

this is not the module shipped by the kernel.org kernel...
Right. Sorry that I didn't say it more clearly, but I'm installing 
2.4.30-pre3-bk2 kernel.
cd /usr/src/linux-2.4.30-pre3-bk2
make dep
make bzImage
make modules
make modules_install

and then I hit the error about some totally unrelated kernel version in 
/lib/modules? :(

Martin,
Can you find out why is depmod trying to open /lib/modules/2.4.28-pre2/ ?
I've got no clue.
Hi all,
 unfortunately I was too fast deleting the problematic /lib/modules/2.4.28-pre2
directory. Here's the log of what I've done at about the time I hit the problem:
 427  cd /usr/src
 428  ls
 429  uname -a
 430  bzip2 -dc linux-2.4.29.tar.bz2 | tar xf -
 431  cd linux-2.4.29
 432  bzip2 -dc ~mmokrejs/tmp/patch-2.4.30-pre3.bz2 | patch -p1
 433  bzip2 -dc ~mmokrejs/tmp/patch-2.4.30-pre3-bk2.bz2 | patch -p1
 434  cp ../linux-2.4.30-pre1-bk5/.config .
 435  cd ..
 436  mv linux-2.4.29 linux-2.4.30-pre3-bk2
 437  pwd
 438  cd linux-2.4.30-pre3-bk2
 439  make oldconfig
 440  make dep; make bzIMage; make modules
 441  make modules_install
 442  make modules_install
 443  ls
 444  make bzImage
 445  make modules
 446  make modules_install
 447  make bzImage
 448  make modules
 449  make modules_install
 450  gzip .config
 451  gzip -d .config.
 452  gzip -d .config.gz 
 453  cp .config .config-ok
 454  make menuconfig
 455  make dep; make bzImage; make modules
 456  make modules_install
 457  make menuconfig
 458  make dep; make bzImage; make modules
 459  make modules_install
 460  make menuconfig
 461  make dep; make bzImage; make modules
 462  make modules_install
 463  make menuconfig
 464  make dep; make bzImage; make modules
 465  make modules_install
 466  rm -rf /lib/modules/2.4.28-pre2
 467  make modules_install

Step 466 "fixed" my problem. I have repeated all the steps as shown in
the history log, but had no luck hitting the problem again. My guess is
that the /lib/modules/2.4.28-pre2/xfree-drm/via_drv.o file came from
Gentoo's xfree-drm package, as at that time /usr/src/linux pointed to
that kernel version.
The System.map (/usr/src/linux-2.4.28-pre2/System.map) maybe wasn't
updated, as I think the modules just get compiled against the configured
kernel source tree.
In the /usr/src/linux-2.4.30-pre3-bk2 tree (after rm -rf
/lib/modules/2.4.28-pre2 had already happened) I ran:
strace make modules_install 2>&1 | grep '2.4.28'
and got nothing interesting.
Also,
find . -type f | xargs grep 2.4.28
doesn't show anything of interest.

# depmod --version
module-init-tools 3.1
--
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs


Unresolved symbols in /lib/modules/2.4.28-pre2/xfree-drm/via_drv.o

2005-03-16 Thread Martin MOKREJŠ
Hi,
 does anyone still use 2.4 series kernel? ;)
# make dep; make bzImage; make modules
[cut]
# make modules_install
[cut]
cd /lib/modules/2.4.30-pre3-bk2; \
mkdir -p pcmcia; \
find kernel -path '*/pcmcia/*' -name '*.o' | xargs -i -r ln -sf ../{} pcmcia
if [ -r System.map ]; then /sbin/depmod -ae -F System.map  2.4.30-pre3-bk2; fi
depmod: *** Unresolved symbols in /lib/modules/2.4.28-pre2/xfree-drm/via_drv.o
depmod: via_fb_init
depmod: via_mem_free
depmod: viadrv_driver_register_fns
depmod: via_agp_init
depmod: via_mem_alloc
#
.config attached
Please cc: me in replies. ;)
Martin


.config.gz
Description: Unix tar archive


Re: memory management weirdness

2005-02-22 Thread Martin MOKREJŠ
Ingo Molnar wrote:
* Andi Kleen <[EMAIL PROTECTED]> wrote:

 Although I've not re-tested this again today, it used to help a bit to specify
mem=3548M to decrease the memory used by linux (tested with the AGP card
plugged in, when the bios reported only 3556MB RAM).
 I found that removing the AGP based video card and using an old PCI based
video card results in the bios detecting 4072MB of RAM. But still, the machine was
slow. I've tried to "cat >| /proc/mtrr" to alter the memory settings, but the
result was only a partial speedup.
 I'm not sure how to convince the linux kernel to run fast again.
It's most likely a MTRR problem. Play more with them.

in particular, try to create two small tables in the same format: one
showing the e820 memory map as reported in your kernel log, and one
showing the mtrr areas. If there is any e820 area that is not write-back
cached via the mtrr mappings then that's the problem. You can also use
"mem=exactmap,..." to fix up the memory map that the BIOS provides to
Linux. Slowdowns are very often such MTRR problems. (perhaps the kernel
should report RAM areas that are not covered by MTRR write-back?)
I've just extracted the requested info from the files I've put on web.
Here it is:
e820 map / boot messages               2.4.30-pre1-bk5   2.6.11-rc4-bk7
 - 0009fc00 (usable)                          +                 +
0009fc00 - 000a (reserved)                    +                 +
000e8000 - 0010 (reserved)                    +                 +
0010 - de33 (usable)                          +                 +
de33 - de34 (ACPI data)                       +                 +
de34 - de3f (ACPI NVS)                        +                 +
de3f - de40 (reserved)                        +                 +
ffb8 - 0001 (reserved)                        +                 +
found SMP MP-table at 000ff780                +                 +
hm, page 000ff000 reserved twice.             +                 -   ???
hm, page 0010 reserved twice.                 +                 -   ???
hm, page 000f1000 reserved twice.             +                 -   ???
hm, page 000f2000 reserved twice.             +                 -   ???

/proc/mtrr                                                            2.4.30-pre1-bk5   2.6.11-rc4-bk7
reg00: base=0x (   0MB), size=2048MB: write-back, count=1                    +                 +
reg01: base=0x8000 (2048MB), size=1024MB: write-back, count=1                +                 +
reg02: base=0xc000 (3072MB), size= 256MB: write-back, count=1                +                 +
reg03: base=0xd000 (3328MB), size= 128MB: write-back, count=1                +                 +
reg04: base=0xd800 (3456MB), size=  64MB: write-back, count=1                +                 +
reg05: base=0xdc00 (3520MB), size=  32MB: write-back, count=1                +                 +
reg06: base=0xfe80 (4072MB), size=   4MB: write-combining, count=1           +                 -   !!!
reg06: base=0xf000 (3840MB), size= 128MB: write-combining, count=1           +                 +
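The comparison Ingo asks for (is every usable e820 area covered by a write-back MTRR?) can be sketched in a few lines of Python. This is only an illustration, not a diagnosis. The MTRR ranges use the MB values printed in parentheses in the table. The e820 addresses are an assumption: the archive appears to have stripped runs of zeros from the hex values, so `0010 - de33` is read here as `0x00100000 - 0xde330000`. The helper names `merge` and `uncovered` are made up for this sketch.

```python
MB = 1 << 20

# Write-back MTRRs as (base_MB, size_MB), taken from the parenthesised
# MB values and the size= fields in the table.
WRITE_BACK_MB = [(0, 2048), (2048, 1024), (3072, 256),
                 (3328, 128), (3456, 64), (3520, 32)]

# ASSUMPTION: e820 "usable" ranges with the stripped zeros restored.
# These are not verbatim from the post; adjust them to the real dmesg.
USABLE = [(0x00000000, 0x0009fc00), (0x00100000, 0xde330000)]


def merge(ranges):
    """Merge overlapping or adjacent [start, end) ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def uncovered(usable, cached):
    """Return the parts of the usable ranges not covered by cached ranges."""
    cover = merge(cached)
    gaps = []
    for start, end in usable:
        pos = start
        for c_start, c_end in cover:
            if c_end <= pos:
                continue          # cached range lies entirely below pos
            if c_start >= end:
                break             # cached range lies entirely above
            if c_start > pos:
                gaps.append((pos, c_start))
            pos = max(pos, c_end)
            if pos >= end:
                break
        if pos < end:
            gaps.append((pos, end))
    return gaps


write_back = [(base * MB, (base + size) * MB) for base, size in WRITE_BACK_MB]
for start, end in uncovered(USABLE, write_back):
    print(f"{start:#010x} - {end:#010x}: {(end - start) // 1024} KiB not write-back")
```

Under the assumed addresses the write-back registers end at 3552MB, so the sketch reports the top of the usable range (0xde000000 to 0xde330000) as uncovered. Whether that matches the real machine depends on the actual e820 values in the dmesg files linked below.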
The 4MB area should be the AGP aperture, as it was set in BIOS to 4MB only.
The files on web contain concatenated info from dmesg, iomem, interrupts, mtrr,
lspci:
http://www.natur.cuni.cz/~mmokrejs/tmp/4MB/2.4.30-pre1-bk5
http://www.natur.cuni.cz/~mmokrejs/tmp/4MB/2.6.11-rc4-bk7
So, 2.6 kernel does not see AGP aperture area. What to do next? ;)
Martin


Re: memory management weirdness

2005-02-22 Thread Martin MOKREJŠ
Parag Warudkar wrote:
Hi,
 I have received no answer to my former question
(see http://marc.theaimsgroup.com/?l=linux-kernel&m=110827143716215&w=2).
I've spent some more time on that problem and have more or less confirmed
it's because of buggy bios. However, the linux kernel doesn't handle properly
such case. I've tested 2.4.30-pre1 kernel and latest 2.6.11-rc4 kernel.
The conclusion is, that once the machine has physically installed 4x1GB
DDR400 DIMM's (bios detects only 3556 or less memory as some buffers
are allocated by the Intel 875P chipset and AGP card), the linux 2.6.11*
runs up-to 18x slower than when only 2x1GB + 2x 512MB DDR memory is installed.
Can you enable profiling and then post the profile info for various cases
- slow and fast? Check out Documentation/basic_profiling.txt in the kernel
source for understanding how to do this. This might help narrow down the issue.
http://www.natur.cuni.cz/~mmokrejs/tmp/profile-2.6.11-rc4-bk7-(3|4)GB.txt
The 3GB labeled file corresponds to fast case, 4GB is ugly slow.
What can you gather from those files? I've used readprofile but also oprofile
was enabled in kernel. I've left on the web also /proc/profile snapshots along
with System.map file. Maybe oprofile can also be used later to extract info
from them.
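One way to read two such profiles side by side is to compare each function's share of the total ticks rather than the raw counts, since the slow run takes much longer overall. The sketch below assumes readprofile's usual three-column output (ticks, function name, normalized load) with a trailing `total` line. The sample numbers are made up for illustration and are not taken from the posted files; `parse_profile`, `shares` and `biggest_shifts` are hypothetical helper names.

```python
def parse_profile(text):
    """Map function name -> tick count from readprofile-style lines."""
    ticks = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip blank or malformed lines
        ticks[fields[1]] = int(fields[0])
    return ticks


def shares(ticks):
    """Convert tick counts to fractions of the run, ignoring the 'total' row."""
    ticks = {name: n for name, n in ticks.items() if name != "total"}
    whole = sum(ticks.values()) or 1
    return {name: n / whole for name, n in ticks.items()}


def biggest_shifts(fast_text, slow_text, top=5):
    """Functions whose share of ticks changed most between the two runs."""
    fast = shares(parse_profile(fast_text))
    slow = shares(parse_profile(slow_text))
    delta = {name: slow.get(name, 0.0) - fast.get(name, 0.0)
             for name in set(fast) | set(slow)}
    return sorted(delta.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]


# Made-up sample data, only to show the expected input shape.
FAST_SAMPLE = """
   900 default_idle 14.0625
    60 do_page_fault 0.0500
    40 copy_page 0.3000
  1000 total 0.0004
"""
SLOW_SAMPLE = """
   100 default_idle 1.5625
   300 do_page_fault 0.2500
   600 copy_page 4.5000
  1000 total 0.0004
"""

for name, change in biggest_shifts(FAST_SAMPLE, SLOW_SAMPLE):
    print(f"{name:20s} {change:+.2%}")
```

Feeding it the real 3GB and 4GB files instead of the samples would show which kernel functions gained the most share in the slow case.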
Many thanks for help!
Martin


memory management weirdness

2005-02-21 Thread Martin MOKREJŠ
Hi,
 I have received no answer to my former question
(see http://marc.theaimsgroup.com/?l=linux-kernel&m=110827143716215&w=2).
I've spent some more time on that problem and have more or less confirmed
it's because of buggy bios. However, the linux kernel doesn't handle properly
such case. I've tested 2.4.30-pre1 kernel and latest 2.6.11-rc4 kernel.
The conclusion is, that once the machine has physically installed 4x1GB
DDR400 DIMM's (bios detects only 3556 or less memory as some buffers
are allocated by the Intel 875P chipset and AGP card), the linux 2.6.11*
runs up-to 18x slower than when only 2x1GB + 2x 512MB DDR memory is installed.
 Although I've not re-tested this today again, it used to help a bit to specify
mem=3548M to decrease memory used by linux (tested with AGP card plugged in,
when bios reported 3556MB RAM only).
 I found that removing the AGP based video card and using an old PCI based
video card results in bios detecting 4072MB of RAM. But still, the machine was
slow. I've tried to "cat >| /proc/mtrr" to alter the memory settings, but the
result was only a partial speedup.
 I'm not sure how to convince linux kernel to run fast again.
I suspect either the memory mapping or the interrupts are the cause. Disabling
acpi did not help me initially, so I've conducted most of my tests with
acpi enabled.
 I've put dmesg, iomem, interrupt, lspci and time(1) requirements of my test
on web: http://www.natur.cuni.cz/~mmokrejs/tmp/. The differences can be seen easily
by diffing the files. All tests in http://www.natur.cuni.cz/~mmokrejs/tmp/4MB/
were carried out with AGP aperture size set to 4MB, although the video card has
128MB RAM.
 
 Later I've reverted to AGP aperture set to 128MB back and tested again:
http://www.natur.cuni.cz/~mmokrejs/tmp/128MB/.

 Finally, I put back two 512MB memory modules to have only 3GB RAM physically,
and the result is at
http://www.natur.cuni.cz/~mmokrejs/tmp/128MB/only_phys_3GB/.
 About a week ago I tried to contact ASUS, but no answer so far from their
technical support through some web robot.
http://vip.asus.com/eservice/techmailstatus.aspx?ID=WTM200502111723398547
I do not recommend their "greatest" and real "flag-ship" P4C800-E-Deluxe
motherboard for use with memory sizes above 3GB (although they claim 4GB
is possible). BIOS is the latest release 1.19, although 1.20.001 was tested
as well.
 
My questions to LKML people are:

1)  Could someone tell me what the differences are between the
2.4.30-pre1 kernel's dmesg and 2.6.11-rc4* dmesg outputs? For example, memory
areas "reserved twice" reported by 2.4.30-pre1. Also, the /proc/mtrr contents
differ between the two kernels.
2)  How about the /proc/interrupts outputs? Aren't they too high? How about the
level/edge interrupt mappings? Would they help?
Please Cc: me in replies. Many thanks for any response, I have wasted seemingly
a lot of money on 2GB RAM. :(
martin
P.S.:
1GB DDR400 modules are Micron CL2.5 2bank 512M chip modules (64Mx8),
non-ecc, unbuffered
512MB DDR 500 modules (yes, PC4000, not PC3200 as is the max supported
by the motherboard) are Kingston HyperX modules, non-ecc, unbuffered.


HIGHMEM slows down 2.6.11-rc3-bk7 machine

2005-02-12 Thread Martin MOKREJŠ
Hi Marcello and other gurus!
 I have just bought 4GB of RAM for my machine. Immediately, I have noticed
the machine is terribly slow on bootup. After inspecting all BIOS related
possibilities I found that the problem goes away with highmem=off. I should
note I don't see this slowdown when I have only 2 or 3GB RAM while using the
same kernel.
 The MB is ASUS P4C800-E-Deluxe with i875P chipset and ICH5R. It should
support fully 4GB, but docs say ICH5R controller might allocate something
for itself, so one should expect "less". :(
Hmm, at the best I get 3557MB of RAM when almost everything is
disabled in BIOS, mainly USB/NET/FIREWIRE/SATA stuff. Grrr.
I tested latest beta and release of bios too, but no luck.
root=/dev/sdb2 ide=reverse agp=try_unsupported console=ttyS0,57600n8 console=tty0 vga=792 idebus=66 highmem=off
$ cat /proc/mtrr
reg00: base=0x (   0MB), size=2048MB: write-back, count=1
reg01: base=0x8000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0xc000 (3072MB), size= 256MB: write-back, count=1
reg03: base=0xd000 (3328MB), size= 128MB: write-back, count=1
reg04: base=0xd800 (3456MB), size=  64MB: write-back, count=1
reg05: base=0xdc00 (3520MB), size=  32MB: write-back, count=1
reg06: base=0xf000 (3840MB), size= 128MB: write-combining, count=2
reg07: base=0xfe80 (4072MB), size=   4MB: write-combining, count=1
$
$ cat /mtrr-with-highmem 
reg00: base=0x (   0MB), size=2048MB: write-back, count=1
reg01: base=0x8000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0xc000 (3072MB), size= 256MB: write-back, count=1
reg03: base=0xd000 (3328MB), size= 128MB: write-back, count=1
reg04: base=0xd800 (3456MB), size=  64MB: write-back, count=1
reg05: base=0xdc00 (3520MB), size=  32MB: write-back, count=1
reg06: base=0xf000 (3840MB), size= 128MB: write-combining, count=2
reg07: base=0xfe80 (4072MB), size=   4MB: write-combining, count=1
$
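The two listings above are identical, so highmem=off does not change the MTRR setup. A small Python sketch can total the write-back coverage from such a listing. It relies only on the MB figures, because the hex bases appear truncated in this archive. The regular expression and the helper names `parse_mtrr` and `total_mb` are assumptions made for this sketch, not an existing tool.

```python
import re

# Matches lines such as:
#   reg04: base=0xd800 (3456MB), size=  64MB: write-back, count=1
# The hex base may be empty or truncated, so only the MB values are used.
MTRR_LINE = re.compile(
    r"reg(?P<reg>\d+): base=0x(?P<base>[0-9a-fA-F]*) \(\s*(?P<base_mb>\d+)MB\), "
    r"size=\s*(?P<size_mb>\d+)MB: (?P<type>[\w-]+), count=(?P<count>\d+)"
)

LISTING = """\
reg00: base=0x (   0MB), size=2048MB: write-back, count=1
reg01: base=0x8000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0xc000 (3072MB), size= 256MB: write-back, count=1
reg03: base=0xd000 (3328MB), size= 128MB: write-back, count=1
reg04: base=0xd800 (3456MB), size=  64MB: write-back, count=1
reg05: base=0xdc00 (3520MB), size=  32MB: write-back, count=1
reg06: base=0xf000 (3840MB), size= 128MB: write-combining, count=2
reg07: base=0xfe80 (4072MB), size=   4MB: write-combining, count=1
"""


def parse_mtrr(text):
    """Return (base_MB, size_MB, type) for every register line found."""
    return [(int(m["base_mb"]), int(m["size_mb"]), m["type"])
            for m in MTRR_LINE.finditer(text)]


def total_mb(entries, mtrr_type):
    """Sum the sizes of all registers of the given type."""
    return sum(size for _, size, kind in entries if kind == mtrr_type)


entries = parse_mtrr(LISTING)
top_of_write_back = max(base + size for base, size, kind in entries
                        if kind == "write-back")
print("write-back total :", total_mb(entries, "write-back"), "MB")
print("write-back ends  :", top_of_write_back, "MB")
```

The write-back registers add up to 3552MB and end at 3552MB, which is slightly below the 3555MB the BIOS reports. Whether that small difference matters is not established by this listing alone.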

Please note that at the moment, BIOS says only 3555MB are available, so I am
a bit surprised linux sees anything above 3456MB (I expect that to be somehow
used by the ICH5R beast).
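The page counts in the dmesg diff that follows can be cross-checked against the MB and kB figures in the same diff. The short Python sketch below does that arithmetic, assuming 4 KiB pages (the usual i386 page size); the variable names are chosen for this example only.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages


def pages_to_mib(pages):
    """Convert a page count to MiB."""
    return pages * PAGE_KIB / 1024


# Zone sizes from the "with highmem" side of the dmesg diff.
dma, normal, highmem = 4096, 225280, 680496

lowmem = dma + normal          # totalpages reported with highmem=off
total = lowmem + highmem       # totalpages reported with highmem enabled

print("lowmem pages :", lowmem, "=", pages_to_mib(lowmem), "MiB")
print("highmem pages:", highmem, "=", pages_to_mib(highmem), "MiB")
print("total pages  :", total, "=", total * PAGE_KIB, "KiB")
```

The numbers line up with the log: 229376 low pages are the 896MB LOWMEM, 680496 high pages are about 2658MB HIGHMEM, and 909872 pages are the 3639488k total in the "Memory:" line, roughly 3554MB.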
See the attached 2.6.11-rc3-bk7 dmesg for the cases with highmem disabled and
enabled.
Here is the diff of both:
--- /dm 2005-02-13 05:27:12.328500335 +0100
+++ /dm-with-highmem 2005-02-13 05:44:13.106978275 +0100
@@ -8,13 +8,13 @@
 BIOS-e820: de24 - de2f (ACPI NVS)
 BIOS-e820: de2f - de30 (reserved)
 BIOS-e820: ffb8 - 0001 (reserved)
-0MB HIGHMEM available.
+2658MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000ff780
-On node 0 totalpages: 229376
+On node 0 totalpages: 909872
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 225280 pages, LIFO batch:16
-  HighMem zone: 0 pages, LIFO batch:1
+  HighMem zone: 680496 pages, LIFO batch:16
DMI 2.3 present.
ACPI: RSDP (v002 ACPIAM) @ 0x000f9e30
ACPI: XSDT (v001 A M I  OEMXSDT  0x1426 MSFT 0x0097) @ 0xde230100
@@ -37,19 +37,19 @@
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Built 1 zonelists
-Kernel command line: root=/dev/sda2 ide=reverse agp=try_unsupported console=ttyS0,57600n8 console=tty0 vga=792 idebus=66 highmem=off
+Kernel command line: root=/dev/sda2 ide=reverse agp=try_unsupported console=ttyS0,57600n8 console=tty0 vga=792 idebus=66
ide_setup: ide=reverse : Enabled support for IDE inverse scan order.
ide_setup: idebus=66
mapped APIC to d000 (fee0)
mapped IOAPIC to c000 (fec0)
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 65536 bytes)
-Detected 3075.740 MHz processor.
+Detected 3075.443 MHz processor.
Using pmtmr for high-res timesource
Console: colour dummy device 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
-Memory: 902348k/917504k available (3380k kernel code, 14644k reserved, 1649k data, 224k init, 0k highmem)
+Memory: 3602700k/3639488k available (3380k kernel code, 35972k reserved, 1649k data, 224k init, 2721984k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 6094.84 BogoMIPS (lpj=3047424)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
@@ -130,6 +130,7 @@
pnp: 00:07: ioport range 0x290-0x297 has been reserved
Machine check exception polling timer started.
IA-32 Microcode Update Driver: v1.14 <[EMAIL PROTECTED]>
+highmem bounce pool size: 64 pages
devfs: 2004-01-31 Richard Gooch ([EMAIL PROTECTED])
devfs: boot_options: 0x1
SGI XFS with no debug enabled
@@ -257,15 +258,15 @@
ACPI: PCI interrupt :00:1f.5[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device :00:1f.5 to 64
AC'97 0 analog subsections not ready
-intel8x0_measure_ac97_clock: measured 49507 usecs
+intel8x0_measure_ac97_clock: measured 50508 usecs
intel8x0: clocking to 48000
ALSA device list:
  #