Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-09 Thread Dieter Nützel

On Saturday, 5 May 2001 09:13 you wrote:
> > My (very) old Athlon 550 (model 1, stepping 2) shows it on my MSI MS-6167
> > (AMD Irongate C4) with your 2.4.4-ac5, now :-(
>
> Manfred has a good explanation for that. I'm hoping it also explains the
> VIA problem too
>
> > I am open for any test fixes...
>
> Watch this space -> <- ;)
>
> Alan

Sorry for my noise!
My problem was NOT fast_page_copy related.
It was Justin's aic7xxx 6.1.12 release.
His latest 6.1.13 (2.4.4-ac6) fixed it for me.

My MSI MS-6167 (AMD Irongate C4) is running very well with APIC (it doesn't
really have one) and ACPI (latest) enabled.

Below are some MMX copy results.

Thanks anyway.
Dieter

BTW, where can I grab the bench with MB/sec output?

SunWave1>./athlon
Athlon test program $Id: fast.c,v 1.6 2000/09/23 09:05:45 arjan Exp $
clear_page() tests
clear_page function 'warm up run'        took 17396 cycles per page
clear_page function '2.4 non MMX'        took 9582 cycles per page
clear_page function '2.4 MMX fallback'   took 9031 cycles per page
clear_page function '2.4 MMX version'    took 7905 cycles per page
clear_page function 'faster_clear_page'  took 8237 cycles per page
clear_page function 'even_faster_clear'  took 8151 cycles per page

copy_page() tests
copy_page function 'warm up run'       took 12565 cycles per page
copy_page function '2.4 non MMX'       took 17273 cycles per page
copy_page function '2.4 MMX fallback'  took 17481 cycles per page
copy_page function '2.4 MMX version'   took 12507 cycles per page
copy_page function 'faster_copy'       took 13641 cycles per page
copy_page function 'even_faster'       took 12707 cycles per page
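
The program reports cycles per page, not MB/sec, but the conversion is simple
arithmetic. A rough sketch follows, assuming a 4096-byte page (the i386 page
size) and a known core clock; neither value is printed by the program, and the
helper names are made up for illustration and are not part of fast.c.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: i386 page size; the benchmark itself does not print it. */
#define PAGE_BYTES 4096.0

/*
 * Convert a "cycles per page" figure into MB/sec (1 MB = 10^6 bytes).
 * cpu_hz is the core clock in Hz, e.g. 550e6 for a nominal Athlon 550.
 * cycles_to_mb_per_sec() is a hypothetical helper, not part of fast.c.
 */
double cycles_to_mb_per_sec(double cycles_per_page, double cpu_hz)
{
    assert(cycles_per_page > 0.0 && cpu_hz > 0.0);
    double pages_per_sec = cpu_hz / cycles_per_page;
    return pages_per_sec * PAGE_BYTES / 1e6;
}

/* Print one result line with both the raw figure and the derived rate. */
void print_rate(const char *name, double cycles_per_page, double cpu_hz)
{
    printf("%-20s %6.0f cycles/page -> %7.1f MB/sec\n",
           name, cycles_per_page,
           cycles_to_mb_per_sec(cycles_per_page, cpu_hz));
}
```

Under those assumptions, the 7905 cycles per page of the '2.4 MMX version'
clear_page run at 550 MHz works out to roughly 285 MB/sec.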

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-09 Thread Alan Cox

> > What still stands out is that exactly _zero_ people have reported the same
> > problem with non VIA chipset Athlons.
> 
> Not any more :-(

Still the same

> IIRC this thread is about boot going catatonic right after unloading
> __initmem.

Nope. It's about memory corruption. Your bug sounds very different.

> Earlier, it looks like handle_mm_fault is being triggered from
> fast_clear_page.

That would be messy. The other way around is sane, but not that way.



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-09 Thread Tom Leete

Alan Cox wrote:
>
> > IIRC this thread is about boot going catatonic right after unloading
> > __initmem.
>
> Nope. It's about memory corruption. Your bug sounds very different.
>
> > Earlier, it looks like handle_mm_fault is being triggered from
> > fast_clear_page.
>
> That would be messy. The other way around is sane, but not that way.

Indeed, I was confused. Looks like ide-dma is getting goofy somehow.

Here is a decoded trace. Typos are likely. If the problem is not obvious to
anyone, I'll switch around my serial console setup to get some better info.

Warning (Oops_read): Code line not seen, dumping what data is available

Trace; 037f END_OF_CODE+3fcfb2ab/
Trace;  END_OF_CODE+3fcfaf2c/
Trace;  END_OF_CODE+3fcfaf2c/
Trace; 0720 END_OF_CODE+3fcfb64c/
Trace; c01b956a <ide_build_dmatable+2a/120>
Trace; c01b3fb5 <ide_set_handler+55/60>
Trace; c01b9aca <ide_dmaproc+11a/210>
Trace; c01b9380 <ide_dma_intr+0/b0>
Trace; c01b9940 <dma_timer_expiry+0/70>
Trace; c01bd457 <do_rw_disk+257/300>
Trace; c01b4d2a <ide_wait_stat+7a/e0>
Trace; c01b5010 <start_request+160/210>
Trace; c01b51ff <ide_do_request+10f/340>
Trace; c01b3430 <ali_cleanup+10/70>
Trace; c0132e45 <__wait_on_buffer+75/90>
Trace; c0134026 <bread+16/70>
Trace; c018665c <__make_request+43c/6f0>
Trace; c01866ce <__make_request+4ae/6f0>
Trace; c01866e6 <__make_request+4c6/6f0>
Trace; c018665c <__make_request+43c/6f0>
Trace; c01866ce <__make_request+4ae/6f0>
Trace; c01866e6 <__make_request+4c6/6f0>
Trace; c01b956a <ide_build_dmatable+2a/120>
Trace; c01b3fb5 <ide_set_handler+55/60>
Trace; c01b9aca <ide_dmaproc+11a/210>
Trace; c01b9380 <ide_dma_intr+0/b0>
Trace; c01b9940 <dma_timer_expiry+0/70>
Trace; c01bd457 <do_rw_disk+257/300>
Trace; c01b4d2a <ide_wait_stat+7a/e0>
Trace; c01b5010 <start_request+160/210>
Trace; c01b51ff <ide_do_request+10f/340>
Trace; c01b546e <do_ide_request+e/20>
Trace; c01134d0 <schedule+200/3e0>
Trace; c012bd2c <free_shortage+1c/c0>
Trace; c012cbd7 <__alloc_pages+87/300>
Trace; c022c12f <fast_copy_page+f/90>
Trace; c0125f4d <filemap_nopage+2bd/420>
Trace; c0125f58 <filemap_nopage+2c8/420>
Trace; c022c0ca <fast_clear_page+a/60>
Trace; c0122b7d <handle_mm_fault+cd/e0>
Trace; c012bd2c <free_shortage+1c/c0>
Trace; c0122a15 <do_no_page+45/e0>
Trace; c022c0ca <fast_clear_page+a/60>
Trace; c0222b7d <packet_ioctl+17d/350>
Trace; c0225476 <do_xprt_transmit+46/3d0>
Trace; c02254c3 <do_xprt_transmit+93/3d0>
Trace; c0112ba9 <do_page_fault+2a9/450>
Trace; c01239cf <do_munmap+5f/280>
Trace; c0112900 <do_page_fault+0/450>
Trace; c0106ddc <error_code+34/3c>
Trace; c022be20 <clear_user+30/40>
Trace; c0112900 <do_page_fault+0/450>
Trace; c0134026 <bread+16/70>
Trace; c018665c <__make_request+43c/6f0>
Trace; c01866ce <__make_request+4ae/6f0>
Trace; c01866e6 <__make_request+4c6/6f0>
Trace; c01b956a <ide_build_dmatable+2a/120>
Trace; c01b3fb5 <ide_set_handler+55/60>
Trace; c01b9aca <ide_dmaproc+11a/210>
Trace; c01b9380 <ide_dma_intr+0/b0>
Trace; c01b9940 <dma_timer_expiry+0/70>
Trace; c01bd457 <do_rw_disk+257/300>
Trace; c01b4d2a <ide_wait_stat+7a/e0>
Trace; c01b5010 <start_request+160/210>
Trace; c01b51ff <ide_do_request+10f/340>
Trace; c01b546e <do_ide_request+e/20>
Trace; c01134d0 <schedule+200/3e0>
Trace; c012be03 <inactive_shortage+33/90>
Trace; c0124d05 <__find_get_page+35/80>
Trace; c013ae1b <search_binary_handler+17b/190>
Trace; c0125d44 <filemap_nopage+b4/420>
Trace; c013af79 <do_execve+149/200>
Trace; c0122a15 <do_no_page+45/e0>
Trace; c012267d <vmtruncate+12d/160>
Trace; c0112ba9 <do_page_fault+2a9/450>
Trace; c022d075 <rwsem_down_write_failed+65/140>
Trace; c022f396 <stext_lock+37a/16bc>
Trace; c0106cbf <system_call+33/38>


1 warning issued.  Results may not be reliable.

Cheers,
Tom

-- 
The Daemons lurk and are dumb. -- Emerson



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-09 Thread Alan Cox

> Trace; 037f END_OF_CODE+3fcfb2ab/
> Trace;  END_OF_CODE+3fcfaf2c/
> Trace;  END_OF_CODE+3fcfaf2c/
> Trace; 0720 END_OF_CODE+3fcfb64c/

Let's ignore the crap above..

> Trace; c01b956a <ide_build_dmatable+2a/120>
> Trace; c01b3fb5 <ide_set_handler+55/60>
> Trace; c01b9aca <ide_dmaproc+11a/210>
> Trace; c01b9380 <ide_dma_intr+0/b0>
> Trace; c01b9940 <dma_timer_expiry+0/70>
> Trace; c01bd457 <do_rw_disk+257/300>
> Trace; c01b4d2a <ide_wait_stat+7a/e0>
> Trace; c01b5010 <start_request+160/210>
> Trace; c01b51ff <ide_do_request+10f/340>

We seem to be several layers into recursive use of the ide driver - which
shouldn't happen. In fact, if these are the same interface, the second dmatable
build would leave HWIF(drive)->sg_table wrong.
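
One quick way to check a decoded trace for this kind of repetition is to count
how often a given symbol shows up. The sketch below assumes ksymoops-style
lines of the form "Trace; address <symbol+offset/size>", with the angle
brackets and any leading quote markers treated as optional. count_symbol is a
made-up helper, and a repeated entry is only a hint, since an oops trace can
also list stale addresses left on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the lines in `trace` whose symbol is exactly
 * `symbol`. Each line is expected to look like
 *     Trace; c01b956a <ide_build_dmatable+2a/120>
 * The angle brackets are optional, and lines that do not match are skipped.
 */
size_t count_symbol(const char *trace, const char *symbol)
{
    size_t count = 0;
    size_t symlen = strlen(symbol);
    const char *line = trace;

    assert(symlen > 0);

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        const char *p = line;
        const char *stop = line + len;

        /* Skip leading quote markers and spaces ("> > Trace; ..."). */
        while (p < stop && (*p == '>' || *p == ' '))
            p++;

        if ((size_t)(stop - p) > 7 && strncmp(p, "Trace; ", 7) == 0) {
            p += 7;
            /* Skip the hexadecimal address field. */
            while (p < stop && *p != ' ')
                p++;
            while (p < stop && (*p == ' ' || *p == '<'))
                p++;
            /* Match the symbol name, which must be followed by '+'. */
            if ((size_t)(stop - p) > symlen &&
                strncmp(p, symbol, symlen) == 0 && p[symlen] == '+')
                count++;
        }

        line = end ? end + 1 : stop;
    }
    return count;
}
```

Requiring the '+' right after the name keeps ide_do_request from also matching
do_ide_request or a longer symbol that merely starts with the same prefix.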

> Trace; c01866ce <__make_request+4ae/6f0>
> Trace; c01866e6 <__make_request+4c6/6f0>
> Trace; c01b956a <ide_build_dmatable+2a/120>
> Trace; c01b3fb5 <ide_set_handler+55/60>



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-09 Thread Tom Leete

Alan Cox wrote:
>
> > Trace; c01b956a <ide_build_dmatable+2a/120>
> > Trace; c01b3fb5 <ide_set_handler+55/60>
> > Trace; c01b9aca <ide_dmaproc+11a/210>
> > Trace; c01b9380 <ide_dma_intr+0/b0>
> > Trace; c01b9940 <dma_timer_expiry+0/70>
> > Trace; c01bd457 <do_rw_disk+257/300>
> > Trace; c01b4d2a <ide_wait_stat+7a/e0>
> > Trace; c01b5010 <start_request+160/210>
> > Trace; c01b51ff <ide_do_request+10f/340>
>
> We seem to be several layers into recursive use of the ide driver - which
> shouldn't happen. In fact, if these are the same interface, the second dmatable
> build would leave HWIF(drive)->sg_table wrong.
>
> > Trace; c01866ce <__make_request+4ae/6f0>
> > Trace; c01866e6 <__make_request+4c6/6f0>
> > Trace; c01b956a <ide_build_dmatable+2a/120>
> > Trace; c01b3fb5 <ide_set_handler+55/60>

I think maybe it smells like a configuration problem: I have a pair of ATAPI
drives on the second IDE interface, which I run with SCSI emulation. I'll see
if I can get a better look, with arguments to the calls.

Thanks,
Tom

-- 
The Daemons lurk and are dumb. -- Emerson



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-08 Thread Tom Leete

Alan Cox wrote:
> 
> > the memory copy in the fast_page_copy routine.  The machine then proceeded
> > not to stop at my panic, but I got my "normal" oopses.  I then had an
> 
> Ok
> 
> > idea and removed all the prefetch instructions from the beginning of the
> > routine and tried the resulting kernel.  I now have no crashes.
> > What could this mean?
> > What could this mean?
> 
> I think it has to mean a hardware problem.

I don't think so, reasons below
 
> What still stands out is that exactly _zero_ people have reported the same
> problem with non VIA chipset Athlons.

Not any more :-(

Hi Alan,

IIRC this thread is about boot going catatonic right after unloading
__initmem. I'm seeing that in 2.4.5-pre1 with Athlon stepping 2, AMD 751,
MS-6195 mobo, 128M. The machine is fine with kernels up through 2.4.4-pre3,
and still works with them.

On that gear, there is no crash. The keyboard and display are alive and
SysRq works. I have copied the stack trace for pid=1 and the processor dump.
I'm short of time but I have a kind typist electrifying the trace, and I'll
try to generate something ksymoops can digest.

Here is what a quick eyeballing of System.map shows.

The code is at the end of init/main.c:init(). The processor dump shows
init() halted in default_idle() from the sequence L6 -> init -> cpu_idle.

Trace of pid 1 shows it stuck in D state. The last addresses listed are from
filemap_nopage -> do_execve -> do_no_page -> handle_mm_fault -> __pmd_alloc
-> rwsem_down_write_failed -> stext_lock -> system_call. That looks fishy.

Earlier, it looks like handle_mm_fault is being triggered from
fast_clear_page.

I'll post the full dump as soon as I have it.

Btw, above happens with both gcc-2.95.3 and gcc-3.0-[20010423] compiled
kernels.

Cheers,
Tom

-- 
The Daemons lurk and are dumb. -- Emerson



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-08 Thread Arjan van de Ven

In article <[EMAIL PROTECTED]> you wrote:
> Arjan - care to unroll the tail 320 bytes of copying from the main loop ?

I'll see what I can do to make us not lose too much speed.
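
One reading of the request: if the main loop prefetches 320 bytes ahead, its
last iterations prefetch past the end of the source page, so the final 320
bytes get a loop of their own with no prefetch. A minimal sketch of that split
in portable C follows. It assumes a 4096-byte page, 64-byte chunks, and the
GCC/Clang builtin __builtin_prefetch standing in for the Athlon prefetch
instruction. The name copy_page_split is made up, and this is an illustration
of the idea, not the kernel's MMX routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES     4096  /* assumption: i386 page size */
#define CHUNK_BYTES    64    /* assumption: bytes copied per loop iteration */
#define PREFETCH_AHEAD 320   /* tail size named in the request above */

/*
 * Hypothetical sketch: copy one page in 64-byte chunks. The main loop
 * prefetches PREFETCH_AHEAD bytes ahead of the read position; the tail
 * loop copies the last PREFETCH_AHEAD bytes with no prefetch, so no
 * prefetch ever targets an address at or past src + PAGE_BYTES.
 */
void copy_page_split(void *dst, const void *src)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    size_t off;

    assert(dst != NULL && src != NULL);

    /* Main loop: offsets 0 .. PAGE_BYTES - PREFETCH_AHEAD - 1. */
    for (off = 0; off < PAGE_BYTES - PREFETCH_AHEAD; off += CHUNK_BYTES) {
        /* GCC/Clang builtin; stands in for the Athlon prefetch opcode. */
        __builtin_prefetch(s + off + PREFETCH_AHEAD, 0, 0);
        memcpy(d + off, s + off, CHUNK_BYTES);
    }

    /* Tail loop: the final PREFETCH_AHEAD bytes, no prefetch. */
    for (; off < PAGE_BYTES; off += CHUNK_BYTES)
        memcpy(d + off, s + off, CHUNK_BYTES);
}
```

With these numbers the main loop runs 59 times and the tail loop 5 times, and
the furthest prefetch lands at offset 4032, still inside the page. The cost is
that the tail is copied without the benefit of prefetching, which is the speed
loss mentioned above.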



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-06 Thread Jeremy

Have non-production VIA KT133A, will test :) (Tyan mobo, 1.33 GHz, tulip eth, an
IDE drive, nothing really exciting, just a fast Athlon)

-j

John R Lenton enlightened recipients with the following on 06May2001:
> On Sat, May 05, 2001 at 08:20:56AM +0100, Alan Cox wrote:
> > Don't panic just yet. Manfred's observation could mean we hit chipset-specific
> > behaviour on prefetches.
> 
> OK - Please let me know when to start.

-- 
---
  heffner at darkness.net
   Darkness Network Engineering
   PGP public key available on request
My thoughts and opinions represent no one but myself
---



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-06 Thread John R Lenton

On Sat, May 05, 2001 at 08:20:56AM +0100, Alan Cox wrote:
> Don't panic just yet. Manfred's observation could mean we hit chipset-specific
> behaviour on prefetches.

OK - Please let me know when to start.

-- 
John Lenton ([EMAIL PROTECTED]) -- Random fortune:
BOFH excuse #349:

Stray Alpha Particles from memory packaging caused Hard Memory Error on Server.



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-05 Thread Jeremy

Quick note.  I *AM* seeing this problem on a Tyan S2390B which has the
VIA KT133A chipset on it.

AMD Athlon 1.33 GHz
2x256 MB DIMMs
Linux 2.4.4-ac5

I haven't done the ksymoops conversions yet, but please let me know if you'd
like anything else.  But basically, it looks exactly like what all the IWILL
owners are seeing.

Any other Tyan S2390B users?

thx, -j


-- 
---
  heffner at darkness.net
   Darkness Network Engineering
   PGP public key available on request
My thoughts and opinions represent no one but myself
---



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-05 Thread Alan Cox

> > one of them to a new mb with a non-VIA chipset (Asus A7A266), and it booted the
> > first Athlon kernel I tried (2.4.4).  No other changes to .config, same
> > processor as before, same memory, same disks, same video, same case, same power
> > cord, you name it.
> 
> damn. I guess the saving of 200$ on the MSI has probably been
> 300$ down the drain :(

Don't panic just yet. Manfred's observation could mean we hit chipset-specific
behaviour on prefetches.



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-05 Thread Alan Cox

> My (very) old Athlon 550 (model 1, stepping 2) show it on my MSI MS-6167 (AMD 
> Irongate C4) with your 2.4.4-ac5, now :-(

Manfred has a good explanation for that. I'm hoping it also explains the
VIA problem too

> I am open for any test fixes...

Watch this space -> <- ;)

Alan




Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-05 Thread John R Lenton

On Sat, May 05, 2001 at 12:10:06AM +0600, Bobby D. Bryant wrote:
> They do boot PIII kernels reliably for all those variants, though they still
> suffer occasional oopses, hangs, or crashes (as discussed in other threads).

and as happens with my SMP pIII VIA-based box (and I've finally
fixed the memory, so I no longer get the oopses, just solid
hardware hangs).

> However (and here's the part I haven't mentioned before), yesterday I switched
> one of them to a new mb with a non-VIA chipset (Asus A7A266), and it booted the
> first Athlon kernel I tried (2.4.4).  No other changes to .config, same
> processor as before, same memory, same disks, same video, same case, same power
> cord, you name it.

damn. I guess the saving of 200$ on the MSI has probably been
300$ down the drain :(

-- 
John Lenton ([EMAIL PROTECTED]) -- Random fortune:
If you treat people right they will treat you right -- 90% of the time.
-- Franklin Delano Roosevelt




Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Bobby D. Bryant

Aaron Tiensivu wrote:

> > What still stands out is that exactly _zero_ people have reported the same
> > problem with non VIA chipset Athlons.
>
> This might be grasping at straws [...] This could be (total conjecture)
> related somehow to the corruption bugs they are admitting to in
> the 686B although they are blaming the SB Live now.

Just another data point (the news is in the final paragraph):

I recently built two near-twin systems using Athlon 1.2's and VIA chipsets
(EPoX 8KTA3), and have *never* been able to get either to boot an
Athlon-optimized kernel, having tried 2.4.0, 2.4.2, 2.4.4, and about 5
different -ac* variants of 2.4.3.

They do boot PIII kernels reliably for all those variants, though they still
suffer occasional oopses, hangs, or crashes (as discussed in other threads).

However (and here's the part I haven't mentioned before), yesterday I switched
one of them to a new mb with a non-VIA chipset (Asus A7A266), and it booted the
first Athlon kernel I tried (2.4.4).  No other changes to .config, same
processor as before, same memory, same disks, same video, same case, same power
cord, you name it.

Bobby Bryant
Austin, Texas





Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Joseph Carter

On Sat, May 05, 2001 at 03:51:13PM +1200, Chris Wedgwood wrote:
> I don't see how they figure, but in case there was any doubt I
> have a VIA KT133A/686B board (Abit KT7A) and don't experience
> anything resembling disk corruption unless the box crashes for
> some other reason.  I do seem to be experiencing AGP problems in
> spades, but my disks at least are fine.
> 
> I too seem no disk problems whatsoever (nothing really interesting
> there, many people do not) but am also seeing AGP problems.
> 
> In fact, I had to disable AGP to stop X locking the box hard... yet
> agpgart and the video driver (NVidia[1]) both claim to support the
> chipset -- does anyone actually have this working?)

Not an option with the Radeon unfortunately.  At least, not yet.  Whenever
I find the solution (recently a bunch of people have suggested a bunch of
things to try on dri-devel - thanks guys!) I'll post to that list what
fixed it since I know I am not the only person seeing this kind of
problem.  I think some of the guys are looking into improving the docs a
bit, so maybe if I find it soon the problem and workaround will get
documented.  =)

-- 
Joseph Carter <[EMAIL PROTECTED]>                 Free software developer

<hop> kb: I demand integrity and honesty in those who i do business with
<hop> i know my demands are unreasonable, but a guy can dream, can't he?




Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Seth Goldberg

Chris Wedgwood wrote:
> 
> On Fri, May 04, 2001 at 05:26:57PM -0700, Joseph Carter wrote:
> 
> I don't see how they figure, but in case there was any doubt I
> have a VIA KT133A/686B board (Abit KT7A) and don't experience
> anything resembling disk corruption unless the box crashes for
> some other reason.  I do seem to be experiencing AGP problems in
> spades, but my disks at least are fine.
> 
> I too seem no disk problems whatsoever (nothing really interesting
> there, many people do not) but am also seeing AGP problems.
> 
> In fact, I had to disable AGP to stop X locking the box hard... yet
> agpgart and the video driver (NVidia[1]) both claim to support the
> chipset -- does anyone actually have this working?)

  My IWILL (KT133A) + GeForce 256 are working fine over AGP.

  --S



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Dieter Nützel

> What still stands out is that exactly _zero_ people have reported the same
> problem with non VIA chipset Athlons.

Sorry Alan, but...

My (very) old Athlon 550 (model 1, stepping 2) shows it on my MSI MS-6167 (AMD
Irongate C4) with your 2.4.4-ac5 now :-(
It happens with or without apm/acpi enabled.
It freezes during "Freeing unused kernel memory: 172k freed".
I never saw this before.

I am open to testing any fixes...

-Dieter

SuSE 7.1 (glibc-2.2, gcc-2.95.2)

Linux version 2.4.4 (root@SunWave1) (gcc version 2.95.2 19991024 (release)) 
#1 Sun Apr 29 02:30:34 CEST 2001
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 13ff (usable)
 BIOS-e820: 13ff - 13ff3000 (ACPI NVS)
 BIOS-e820: 13ff3000 - 1400 (ACPI data)
 BIOS-e820:  - 0001 (reserved)
Scan SMP from c000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f for 65536 bytes.
Scan SMP from c009f800 for 4096 bytes.
On node 0 totalpages: 81904
zone(0): 4096 pages.
zone(1): 77808 pages.
zone(2): 0 pages.
mapped APIC to e000 (01555000)

SunWave1>cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 1
model name      : AMD-K7(tm) Processor
stepping        : 2
cpu MHz         : 548.950
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat mmx syscall mmxext 3dnowext 3dnow
bogomips        : 1094.45

-- 
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
Cognitive Systems Group
Vogt-Kölln-Straße 30
D-22527 Hamburg, Germany

email: [EMAIL PROTECTED]
@home: [EMAIL PROTECTED]



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Joseph Carter

On Fri, May 04, 2001 at 06:26:14PM -0400, Aaron Tiensivu wrote:
> This might be grasping at straws I remember VIA problem in the "good old
> days" of Socket 7 with CPU/PCI Prefetches and especially Read-around-Write
> settings that would cause issues like we're seeing with the Athlon
> pre-fetches. This could be (total conjecture) related somehow to the
> corruption bugs they are admitting to in the 686B although they are blaming
> the SB Live now.

I don't see how they figure, but in case there was any doubt I have a VIA
KT133A/686B board (Abit KT7A) and don't experience anything resembling
disk corruption unless the box crashes for some other reason.  I do seem
to be experiencing AGP problems in spades, but my disks at least are fine.

-- 
Joseph Carter <[EMAIL PROTECTED]>Free software developer

<_Anarchy_> Argh.. who's handing out the paper bags  8)




Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Aaron Tiensivu


> What still stands out is that exactly _zero_ people have reported the same
> problem with non VIA chipset Athlons.

This might be grasping at straws, but I remember VIA problems in the "good old
days" of Socket 7 with CPU/PCI prefetches and especially Read-around-Write
settings that would cause issues like the ones we're seeing with the Athlon
prefetches. This could be (total conjecture) related somehow to the
corruption bugs they are admitting to in the 686B, although they are blaming
the SB Live now.






Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Alan Cox

> prefetch 320(%0) can fetch memory behind the end of the source page.
> Perhaps it accesses memory in the ISA hole, or beyond the end of memory?
> Could you post the e820 map from dmesg?
> 
> It's possible to build manually a memory map.
> Could you build one with wide margins from "dangerous" areas? (untested:
> mem=exactmap mem=620k@0 mem=<your mem in MB-2>M@1M)
> 
> Then boot with prefetch enabled.

That might not be the actual bug, but for rev 1 Athlons it is a real bug. The
first-stepping Athlons have an unfortunate problem in that they will prefetch
memory marked uncacheable and corrupt their caches with it.

Arjan - care to unroll the tail 320 bytes of copying from the main loop ?
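
A minimal sketch of what "unrolling the tail" could look like, written in portable C rather than the kernel's MMX assembly. It assumes 64 bytes copied per loop iteration and uses the GCC/Clang builtin __builtin_prefetch as a stand-in for the Athlon prefetch instruction; the function name and constants are hypothetical and this is not the actual mmx.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SPLIT_PAGE_BYTES  ((size_t)4096) /* i386 page size */
#define SPLIT_CHUNK_BYTES ((size_t)64)   /* assumed bytes copied per iteration */
#define SPLIT_AHEAD_BYTES ((size_t)320)  /* prefetch distance from the thread */

/*
 * Copy one page. The main loop prefetches 320 bytes ahead, but only while
 * the prefetch target is still inside the source page. The last 320 bytes
 * are copied by a separate tail loop that issues no prefetch at all.
 * Returns the number of prefetch hints issued (for illustration only).
 */
static unsigned copy_page_split(void *dst, const void *src)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    unsigned hints = 0;
    size_t off = 0;

    assert(dst != NULL && src != NULL);

    /* Main loop: every prefetch address lies within the source page. */
    for (; off + SPLIT_AHEAD_BYTES < SPLIT_PAGE_BYTES; off += SPLIT_CHUNK_BYTES) {
        __builtin_prefetch(s + off + SPLIT_AHEAD_BYTES);
        memcpy(d + off, s + off, SPLIT_CHUNK_BYTES);
        hints++;
    }

    /* Tail: the remaining bytes, copied without touching the next page. */
    for (; off < SPLIT_PAGE_BYTES; off += SPLIT_CHUNK_BYTES)
        memcpy(d + off, s + off, SPLIT_CHUNK_BYTES);

    return hints;
}
```

With these constants the main loop runs 59 times and the tail loop 5 times, so the copy never hints at an address beyond the source page.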




Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Alan Cox

> the memory copy in the fast_page_copy routine.  The machine then
> proceeded
> not to stop at my panic, but I got my "normal" oopses.  I then had an

Ok

> idea and removed all the prefetch instructions from the beginning of the
> routine and tried the resultin kernel.  I now have no crashes.
> What could this mean?

I think it has to mean a hardware problem.

> Here is a nother patch just so you can keep me honest if I
> made another mistake:

There is a mistake but you won't trigger it. It is no longer 26 bytes 8)
That patch is only used when the prefetchw faults with an illegal instruction
and is done so you can boot an athlon kernel on a lesser cpu
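
For readers puzzling over the `movw $0x1AEB, 1b` line in Seth's diff: x86 is little-endian, so the 16-bit immediate lands in memory as the bytes EB 1A, and EB is the opcode for a short jump with an 8-bit displacement. The sketch below just splits the immediate that way; the struct and function names are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* A short jump as it appears in memory: opcode byte, then signed offset. */
struct short_jmp {
    uint8_t opcode; /* 0xEB is the x86 "JMP rel8" opcode */
    int8_t  disp;   /* bytes to skip, counted from the end of the jump */
};

/* Split a 16-bit immediate the way a little-endian store lays it out. */
static struct short_jmp decode_patch_word(uint16_t imm)
{
    struct short_jmp j;

    j.opcode = (uint8_t)(imm & 0xff); /* low byte is stored first */
    j.disp   = (int8_t)(imm >> 8);    /* high byte follows it */
    return j;
}
```

For 0x1AEB this yields opcode 0xEB and a displacement of 26, which matches the "jmp on 26 bytes" comment in the diff. If the code being skipped changes length, the displacement would have to change with it.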

The prefetch instructions hint to the CPU what memory we will access very soon.
The primary effect of that is that we hit full theoretical memory bandwidth
when copying pages. It doesn't really change execution behaviour in any other
way, which does rather point to a CPU or other hardware problem. The very
early Athlons had prefetch bugs, but we would not trigger those and no reporters
have such an early CPU.

What still stands out is that exactly _zero_ people have reported the same
problem with non VIA chipset Athlons.




Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Brian Gerst

Seth Goldberg wrote:
> 
> Hi,
> 
>  After removing my head from my a**, I revised the code that checks
> the memory copy in the fast_page_copy routine.  The machine then
> proceeded
> not to stop at my panic, but I got my "normal" oopses.  I then had an
> idea and removed all the prefetch instructions from the beginning of the
> routine and tried the resultin kernel.  I now have no crashes.
> What could this mean?

What are your "normal" oopses?

--

Brian Gerst



Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Seth Goldberg



On Fri, 4 May 2001, Manfred Spraul wrote:

| > ---
| > >   __asm__ __volatile__ (
| > 158c157
| > <   "3: movw $0x1AEB, 1b\n"
| > ---
| > >   "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
| > 166c165
| > < */
| > ---
| > >
| > 170c169
| > <   "1: nop\n" /* prefetch 320(%0)\n" */
| > ---
| > >   "1: prefetch 320(%0)\n" 
| > -
| >   Please let me know if that makes sense :).
| 
| Very interesting.
| You've removed only the prefetch 320(%0), not the other prefetch
| instructions?

  No, I have removed them all -- I just commented the block above it
completely out :). (maybe you didn't see the whole patch?)

  --Seth






Re: REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Manfred Spraul

> ---
> >   __asm__ __volatile__ (
> 158c157
> <   "3: movw $0x1AEB, 1b\n"
> ---
> >   "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
> 166c165
> < */
> ---
> >
> 170c169
> <   "1: nop\n" /* prefetch 320(%0)\n" */
> ---
> >   "1: prefetch 320(%0)\n" 
> -
>   Please let me know if that makes sense :).

Very interesting.
You've removed only the prefetch 320(%0), not the other prefetch
instructions?

prefetch 320(%0) can fetch memory behind the end of the source page.
Perhaps it accesses memory in the ISA hole, or beyond the end of memory?
Could you post the e820 map from dmesg?
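
To make the overrun concrete, here is a small sketch that counts how many loop iterations issue a prefetch at or past the end of the source page. It assumes a 4096-byte page and 64 bytes copied per iteration; the helper names are hypothetical and the code only models the address arithmetic, not the kernel routine itself.

```c
#include <assert.h>

#define PAGE_BYTES     4096u /* i386 page size */
#define CHUNK_BYTES      64u /* assumed bytes copied per loop iteration */
#define PREFETCH_AHEAD  320u /* distance used by "prefetch 320(%0)" */

/* Count iterations whose prefetch address is at or beyond the page end. */
static unsigned prefetch_overruns(unsigned page, unsigned chunk, unsigned ahead)
{
    unsigned count = 0;

    assert(chunk != 0 && page % chunk == 0);
    for (unsigned off = 0; off < page; off += chunk)
        if (off + ahead >= page)
            count++;
    return count;
}

/* Offset of the first iteration that prefetches outside the page. */
static unsigned first_overrun_offset(unsigned page, unsigned ahead)
{
    return ahead >= page ? 0 : page - ahead;
}
```

Under these assumptions the last five iterations (source offsets 3776 through 4032) prefetch into whatever follows the page, which is where a hole or the end of RAM could come into play.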

It's possible to build a memory map manually.
Could you build one with wide margins from "dangerous" areas? (untested:
mem=exactmap mem=620k@0 mem=<your mem in MB-2>M@1M)

Then boot with prefetch enabled.
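
As a rough illustration of the (untested) boot line above, the sketch below fills in the size of the upper region as installed RAM in MB minus 2. The function name is hypothetical, and the 512 MB figure in the note after the code is only an example value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "mem=exactmap mem=620k@0 mem=<total_mb - 2>M@1M" into buf.
 * Returns what snprintf returns: the length the full string needs.
 */
static int build_mem_args(char *buf, size_t len, unsigned total_mb)
{
    assert(buf != NULL && total_mb > 2);
    return snprintf(buf, len, "mem=exactmap mem=620k@0 mem=%uM@1M",
                    total_mb - 2);
}
```

For a machine with 512 MB this produces "mem=exactmap mem=620k@0 mem=510M@1M", leaving a margin below the 640k boundary and at the top of RAM.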
--
Manfred



REVISED: Experimentation with Athlon and fast_page_copy

2001-05-04 Thread Seth Goldberg

Hi,
 
 After removing my head from my a**, I revised the code that checks
the memory copy in the fast_page_copy routine.  The machine then proceeded
not to stop at my panic, but I got my "normal" oopses.  I then had an
idea and removed all the prefetch instructions from the beginning of the
routine and tried the resulting kernel.  I now have no crashes.
What could this mean?

Here is another patch just so you can keep me honest if I
made another mistake:

-
diff -r ./arch/i386/lib/mmx.c ../lin2/linux/arch/i386/lib/mmx.c
149,150c149
<
< /*__asm__ __volatile__ (
---
>   __asm__ __volatile__ (
158c157
<   "3: movw $0x1AEB, 1b\n"
---
>   "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
166c165
< */
---
>
170c169
<   "1: nop\n" /* prefetch 320(%0)\n" */
---
>   "1: prefetch 320(%0)\n" 
-

  Please let me know if that makes sense :).

  Thank you,
   Seth


