I would be interested in seeing this code without the quota code in it.
Right now I don't run the quota patch.

I also had a crash with 2.4.20-ac2-ctx16.
The only difference is that the BUG now fires at a very different line,
but it is still in sched.c and still seems to have the same cause.
However, I got 4 days out of it instead of only 2 as before.
Considering the discussion about semaphores and softirqs, and going by
the little assembler I know and understand (it has been over 9 years,
and I was no good at it), the argument seems to have some merit and
sounds like a plausible explanation.
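
As I understand the argument, a semaphore only goes to sleep (and so
calls schedule()) when it is contended, and schedule() refuses to run
in interrupt context. That might also be why the box can run for days
before it falls over. Below is a toy Python model of that rule. It is
not kernel code: ToyCpu, ToySemaphore and SchedulingInInterrupt are
names made up for illustration, and the exception merely stands in for
the BUG in sched.c.

```python
class SchedulingInInterrupt(Exception):
    """Stands in for the BUG raised when schedule() runs in IRQ context."""


class ToyCpu:
    """Hypothetical CPU state: irq_depth > 0 means hard or soft IRQ context."""

    def __init__(self):
        self.irq_depth = 0

    def in_interrupt(self):
        return self.irq_depth > 0

    def schedule(self):
        # Sleeping is only legal in process context.
        if self.in_interrupt():
            raise SchedulingInInterrupt("scheduling in interrupt")


class ToySemaphore:
    """Hypothetical sleeping lock: a contended down() has to call schedule()."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.held = False

    def down(self):
        if self.held:
            # Contended: a real semaphore would sleep here.
            self.cpu.schedule()
            return False  # toy model: the caller did not get the lock
        self.held = True
        return True

    def up(self):
        self.held = False


def try_down(sem):
    """Return 'ok', 'would sleep' or 'BUG' for one down() attempt."""
    try:
        return "ok" if sem.down() else "would sleep"
    except SchedulingInInterrupt:
        return "BUG"


cpu = ToyCpu()
sem = ToySemaphore(cpu)
cpu.irq_depth = 1            # pretend we are inside do_softirq()
first = try_down(sem)        # uncontended: succeeds even in IRQ context
second = try_down(sem)       # contended: needs schedule(), which is illegal here
print(first, second)
```

In this model the first down() gets through and only the contended
second one trips the check, which matches a crash that shows up only
now and then.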

Bug trace follows:

ksymoops 2.4.5 on i686 2.4.20-ac2-ctx16.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-ac2-ctx16/ (default)
     -m /boot/System.map-2.4.20-ac2-ctx16 (default)

kernel BUG at sched.c:990!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c0115f61>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 00000002   ebx: c02b8220   ecx: c02b8224   edx: ffffffff
esi: dce8c000   edi: dce8d9b4   ebp: dce8d98c   esp: dce8d974
ds: 0018   es: 0018   ss: 0018
Process httpd (pid: 1894, stackpage=dce0d000)
Stack: 00000000 00000000 00000000 c02b8220 dce8c000 dce8d9b4 eee70bc0 c026af4c
       00000212 c02b8220 e8d7b840 e8a3aa40 c026ae29 c02b8220 dce8d9b4 ffffffff
       c02b8224 c02b8224 dce8c000 00000002 e38cfb40 c0125e8a f090f000 c0244488
Call Trace:    [<c026af4c>] [<c026ae29>] [<c0125e8a>] {<c0244488>] [<c0138fef>]
  [<c0241687>] [<c0244a4f>] [<c0138fef>] [<c01288c0>] [<c0128a19>] [<c01c231e>]
  [<c01f2a65>] [<c01f34c8>] [<c02119b8>] [<c024831d>] [<c024876e>] [<c024181c>]
  [<c0241ba3>] [<c024205a>] [<c0229bb0>] [<c0229cfb>] [<c021ce43>] [<c0229bb0>]
  [<c022994d>] [<c0229bb0>] [<c0229f00>] [<c0216631>] [<c021675d>] [<c021686a>]
  [<c011da94>] [<c010a47e>] [<c010c9d8>] [<c0269226>] [<c02121f1>] [<c0124fd4>]
  [<c0145445>] [<c0146568>] [<c013a523>] [<c013a91a>] [<c0108db7>]
Code: 0f 0b de 03 10 92 27 c0 e9 29 fd ff ff 89 f6 55 89 e5 57 89


>>EIP; c0115f61 <schedule+261/270>   <=====
>>ebx; c02b8220 <uts_sem+0/20>
>>ecx; c02b8224 <uts_sem+4/20>
>>edx; ffffffff <END_OF_CODE+f704694/????>
>>esi; dce8c000 <_end+1cb4d03c/304d90bc>
>>edi; dce8d9b4 <_end+1cb4e9f0/304d90bc>
>>ebp; dce8d98c <_end+1cb4e9c8/304d90bc>
>>esp; dce8d974 <_end+1cb4e9b0/304d90bc>

Trace; c026af4c <rwsem_down_failed_common+4c/344d>
Trace; c026ae29 <rwsem_down_write_failed+29/40>
Trace; c0125e8a <.text.lock.sys+aa/160>
Trace; c0241687 <tcp_v4_syn_recv_sock+47/180>
Trace; c0244a4f <tcp_check_req+1cf/420>
Trace; c0138fef <page_add_rmap+2f/80>
Trace; c01288c0 <do_no_page+e0/1b0>
Trace; c0128a19 <handle_mm_fault+89/150>
Trace; c01c231e <intr_handler+be/f0>
Trace; c01f2a65 <ide_build_sglist+95/180>
Trace; c01f34c8 <__ide_dma_begin+38/50>
Trace; c02119b8 <sock_def_readable+58/60>
Trace; c024831d <udp_queue_rcv_skb+15d/180>
Trace; c024876e <udp_rcv+1de/330>
Trace; c024181c <tcp_v4_hnd_req+5c/170>
Trace; c0241ba3 <tcp_v4_do_rcv+133/190>
Trace; c024205a <tcp_v4_rcv+45a/510>
Trace; c0229bb0 <ip_local_deliver_finish+0/170>
Trace; c0229cfb <ip_local_deliver_finish+14b/170>
Trace; c021ce43 <nf_hook_slow+b3/180>
Trace; c0229bb0 <ip_local_deliver_finish+0/170>
Trace; c022994d <ip_local_deliver+4d/70>
Trace; c0229bb0 <ip_local_deliver_finish+0/170>
Trace; c0229f00 <ip_rcv_finish+1e0/260>
Trace; c0216631 <netif_receive_skb+111/1d0>
Trace; c021675d <process_backlog+6d/110>
Trace; c021686a <net_rx_action+6a/100>
Trace; c011da94 <do_softirq+94/a0>
Trace; c010a47e <do_IRQ+9e/a0>
Trace; c010c9d8 <call_do_IRQ+5/d>
Trace; c0269226 <memcpy+26/60>
Trace; c02121f1 <__kfree_skb+101/160>
Trace; c0124fd4 <sys_newuname+c4/100>
Trace; c0145445 <path_release+15/40>
Trace; c0146568 <open_namei+238/5c0>
Trace; c013a523 <filp_open+43/70>
Trace; c013a91a <sys_open+8a/a0>
Trace; c0108db7 <system_call+33/38>

Code;  c0115f61 <schedule+261/270>
00000000 <_EIP>:
Code;  c0115f61 <schedule+261/270>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c0115f63 <schedule+263/270>
   2:   de 03                     fiadd  (%ebx)
Code;  c0115f65 <schedule+265/270>
   4:   10 92 27 c0 e9 29         adc    %dl,0x29e9c027(%edx)
Code;  c0115f6b <schedule+26b/270>
   a:   fd                        std
Code;  c0115f6c <schedule+26c/270>
   b:   ff                        (bad)
Code;  c0115f6d <schedule+26d/270>
   c:   ff 89 f6 55 89 e5         decl   0xe58955f6(%ecx)
Code;  c0115f73 <__wake_up+3/70>
  12:   57                        push   %edi
Code;  c0115f74 <__wake_up+4/70>
  13:   89 00                     mov    %eax,(%eax)

 <0>Kernel Panic: Aiee, killing interrupt handler!
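
The check argued for in the quoted mail below (a sleeping-lock helper
showing up inside do_softirq) can be applied to the decoded Trace;
lines mechanically. A rough Python sketch follows. The list of "may
sleep" name prefixes and the IRQ entry names are my own assumptions for
illustration, and like the argument itself it assumes the backtrace is
continuous, which a 2.4 stack dump does not guarantee.

```python
import re

# Matches decoded ksymoops lines such as:
#   Trace; c011da94 <do_softirq+94/a0>
TRACE_RE = re.compile(r"^Trace;\s+([0-9a-f]{8})\s+<([^+>]+)\+")

# Assumption: symbol prefixes treated as "may sleep" for this sketch only.
SLEEPING_PREFIXES = ("rwsem_down_", "__down")
# Assumption: frames taken to mark entry into interrupt/softirq context.
IRQ_ENTRY = ("do_softirq", "do_IRQ")


def trace_symbols(text):
    """Return symbol names from decoded Trace; lines, innermost frame first."""
    symbols = []
    for line in text.splitlines():
        match = TRACE_RE.match(line.strip())
        if match:
            symbols.append(match.group(2))
    return symbols


def sleeps_inside_irq(symbols):
    """True if a may-sleep frame is listed before (inside) an IRQ entry frame."""
    for i, name in enumerate(symbols):
        if name.startswith(SLEEPING_PREFIXES) and any(
            outer in IRQ_ENTRY for outer in symbols[i + 1:]
        ):
            return True
    return False


# A few lines copied from the trace above.
sample = """\
Trace; c026af4c <rwsem_down_failed_common+4c/344d>
Trace; c026ae29 <rwsem_down_write_failed+29/40>
Trace; c0125e8a <.text.lock.sys+aa/160>
Trace; c0241687 <tcp_v4_syn_recv_sock+47/180>
Trace; c021686a <net_rx_action+6a/100>
Trace; c011da94 <do_softirq+94/a0>
Trace; c010a47e <do_IRQ+9e/a0>
Trace; c0108db7 <system_call+33/38>
"""

print(sleeps_inside_irq(trace_symbols(sample)))
```

A True result only says the pattern is present in the listing; stale
addresses left on the stack can produce the same picture.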

Tuesday, March 18, 2003, 10:35:39 AM, you wrote:

JS> At 14:49 on Tue 18/03/03, [EMAIL PROTECTED] masquerading as 'Herbert Poetzl' wrote:
>> okay, I should rephrase:
>> 
>> - it is obvious that (as the BUG message states)
>>   schedule() is called from a (soft)irq, which
>>   in turn bails out (correct behaviour ;)
>> 
>> - I can follow your deduction that (if the crash
>>   is only ctx related, which seems very likely)
>>   the schedule() got triggered by some semaphore
>>   acquisition, which failed (for whatever reason)
>>   
>> - what I was unable to trace so far is the path, 
>>   on which sys_assign_ip_info (for example)
>>   gets called from a softirq ...
>>   
>> tcp_create_openreq_child -> sys_assign_ip_info
>> tcp_v4/6_syn_recv_sock -> tcp_create_openreq_child
>> tcp_check_req -> syn_recv_sock()
>> tcp_v4/6_hnd_req -> tcp_check_req
>> tcp_v4/6_do_rcv -> tcp_v4/6_hnd_req
>> tcp_v4/6_rcv -> tcp_v4/6_do_rcv
>> 
>> ip_mr_input -> ip_local_deliver (multicast)
>> 
>> process_backlog -> netif_receive_skb

JS> The argument in a nutshell is:

JS>   *) do_softirq() occurs in the backtrace

JS>   *) semaphores, which are not allowed at IRQ level, are spotted near
JS>      the oops

JS>   *) assuming that the backtrace is continuous

JS>   =) this indicates that semaphores are being used illegally

JS>   =>) replace semaphores with spinlocks or redesign

JS> I want to say 'dammit, it works and it's "obvious"', but you're quite
JS> correct, this is not rigorous enough!

JS> Another decoded oops (from the email I sent to the list and cc'd you on
JS> last Thursday) replete with tortuous ascii code path:

JS> //----------------------------------------------------------------------

JS>  X rwsem_down_write_failed   
JS>  ? .text.lock.sys            
JS>  ^ tcp_create_openreq_child  <== bingo
JS>  ^ tcp_v4_syn_recv_sock
JS>  \ tcp_check_req -<-<-<-<-_  <== only called by tcp_v4_hnd_req
JS>                            \     
JS>    __alloc_pages           |
JS>    do_no_page              ^
JS>    __kfree_skb             |
JS>    ipt_do_table            ^
JS>    ipt_do_table            |
JS>    ip_local_deliver_finish ^
JS>    ipt_do_table            |
JS>    netif_rx                ^
JS>                            |
JS>  / tcp_v4_hnd_req -->->->-/ 
JS>  ^ tcp_v4_do_rcv
JS>  \ tcp_v4_rcv -<-<-<-<-<-_  <== Setup as static struct inet_protocol
JS>                           \     tcp_protocol:handler in protocol.c
JS>                            \
JS>    ip_local_deliver_finish  \
JS>    ip_local_deliver_finish   |
JS>    nf_hook_slow              |
JS>                              |
JS>  / ip_local_deliver_finish -/  <== Calls struct inet_protocol*ipprot->
JS>  ^                                                          handler() 
JS>  \ ip_local_deliver  -<-_   <== route.c in various places sets 
JS>                          \        struct rtable * rth->u.dst.input =
JS>                           \                          ip_local_deliver;
JS>                            \ 
JS>    ip_local_deliver_finish  |
JS>    ip_rcv_finish            |
JS>    nf_iterate              /
JS>    nf_hook_slow           /
JS>                          /
JS>  / ip_rcv_finish ->->->-/  <= calls ip_route_input() which links rth
JS>  ^                            above to skb, then calls skb->dst->
JS>  |                                                        input(skb);
JS>  ^
JS>  \ ip_rcv -<-<-<-<-<-<-_   <= ip_rcv() set as struct packet_type
JS>                               \     ip_packet_type->func in ip_output.c
JS>                         ^
JS>    ip_rcv_finish        |
JS>                         ^
JS>                         |
JS>  / netif_receive_skb --/   <= calls struct packet_type * pt_prev->func()
JS>  ^ process_backlog
JS>  ^ net_rx_action             
JS>  \ do_softirq                <== dev.c's net_dev_init() calls:
JS>                                  open_softirq(NET_RX_SOFTIRQ,
JS>                                               net_rx_action, NULL);
JS>    .text.lock.af_inet
JS>    inet_stream_connect
JS>    sys_connect
JS>    sock_map_fd
JS>    sys_socket
JS>    sys_socketcall
JS>    sys_write
JS>    do_page_fault

JS> //----------------------------------------------------------------------

JS> Enough?

JS> The answer had better be 'yes' 'cos that's your lot for now :)


JS> Regards,
JS> Jonathan





Best regards,
 Eje Gustafsson                       mailto:[EMAIL PROTECTED]
---
The Family Entertainment Network      http://www.fament.com
Phone : 620-231-7777                  Fax   : 620-231-4066
eBay UserID : macahan
          - Your Full Time Professionals -
