As far as I can tell, pni_map_entry() only allocates space for a
single additional entry at a time, and hence should never recurse more
than once.  That is, pni_map_ensure() should succeed the first time and
prevent further recursion.
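
For illustration, here is a minimal sketch of the kind of recursion guard
suggested in the quoted JIRA comment. It is not Proton's code: every toy_*
name, the struct layout, and the "broken" flag are hypothetical stand-ins,
and the only thing taken from the thread is the limit of 32. It assumes the
pattern described above, where the entry lookup calls an ensure step and
starts over if the table had to grow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_DEPTH 32 /* limit floated in the quoted JIRA comment */

/* Hypothetical stand-in for the real map; it only tracks sizes. */
typedef struct {
  size_t size;     /* entries in use */
  size_t capacity; /* entries allocated */
  bool broken;     /* simulate an ensure step that always reports growth */
} toy_map_t;

/* Returns true if the table had to grow, so the caller must start over. */
static bool toy_map_ensure(toy_map_t *map, size_t needed)
{
  assert(map != NULL);
  if (map->broken) return true; /* simulated bug: never settles */
  if (needed <= map->capacity) return false;
  if (map->capacity == 0) map->capacity = 1;
  while (map->capacity < needed) map->capacity *= 2;
  return true;
}

/* Returns the recursion depth at which the entry was added,
 * or -1 if the guard tripped. */
static int toy_map_entry(toy_map_t *map, int depth)
{
  assert(map != NULL);
  if (depth > TOY_MAX_DEPTH) {
    /* Dump whatever internal state is available before bailing out. */
    fprintf(stderr, "toy_map_entry: depth=%d size=%zu capacity=%zu\n",
            depth, map->size, map->capacity);
    return -1;
  }
  if (toy_map_ensure(map, map->size + 1)) {
    /* The table grew, so retry; a healthy map gets here at most once. */
    return toy_map_entry(map, depth + 1);
  }
  map->size++;
  return depth;
}
```

With a healthy map the retry happens at most once, so the function returns
a depth of 0 or 1. With the simulated bug the guard trips after 32 retries
instead of running the stack out.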

On Wed, Jul 2, 2014 at 7:32 AM, Alan Conway <acon...@redhat.com> wrote:
> On Tue, 2014-07-01 at 07:15 -0400, Michael Goulish wrote:
>> Yes!
>> Great idea --
>> I will attempt.
>
> I would put #ifndef NDEBUG around this code. We will never test it but
> someday on a vital production server at our biggest customer, somebody
> will use a map with 33 levels of nesting. I can guarantee it.
>
> It would be even better to do proper loop detection, i.e. check if
> you've seen the same map before. That would affect performance but I
> think the effect would be negligible for maps with a "normal" amount of
> nesting.
>
>>
>>
>>
>> ----- Original Message -----
>>
>>     [ https://issues.apache.org/jira/browse/PROTON-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048701#comment-14048701 ]
>>
>> Rafael H. Schloming commented on PROTON-625:
>> --------------------------------------------
>>
>> I think the easiest way to track down this bug would be to put some sort of 
>> detection inside of pni_map_entry and if it recurses more than some limit, 
>> e.g. 32 times or something, then print out a representation of the map's 
>> internal structure. It might also help to use a debug build so you have line 
>> numbers. Is that something you feel comfortable trying? You should be able 
>> to find the relevant code around line 551 of object.c.
>>
>> > Biggest Backtrace Ever!
>> > -----------------------
>> >
>> >                 Key: PROTON-625
>> >                 URL: https://issues.apache.org/jira/browse/PROTON-625
>> >             Project: Qpid Proton
>> >          Issue Type: Bug
>> >          Components: proton-c
>> >    Affects Versions: 0.7
>> >            Reporter: michael goulish
>> >
>> > I am saving all my stuff so I can repro on demand.
>> > It doesn't happen every time, but it's about 50%.
>> > ------------------------------------------
>> > On one box, I have a dispatch router.
>> > On the other box, I have 10 clients: 5 Messenger-based receivers, and 5 
>> > qpid-messaging-based senders.
>> > Each client will handle 100 addresses, of the form "mick/0", "mick/1", 
>> > etc.
>> > 100 messages will be sent to each address.
>> > I start the 5 receivers first.  They start OK.  Dispatch router happy & 
>> > stable.
>> > Wait a few seconds.
>> > I start the 5 senders, from a bash script.
>> > The first sender is already sending when the 2nd, 3rd, 4th start.
>> > After a few of them start, but before all have finished starting, a few 
>> > seconds into the script, the crash occurs. (If they all start up 
>> > successfully, no crash.)
>> > The crash occurs in the dispatch router.
>> > Here is the biggest backtrace ever:
>> > #0  0x0000003cf9879ad1 in _int_malloc (av=0x7f101c000020, bytes=16384) at 
>> > malloc.c:4383
>> > #1  0x0000003cf987a911 in __libc_malloc (bytes=16384) at malloc.c:3664
>> > #2  0x00000039c6c1650a in pni_map_allocate () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #3  0x00000039c6c16a3a in pni_map_ensure () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #4  0x00000039c6c16c45 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #5  0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #6  0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #7  0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #8  0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #9  0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #10 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #11 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #12 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #13 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #14 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > .
>> > .
>> > .
>> > .
>> > #93549 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93550 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93551 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93552 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93553 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93554 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93555 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93556 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93557 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93558 0x00000039c6c16c64 in pni_map_entry () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93559 0x00000039c6c16dc0 in pn_map_put () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93560 0x00000039c6c17226 in pn_hash_put () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93561 0x00000039c6c2a643 in pn_delivery_map_push () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93562 0x00000039c6c2c44b in pn_do_transfer () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93563 0x00000039c6c24385 in pn_dispatch_frame () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93564 0x00000039c6c2448f in pn_dispatcher_input () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93565 0x00000039c6c2d68b in pn_input_read_amqp () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93566 0x00000039c6c3011a in pn_io_layer_input_passthru () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93567 0x00000039c6c3011a in pn_io_layer_input_passthru () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93568 0x00000039c6c2d275 in transport_consume () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93569 0x00000039c6c304cd in pn_transport_process () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93570 0x00000039c6c3e40c in pn_connector_process () from 
>> > /usr/lib64/libqpid-proton.so.2
>> > #93571 0x00007f1060c60460 in process_connector () from 
>> > /home/mick/dispatch/build/libqpid-dispatch.so.0
>> > #93572 0x00007f1060c61017 in thread_run () from 
>> > /home/mick/dispatch/build/libqpid-dispatch.so.0
>> > #93573 0x0000003cf9c07851 in start_thread (arg=0x7f1052bfd700) at 
>> > pthread_create.c:301
>> > #93574 0x0000003cf98e890d in clone () at 
>> > ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
>>
>>
>>
>> --
>> This message was sent by Atlassian JIRA
>> (v6.2#6252)
>
>
