25.05.2010 02:01, Steven Dake wrote:
> On 05/24/2010 02:51 PM, Vladislav Bogdanov wrote:
>> Hi all,
>>
>> Sorry for breaking the thread (no "References" header); I just subscribed.
>>
>>   
>>> On Fri, 2010-05-21 at 16:19 +0200, Alain.Moulle wrote:
>>>     
>>>> Hi
>>>>
>>>> These new releases of corosync do not start successfully on RHEL5:
>>>> corosync-1.2.2-1.1.el5
>>>> corosynclib-1.2.2-1.1.el5
>>>> I've attached the message trace.
>>>>
>>>> On the same machines, the old releases work fine:
>>>> corosync-1.2.1-1.el5
>>>> corosynclib-1.2.1-1.el5
>>>>
>>>> I've reinstalled the old ones and it works fine again.
>>>> I won't be able to test the new releases any further for about
>>>> 10 days.
>>>>
>>>> Regards
>>>> Alain
>>>>
>>>>        
>>> Building from sources on rhel5, corosync starts properly.  I didn't give
>>> pacemaker a go.
>>>
>>> Could you provide more information:
>>> 1) Where did you download the corosync rpms?
>>> 2) Which version of RHEL are you running?
>>>
>>> Then I can look into reproducing it.
>>>      
>> I confirm that both 1.2.2 and 1.2.3 segfault on CentOS 5.5 when
>> pacemaker is enabled (this is the critical part; corosync alone starts
>> just fine). Tried with both the clusterlabs 1.2.2-1.1 RPM and a
>> home-brew 1.2.3 RPM.
>>
>> The segfault originates from exec/logsys.c:760, in strlen(rec->buffer).
>>
>> I can't post gdb output: the console buffer was lost because of an
>> urgent downgrade.
>>
>>    
> 
> Reproducible?

100%

Note that the arch is x86_64.

> 
> can you run corosync-fplay and send the list the output.

Not much info there.

Starting replay: head [1311] tail [0]
rec=[1] Log Message=Corosync Cluster Engine ('1.2.3'): started and ready to provide service.
rec=[2] Log Message=Corosync built-in features: nss rdma
rec=[3] Log Message=Successfully read main configuration file '/etc/corosync/corosync.conf'.
rec=[4] Log Message=Token Timeout (3000 ms) retransmit timeout (294 ms)
rec=[5] Log Message=token hold (225 ms) retransmits before loss (10 retrans)
rec=[6] Log Message=join (60 ms) send_join (0 ms) consensus (3600 ms) merge (200 ms)
rec=[7] Log Message=downcheck (1000 ms) fail to recv const (50 msgs)
rec=[8] Log Message=seqno unchanged const (30 rotations) Maximum network MTU 1402
rec=[9] Log Message=window size per rotation (50 messages) maximum messages per rotation (20 messages)
rec=[10] Log Message=send threads (0 threads)
rec=[11] Log Message=RRP token expired timeout (294 ms)
rec=[12] Log Message=RRP token problem counter (2000 ms)
rec=[13] Log Message=RRP threshold (10 problem count)
rec=[14] Log Message=RRP mode set to passive.
rec=[15] Log Message=heartbeat_failures_allowed (0)
rec=[16] Log Message=max_network_delay (50 ms)
rec=[17] Log Message=HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
rec=[18] Log Message=Initializing transport (UDP/IP).
rec=[19] Log Message=Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
rec=[20] Log Message=Initializing transport (UDP/IP).
rec=[21] Log Message=Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
rec=[22] Log Message=you are using ipc api v2
rec=[23] Log Message=Receive multicast socket recv buffer size (262142 bytes).
rec=[24] Log Message=Transmit multicast socket send buffer size (262142 bytes).
rec=[25] Log Message=The network interface [10.5.250.2] is now up.
rec=[26] Log Message=Created or loaded sequence id 296.10.5.250.2 for this ring.
rec=[27] Log Message=info: process_ais_conf: Reading configure
rec=[28] Log Message=info: config_find_init: Local handle: 2730409743423111170 for logging
rec=[29] Log Message=info: config_find_next: Processing additional logging options...
rec=[30] Log Message=info: get_config_opt: Found 'off' for option: debug
rec=[31] Log Message=info: get_config_opt: Defaulting to 'off' for option: to_file
rec=[32] Log Message=info: get_config_opt: Found 'yes' for option: to_syslog
rec=[33] Log Message=info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
rec=[34] Log Message=info: config_find_init: Local handle: 5880381755227111427 for service
rec=[35] Log Message=info: config_find_next: Processing additional service options...
rec=[36] Log Message=info: get_config_opt: Defaulting to 'pcmk' for option: clustername
rec=[37] Log Message=info: get_config_opt: Defaulting to 'no' for option: use_logd
rec=[38] Log Message=info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
rec=[39] Log Message=info: pcmk_startup: CRM: Initialized
rec=[40] Log Message=Logging: Initialized pcmk_startup
rec=[41] Log Message=info: pcmk_startup: Maximum core file size is: 18446744073709551615
rec=[42] Log Message=info: pcmk_startup: Service: 9
Finishing replay: records found [42]

Hmm... I was wrong. There IS some info:
rec 42 doesn't show up on stderr if I enable stderr logging later.

> 
> Please send your conf file.
# cat /etc/corosync/corosync.conf
compatibility: none

totem {
  version: 2
  token: 3000
  token_retransmits_before_loss_const: 10
  join: 60
#  consensus: 1500
#  vsftype: none
  max_messages: 20
  clear_node_high_bit: yes
#  secauth: on
  threads: 0
  rrp_mode: passive
  interface {
    ringnumber: 0
    bindnetaddr: 10.5.250.0
    mcastaddr: 239.94.1.1
    mcastport: 5405
  }
  interface {
    ringnumber: 1
    bindnetaddr: 10.5.4.0
    mcastaddr: 239.94.2.1
    mcastport: 5405
  }
}
logging {
        fileline: off
        to_stderr: no
        to_logfile: no
        to_syslog: yes
        logfile: /tmp/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}

service {
    name: pacemaker
    ver:  0
}

aisexec {
    user:   root
    group:  root
}


Here is the gdb backtrace:

# stty -tostop
# gdb `which corosync`
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/corosync...Reading symbols from
/usr/lib/debug/usr/sbin/corosync.debug...done.
done.
(gdb) set args -f
(gdb) run
Starting program: /usr/sbin/corosync -f
[Thread debugging using libthread_db enabled]
[New Thread 0x40a00940 (LWP 30752)]
[New Thread 0x40a18fe0 (LWP 30753)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x40a00940 (LWP 30752)]
0x00000033812797c0 in strlen () from /lib64/libc.so.6
(gdb) bt
#0  0x00000033812797c0 in strlen () from /lib64/libc.so.6
#1  0x00002aaaaace4c6b in logsys_worker_thread (data=<value optimized out>) at logsys.c:760
#2  0x0000003381a0673d in start_thread () from /lib64/libpthread.so.0
#3  0x00000033812d3d1d in clone () from /lib64/libc.so.6
(gdb) bt full
#0  0x00000033812797c0 in strlen () from /lib64/libc.so.6
No symbol table info available.
#1  0x00002aaaaace4c6b in logsys_worker_thread (data=<value optimized out>) at logsys.c:760
        rec = 0x2aaaaaee5cc8
        dropped = 0
#2  0x0000003381a0673d in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00000033812d3d1d in clone () from /lib64/libc.so.6
No symbol table info available.
(gdb) info threads
  3 Thread 0x40a18fe0 (LWP 30753)  0x0000003381a0d48e in __lll_lock_wait_private () from /lib64/libpthread.so.0
* 2 Thread 0x40a00940 (LWP 30752)  0x00000033812797c0 in strlen () from /lib64/libc.so.6
  1 Thread 0x2aaaab0f3a60 (LWP 30749)  0x0000003380a145f2 in strcmp () from /lib64/ld-linux-x86-64.so.2
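
For anyone reading the backtrace: a SIGSEGV inside strlen() generally means
the pointer handed to it was NULL or pointed at memory that is not a valid
NUL-terminated string. The sketch below illustrates a defensive length check
around such a call. It is NOT the actual logsys.c code: the struct layout and
the names `log_record` and `safe_record_len` are hypothetical, chosen only to
mirror the strlen(rec->buffer) expression in frame #1. Note also that in the
trace above rec is non-NULL (0x2aaaaaee5cc8), so a NULL check alone would not
necessarily catch this particular crash.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the record type used at logsys.c:760.
 * The real structure is not shown in this thread. */
struct log_record {
    char *buffer;   /* assumed: pointer to a NUL-terminated message */
};

/* Return the message length, or 0 when the record or its buffer is NULL.
 * This guards only against NULL; a dangling or corrupted pointer would
 * still crash inside strlen(). */
static size_t safe_record_len(const struct log_record *rec)
{
    if (rec == NULL || rec->buffer == NULL) {
        return 0;
    }
    return strlen(rec->buffer);
}
```

A guard like this only hides the symptom. If the buffer is being freed or
overwritten by another thread, the underlying race would still need fixing.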

Best,
Vladislav

> 
> Thanks
> -steve
> 
>> Best,
>> Vladislav
>> _______________________________________________
>> Openais mailing list
>> Openais@lists.linux-foundation.org
>> https://lists.linux-foundation.org/mailman/listinfo/openais
>>    
> 
