On 06/07/2011 04:44 AM, william felipe_welter wrote: > Two more questions: will the patch for the mmap calls go into the main > development branch for all archs? > Any problem if I send these patches to the Debian project? >
These patches will go into the maintenance branches. You can send them to whoever you like ;) Regards -steve > 2011/6/3 Steven Dake <[email protected]>: >> On 06/02/2011 08:16 PM, william felipe_welter wrote: >>> Well, >>> >>> Now, with this patch, the pacemakerd process starts and brings up its other >>> processes (crmd, lrmd, pengine, ...), but after pacemakerd forks, the forked >>> pacemakerd process dies due to "signal 10, Bus >>> error". And in the log, the Pacemaker processes (crmd, lrmd, >>> pengine, ...) cannot connect to the openais plugin (possibly because of the >>> "death" of the pacemakerd process). >>> But this time, when the forked pacemakerd dies, it generates a core dump. >>> >>> gdb -c "/usr/var/lib/heartbeat/cores/root/ pacemakerd 7986" -se >>> /usr/sbin/pacemakerd : >>> GNU gdb (GDB) 7.0.1-debian >>> Copyright (C) 2009 Free Software Foundation, Inc. >>> License GPLv3+: GNU GPL version 3 or later >>> <http://gnu.org/licenses/gpl.html> >>> This is free software: you are free to change and redistribute it. >>> There is NO WARRANTY, to the extent permitted by law. Type "show copying" >>> and "show warranty" for details. >>> This GDB was configured as "sparc-linux-gnu". >>> For bug reporting instructions, please see: >>> <http://www.gnu.org/software/gdb/bugs/>... >>> Reading symbols from /usr/sbin/pacemakerd...done. >>> Reading symbols from /usr/lib64/libuuid.so.1...(no debugging symbols >>> found)...done. >>> Loaded symbols for /usr/lib64/libuuid.so.1 >>> Reading symbols from /usr/lib/libcoroipcc.so.4...done. >>> Loaded symbols for /usr/lib/libcoroipcc.so.4 >>> Reading symbols from /usr/lib/libcpg.so.4...done. >>> Loaded symbols for /usr/lib/libcpg.so.4 >>> Reading symbols from /usr/lib/libquorum.so.4...done. >>> Loaded symbols for /usr/lib/libquorum.so.4 >>> Reading symbols from /usr/lib64/libcrmcommon.so.2...done. >>> Loaded symbols for /usr/lib64/libcrmcommon.so.2 >>> Reading symbols from /usr/lib/libcfg.so.4...done. 
>>> Loaded symbols for /usr/lib/libcfg.so.4 >>> Reading symbols from /usr/lib/libconfdb.so.4...done. >>> Loaded symbols for /usr/lib/libconfdb.so.4 >>> Reading symbols from /usr/lib64/libplumb.so.2...done. >>> Loaded symbols for /usr/lib64/libplumb.so.2 >>> Reading symbols from /usr/lib64/libpils.so.2...done. >>> Loaded symbols for /usr/lib64/libpils.so.2 >>> Reading symbols from /lib/libbz2.so.1.0...(no debugging symbols >>> found)...done. >>> Loaded symbols for /lib/libbz2.so.1.0 >>> Reading symbols from /usr/lib/libxslt.so.1...(no debugging symbols >>> found)...done. >>> Loaded symbols for /usr/lib/libxslt.so.1 >>> Reading symbols from /usr/lib/libxml2.so.2...(no debugging symbols >>> found)...done. >>> Loaded symbols for /usr/lib/libxml2.so.2 >>> Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done. >>> Loaded symbols for /lib/libc.so.6 >>> Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done. >>> Loaded symbols for /lib/librt.so.1 >>> Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done. >>> Loaded symbols for /lib/libdl.so.2 >>> Reading symbols from /lib/libglib-2.0.so.0...(no debugging symbols >>> found)...done. >>> Loaded symbols for /lib/libglib-2.0.so.0 >>> Reading symbols from /usr/lib/libltdl.so.7...(no debugging symbols >>> found)...done. >>> Loaded symbols for /usr/lib/libltdl.so.7 >>> Reading symbols from /lib/ld-linux.so.2...(no debugging symbols >>> found)...done. >>> Loaded symbols for /lib/ld-linux.so.2 >>> Reading symbols from /lib/libpthread.so.0...(no debugging symbols >>> found)...done. >>> Loaded symbols for /lib/libpthread.so.0 >>> Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done. >>> Loaded symbols for /lib/libm.so.6 >>> Reading symbols from /usr/lib/libz.so.1...(no debugging symbols >>> found)...done. >>> Loaded symbols for /usr/lib/libz.so.1 >>> Reading symbols from /lib/libpcre.so.3...(no debugging symbols >>> found)...done. 
>>> Loaded symbols for /lib/libpcre.so.3 >>> Reading symbols from /lib/libnss_compat.so.2...(no debugging symbols >>> found)...done. >>> Loaded symbols for /lib/libnss_compat.so.2 >>> Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done. >>> Loaded symbols for /lib/libnsl.so.1 >>> Reading symbols from /lib/libnss_nis.so.2...(no debugging symbols >>> found)...done. >>> Loaded symbols for /lib/libnss_nis.so.2 >>> Reading symbols from /lib/libnss_files.so.2...(no debugging symbols >>> found)...done. >>> Loaded symbols for /lib/libnss_files.so.2
>>> Core was generated by `pacemakerd'.
>>> Program terminated with signal 10, Bus error.
>>> #0 cpg_dispatch (handle=17861288972693536769, dispatch_types=7986) at cpg.c:339
>>> 339 switch (dispatch_data->id) {
>>> (gdb) bt
>>> #0 cpg_dispatch (handle=17861288972693536769, dispatch_types=7986) at cpg.c:339
>>> #1 0xf6f100f0 in ?? ()
>>> #2 0xf6f100f4 in ?? ()
>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>
>>> I took a look at cpg.c and saw that dispatch_data is acquired by the coroipcc_dispatch_get function (defined in lib/coroipcc.c):
>>>
>>> do {
>>>     error = coroipcc_dispatch_get (
>>>         cpg_inst->handle,
>>>         (void **)&dispatch_data,
>>>         timeout);
>>>
>> Try the recent patch sent to fix alignment. >> Regards >> -steve >>> Resumed log: >>> ... 
>>> un 02 23:12:20 corosync [CPG ] got mcast request on 0x62500 >>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering f to 10 >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 10 >>> to pending delivery queue >>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including f >>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 10 >>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: start_child: >>> Forked child 7991 for process lrmd >>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: >>> update_node_processes: Node xxxxxxxxxx now has process list: >>> 00000000000000000000000000100112 (was >>> 00000000000000000000000000100102) >>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500 >>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 10 to 11 >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 11 >>> to pending delivery queue >>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 11 >>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: start_child: >>> Forked child 7992 for process attrd >>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: >>> update_node_processes: Node xxxxxxxxxx now has process list: >>> 00000000000000000000000000101112 (was >>> 00000000000000000000000000100112) >>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500 >>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 11 to 12 >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 12 >>> to pending delivery queue >>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 12 >>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: start_child: >>> Forked child 7993 for process pengine >>> Jun 02 23:12:20 xxxxxxxxxx 
pacemakerd: [7986]: info: >>> update_node_processes: Node xxxxxxxxxx now has process list: >>> 00000000000000000000000000111112 (was >>> 00000000000000000000000000101112) >>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500 >>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 12 to 13 >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 13 >>> to pending delivery queue >>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 13 >>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: start_child: >>> Forked child 7994 for process crmd >>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: >>> update_node_processes: Node xxxxxxxxxx now has process list: >>> 00000000000000000000000000111312 (was >>> 00000000000000000000000000111112) >>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500 >>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: main: Starting mainloop >>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 13 to 14 >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 14 >>> to pending delivery queue >>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 14 >>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500 >>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 14 to 15 >>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 15 >>> to pending delivery queue >>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 15 >>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info: Invoked: >>> /usr/lib64/heartbeat/stonithd >>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info: >>> crm_log_init_worker: Changed active directory to >>> /usr/var/lib/heartbeat/cores/root >>> Jun 02 23:12:20 xxxxxxxxxx 
stonith-ng: [7989]: info: get_cluster_type: >>> Cluster type is: 'openais'. >>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info: >>> crm_cluster_connect: Connecting to cluster infrastructure: classic >>> openais (with plugin) >>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info: >>> init_ais_connection_classic: Creating connection to our Corosync >>> plugin >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: crm_log_init_worker: >>> Changed active directory to /usr/var/lib/heartbeat/cores/hacluster >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: retrieveCib: Reading >>> cluster configuration from: /usr/var/lib/heartbeat/crm/cib.xml >>> (digest: /usr/var/lib/heartbeat/crm/cib.xml.sig) >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: WARN: retrieveCib: Cluster >>> configuration not found: /usr/var/lib/heartbeat/crm/cib.xml >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: WARN: readCibXmlFile: Primary >>> configuration corrupt or unusable, trying backup... >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: get_last_sequence: >>> Series file /usr/var/lib/heartbeat/crm/cib.last does not exist >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile: Backup >>> file /usr/var/lib/heartbeat/crm/cib-99.raw not found >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: WARN: readCibXmlFile: >>> Continuing with an empty configuration. 
>>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> <cib epoch="0" num_updates="0" admin_epoch="0" >>> validate-with="pacemaker-1.2" > >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> <configuration > >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> <crm_config /> >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> <nodes /> >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> <resources /> >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> <constraints /> >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> </configuration> >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> <status /> >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] >>> </cib> >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: validate_with_relaxng: >>> Creating RNG parser context >>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info: >>> init_ais_connection_classic: Connection to our AIS plugin (9) failed: >>> Doesn't exist (12) >>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: CRIT: main: Cannot sign >>> in to the cluster... 
terminating >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: info: Invoked: >>> /usr/lib64/heartbeat/crmd >>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: info: Invoked: >>> /usr/lib64/heartbeat/pengine >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: info: crm_log_init_worker: >>> Changed active directory to /usr/var/lib/heartbeat/cores/hacluster >>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: info: crm_log_init_worker: >>> Changed active directory to /usr/var/lib/heartbeat/cores/hacluster >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: info: main: CRM Hg Version: >>> e872eeb39a5f6e1fdb57c3108551a5353648c4f4 >>> >>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: debug: main: Checking for >>> old instances of pengine >>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: debug: >>> init_client_ipc_comms_nodispatch: Attempting to talk on: >>> /usr/var/run/crm/pengine >>> Jun 02 23:12:20 xxxxxxxxxx lrmd: [7991]: info: enabling coredumps >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: info: crmd_init: Starting crmd >>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: debug: >>> init_client_ipc_comms_nodispatch: Could not init comms on: >>> /usr/var/run/crm/pengine >>> Jun 02 23:12:20 xxxxxxxxxx lrmd: [7991]: debug: main: run the loop... >>> Jun 02 23:12:20 xxxxxxxxxx lrmd: [7991]: info: Started. 
>>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: debug: main: Init server comms >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: s_crmd_fsa: Processing >>> I_STARTUP: [ state=S_STARTING cause=C_STARTUP origin=crmd_init ] >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_fsa_action: >>> actions:trace: // A_LOG >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_fsa_action: >>> actions:trace: // A_STARTUP >>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: info: main: Starting pengine >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_startup: >>> Registering Signal Handlers >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_startup: Creating >>> CIB and LRM objects >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_fsa_action: >>> actions:trace: // A_CIB_START >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: >>> init_client_ipc_comms_nodispatch: Attempting to talk on: >>> /usr/var/run/crm/cib_rw >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: >>> init_client_ipc_comms_nodispatch: Could not init comms on: >>> /usr/var/run/crm/cib_rw >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: cib_native_signon_raw: >>> Connection to command channel failed >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: >>> init_client_ipc_comms_nodispatch: Attempting to talk on: >>> /usr/var/run/crm/cib_callback >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: >>> init_client_ipc_comms_nodispatch: Could not init comms on: >>> /usr/var/run/crm/cib_callback >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: cib_native_signon_raw: >>> Connection to callback channel failed >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: cib_native_signon_raw: >>> Connection to CIB failed: connection failed >>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: cib_native_signoff: >>> Signing out of the CIB Service >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: activateCibXml: >>> Triggering CIB write for start op >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: startCib: CIB 
>>> Initialization completed successfully >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: get_cluster_type: >>> Cluster type is: 'openais'. >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: crm_cluster_connect: >>> Connecting to cluster infrastructure: classic openais (with plugin) >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: >>> init_ais_connection_classic: Creating connection to our Corosync >>> plugin >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: >>> init_ais_connection_classic: Connection to our AIS plugin (9) failed: >>> Doesn't exist (12) >>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: CRIT: cib_init: Cannot sign in >>> to the cluster... terminating >>> Jun 02 23:12:21 corosync [CPG ] exit_fn for conn=0x62500 >>> Jun 02 23:12:21 corosync [TOTEM ] mcasted message added to pending queue >>> Jun 02 23:12:21 corosync [TOTEM ] Delivering 15 to 16 >>> Jun 02 23:12:21 corosync [TOTEM ] Delivering MCAST message with seq 16 >>> to pending delivery queue >>> Jun 02 23:12:21 corosync [CPG ] got procleave message from cluster >>> node 1377289226 >>> Jun 02 23:12:21 corosync [TOTEM ] releasing messages up to and including 16 >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: Invoked: >>> /usr/lib64/heartbeat/attrd >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: crm_log_init_worker: >>> Changed active directory to /usr/var/lib/heartbeat/cores/hacluster >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: main: Starting up >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: get_cluster_type: >>> Cluster type is: 'openais'. 
>>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: crm_cluster_connect: >>> Connecting to cluster infrastructure: classic openais (with plugin) >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: >>> init_ais_connection_classic: Creating connection to our Corosync >>> plugin >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: >>> init_ais_connection_classic: Connection to our AIS plugin (9) failed: >>> Doesn't exist (12) >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: ERROR: main: HA Signon failed >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: main: Cluster connection >>> active >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: main: Accepting >>> attribute updates >>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: ERROR: main: Aborting startup >>> Jun 02 23:12:21 xxxxxxxxxx crmd: [7994]: debug: >>> init_client_ipc_comms_nodispatch: Attempting to talk on: >>> /usr/var/run/crm/cib_rw >>> Jun 02 23:12:21 xxxxxxxxxx crmd: [7994]: debug: >>> init_client_ipc_comms_nodispatch: Could not init comms on: >>> /usr/var/run/crm/cib_rw >>> Jun 02 23:12:21 xxxxxxxxxx crmd: [7994]: debug: cib_native_signon_raw: >>> Connection to command channel failed >>> Jun 02 23:12:21 xxxxxxxxxx crmd: [7994]: debug: >>> init_client_ipc_comms_nodispatch: Attempting to talk on: >>> /usr/var/run/crm/cib_callback >>> ... >>> >>> >>> 2011/6/2 Steven Dake <[email protected]>: >>>> On 06/01/2011 11:05 PM, william felipe_welter wrote: >>>>> I recompiled my kernel without hugetlb, and the result is the same. >>>>> >>>>> My test program still outputs: >>>>> PATH=/dev/shm/teste123XXXXXX >>>>> page size=20000 >>>>> fd=3 >>>>> ADDR_ORIG:0xe000a000 ADDR:0xffffffff >>>>> Erro >>>>> >>>>> And Pacemaker still fails because of the mmap error: >>>>> Could not initialize Cluster Configuration Database API instance error 2 >>>>> >>>> >>>> Give the patch I posted recently a spin - corosync WFM with this patch >>>> on sparc64 with hugetlb set. Please report back results. 
>>>> >>>> Regards >>>> -steve >>>> >>>>> To make sure I have disabled hugetlb, here is my /proc/meminfo:
>>>>> MemTotal: 33093488 kB
>>>>> MemFree: 32855616 kB
>>>>> Buffers: 5600 kB
>>>>> Cached: 53480 kB
>>>>> SwapCached: 0 kB
>>>>> Active: 45768 kB
>>>>> Inactive: 28104 kB
>>>>> Active(anon): 18024 kB
>>>>> Inactive(anon): 1560 kB
>>>>> Active(file): 27744 kB
>>>>> Inactive(file): 26544 kB
>>>>> Unevictable: 0 kB
>>>>> Mlocked: 0 kB
>>>>> SwapTotal: 6104680 kB
>>>>> SwapFree: 6104680 kB
>>>>> Dirty: 0 kB
>>>>> Writeback: 0 kB
>>>>> AnonPages: 14936 kB
>>>>> Mapped: 7736 kB
>>>>> Shmem: 4624 kB
>>>>> Slab: 39184 kB
>>>>> SReclaimable: 10088 kB
>>>>> SUnreclaim: 29096 kB
>>>>> KernelStack: 7088 kB
>>>>> PageTables: 1160 kB
>>>>> Quicklists: 17664 kB
>>>>> NFS_Unstable: 0 kB
>>>>> Bounce: 0 kB
>>>>> WritebackTmp: 0 kB
>>>>> CommitLimit: 22651424 kB
>>>>> Committed_AS: 519368 kB
>>>>> VmallocTotal: 1069547520 kB
>>>>> VmallocUsed: 11064 kB
>>>>> VmallocChunk: 1069529616 kB
>>>>> >>>>> 2011/6/1 Steven Dake <[email protected]>: >>>>>> On 06/01/2011 07:42 AM, william felipe_welter wrote: >>>>>>> Steven, >>>>>>> >>>>>>> cat /proc/meminfo >>>>>>> ...
>>>>>>> HugePages_Total: 0
>>>>>>> HugePages_Free: 0
>>>>>>> HugePages_Rsvd: 0
>>>>>>> HugePages_Surp: 0
>>>>>>> Hugepagesize: 4096 kB
>>>>>>> ... >>>>>> >>>>>> It definitely requires a kernel compile and setting the config option to >>>>>> off. I don't know the Debian way of doing this. >>>>>> >>>>>> The only reason you may need this option is if you have very large >>>>>> memory sizes, such as 48GB or more. >>>>>> >>>>>> Regards >>>>>> -steve >>>>>> >>>>>>> It's 4MB. >>>>>>> >>>>>>> How can I disable hugetlb? (By passing CONFIG_HUGETLBFS=n to the >>>>>>> kernel at boot?) 
>>>>>>> >>>>>>> 2011/6/1 Steven Dake <[email protected] <mailto:[email protected]>> >>>>>>> >>>>>>> On 06/01/2011 01:05 AM, Steven Dake wrote: >>>>>>> > On 05/31/2011 09:44 PM, Angus Salkeld wrote: >>>>>>> >> On Tue, May 31, 2011 at 11:52:48PM -0300, william felipe_welter >>>>>>> wrote: >>>>>>> >>> Angus, >>>>>>> >>> >>>>>>> >>> I wrote a small test program (based on the code in coroipcc.c), and I am >>>>>>> >>> now sure there are problems with the mmap system call on sparc. >>>>>>> >>> >>>>>>> >>> Source code of my test program:
>>>>>>> >>>
>>>>>>> >>> #include <stdlib.h>
>>>>>>> >>> #include <sys/mman.h>
>>>>>>> >>> #include <stdio.h>
>>>>>>> >>>
>>>>>>> >>> #define PATH_MAX 36
>>>>>>> >>>
>>>>>>> >>> int main()
>>>>>>> >>> {
>>>>>>> >>>     int32_t fd;
>>>>>>> >>>     void *addr_orig;
>>>>>>> >>>     void *addr;
>>>>>>> >>>     char path[PATH_MAX];
>>>>>>> >>>     const char *file = "teste123XXXXXX";
>>>>>>> >>>     size_t bytes=10024;
>>>>>>> >>>
>>>>>>> >>>     snprintf (path, PATH_MAX, "/dev/shm/%s", file);
>>>>>>> >>>     printf("PATH=%s\n",path);
>>>>>>> >>>
>>>>>>> >>>     fd = mkstemp (path);
>>>>>>> >>>     printf("fd=%d \n",fd);
>>>>>>> >>>
>>>>>>> >>>     addr_orig = mmap (NULL, bytes, PROT_NONE,
>>>>>>> >>>                       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>>>>>>> >>>
>>>>>>> >>>     addr = mmap (addr_orig, bytes, PROT_READ | PROT_WRITE,
>>>>>>> >>>                  MAP_FIXED | MAP_SHARED, fd, 0);
>>>>>>> >>>
>>>>>>> >>>     printf("ADDR_ORIG:%p ADDR:%p\n",addr_orig,addr);
>>>>>>> >>>
>>>>>>> >>>     if (addr != addr_orig) {
>>>>>>> >>>         printf("Erro");
>>>>>>> >>>     }
>>>>>>> >>> }
>>>>>>> >>>
>>>>>>> >>> Results on x86:
>>>>>>> >>> PATH=/dev/shm/teste123XXXXXX
>>>>>>> >>> fd=3
>>>>>>> >>> ADDR_ORIG:0x7f867d8e6000 ADDR:0x7f867d8e6000
>>>>>>> >>>
>>>>>>> >>> Results on sparc:
>>>>>>> >>> PATH=/dev/shm/teste123XXXXXX
>>>>>>> >>> fd=3
>>>>>>> >>> ADDR_ORIG:0xf7f72000 ADDR:0xffffffff
>>>>>>> >>
>>>>>>> >> Note: 0xffffffff == MAP_FAILED
>>>>>>> >>
>>>>>>> >> (from man mmap)
>>>>>>> >> RETURN VALUE
>>>>>>> >> On success, mmap() returns a pointer to the mapped area. On >>>>>>> >> error, the value MAP_FAILED (that is, (void *) -1) is returned, >>>>>>> >> and errno is set appropriately. >>>>>>> >> >>>>>>> >>> >>>>>>> >>> But I am wondering: is it really necessary to call mmap twice? What is the >>>>>>> >>> reason for calling mmap two times, the second time using the address >>>>>>> >>> returned by the first? >>>>>>> >>> >>>>>>> >> Well, there are 3 calls to mmap():
>>>>>>> >> 1) one to allocate 2 * what you need (in pages)
>>>>>>> >> 2) one that maps the first half of the memory to a real file
>>>>>>> >> 3) one that maps the second half of the memory to the same file
>>>>>>> >>
>>>>>>> >> The point is that when you write to an address past the end of the >>>>>>> >> first half of the memory, it is taken care of by the third mmap, which >>>>>>> >> maps the address back to the top of the file for you. This means you >>>>>>> >> don't have to worry about ring-buffer wrapping, which can be a headache. >>>>>>> >> >>>>>>> >> -Angus >>>>>>> >> >>>>>>> > >>>>>>> > Interesting - this mmap operation doesn't work on sparc linux. >>>>>>> > >>>>>>> > Not sure how I can help here - the next step would be a follow-up >>>>>>> > with the sparc linux mailing list. I'll do that and cc you on the message >>>>>>> - see >>>>>>> > if we get any response. 
>>>>>>> > >>>>>>> > http://vger.kernel.org/vger-lists.html >>>>>>> > >>>>>>> >>> >>>>>>> >>> 2011/5/31 Angus Salkeld <[email protected] >>>>>>> <mailto:[email protected]>> >>>>>>> >>> >>>>>>> >>>> On Tue, May 31, 2011 at 06:25:56PM -0300, william felipe_welter >>>>>>> wrote: >>>>>>> >>>>> Thanks Steven, >>>>>>> >>>>> >>>>>>> >>>>> Now I am trying to run on the MCP: >>>>>>> >>>>> - uninstall pacemaker 1.0 >>>>>>> >>>>> - compile and install 1.1 >>>>>>> >>>>> >>>>>>> >>>>> But now I have problems initializing pacemakerd: Could not >>>>>>> >>>>> initialize Cluster Configuration Database API instance error 2 >>>>>>> >>>>> Debugging with gdb, I see that the error is in the confdb; more >>>>>>> >>>>> specifically, the errors start in coroipcc.c at line:
>>>>>>> >>>>>
>>>>>>> >>>>> 448         if (addr != addr_orig) {
>>>>>>> >>>>> 449                 goto error_close_unlink;   <- enter here
>>>>>>> >>>>> 450         }
>>>>>>> >>>>>
>>>>>>> >>>>> Any idea what could cause this? >>>>>>> >>>>> >>>>>>> >>>> >>>>>>> >>>> I tried porting a ringbuffer (www.libqb.org <http://www.libqb.org>) to sparc and had the same >>>>>>> >>>> failure. >>>>>>> >>>> There are 3 mmap() calls, and on sparc the third one keeps >>>>>>> failing. >>>>>>> >>>> >>>>>>> >>>> This is a common way of creating a ring buffer, see: >>>>>>> >>>> >>>>>>> http://en.wikipedia.org/wiki/Circular_buffer#Exemplary_POSIX_Implementation >>>>>>> >>>> >>>>>>> >>>> I couldn't get it working in the short time I tried. It's >>>>>>> probably >>>>>>> >>>> worth looking at the clib implementation to see why it's >>>>>>> failing >>>>>>> >>>> (I didn't get to that). >>>>>>> >>>> >>>>>>> >>>> -Angus >>>>>>> >>>> >>>>>>> Note, we believe we have sorted this out. Your kernel has hugetlb >>>>>>> enabled, >>>>>>> probably with 4MB pages. This requires corosync to allocate 4MB >>>>>>> pages. >>>>>>> >>>>>>> Can you verify your hugetlb settings? 
>>>>>>> >>>>>>> If you can turn this option off, you should have at least a working >>>>>>> corosync. >>>>>>> >>>>>>> Regards >>>>>>> -steve >>>>>>> >>>> >>>>>>> >>>> _______________________________________________ >>>>>>> >>>> Pacemaker mailing list: [email protected] >>>>>>> <mailto:[email protected]> >>>>>>> >>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>>>>> >>>> >>>>>>> >>>> Project Home: http://www.clusterlabs.org >>>>>>> >>>> Getting started: >>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>>>>> >>>> Bugs: >>>>>>> >>>> >>>>>>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker >>>>>>> >>>> >>>>>>> >>> >>>>>>> >>> -- >>>>>>> >>> William Felipe Welter >>>>>>> >>> ------------------------------ >>>>>>> >>> Consultor em Tecnologias Livres >>>>>>> >>> [email protected] >>>>>>> <mailto:[email protected]> >>>>>>> >>> www.4linux.com.br <http://www.4linux.com.br> >>>>>>> >> >>>>>>> >>> _______________________________________________ >>>>>>> >>> Openais mailing list >>>>>>> >>> [email protected] >>>>>>> <mailto:[email protected]> >>>>>>> >>> https://lists.linux-foundation.org/mailman/listinfo/openais 
_______________________________________________ Openais mailing list [email protected] https://lists.linux-foundation.org/mailman/listinfo/openais
