What about your system logs? Things under /var/log like system/dmesg/daemon etc.
Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: [email protected] web: www.campaignmonitor.com On 7 February 2014 23:47, KingBob <[email protected]> wrote: > Hello, > > I hope someone can assist with this -- whenever I start Elasticsearch with > the bootstrap.mlockall option enabled my entire server stops responding > after a minute or two. > > There is nothing obvious in the log file, even with DEBUG enabled. I > can't find any errors in my system logs either, so I'm at a bit of a loss > as to what the issue is. > > When I disable the bootstrap.mlockall option ES runs fine. > > JRE -- jre1.7.0_51 > ES -- elasticsearch-0.90.11 > OS -- SLES11SP3 (VMware) > RAM -- 8GB > PAGING -- 5GB > ES_HEAP_SIZE -- 4GB (50% actual) > > I'm building the first of a 3-node cluster. > I intended to get everything working here, then clone the VM to create the > remaining 2 nodes. > > The "elastic" user has the following ulimits: > > core file size (blocks, -c) unlimited > data seg size (kbytes, -d) unlimited > scheduling priority (-e) 0 > file size (blocks, -f) unlimited > pending signals (-i) 63931 > max locked memory (kbytes, -l) unlimited > max memory size (kbytes, -m) 6965324 > open files (-n) 65535 > pipe size (512 bytes, -p) 8 > POSIX message queues (bytes, -q) 819200 > real-time priority (-r) 0 > stack size (kbytes, -s) unlimited > cpu time (seconds, -t) unlimited > max user processes (-u) 63931 > virtual memory (kbytes, -v) 6960400 > file locks (-x) unlimited > > Here is a DEBUG log from startup to hang (at which point I have to > hard-reset the server): > > [2014-02-07 00:49:44,463][INFO ][node ] [log-els01] > version[0.90.11], pid[3894], build[11da1ba/2014-02-03T15:27:39Z] > [2014-02-07 00:49:44,463][INFO ][node ] [log-els01] > initializing ... > [2014-02-07 00:49:44,464][DEBUG][node ] [log-els01] > using home [/opt/elasticsearch], config [/opt/elasticsearch/config], data > [[/var/data/elasticsearch]], logs [/opt/elasticsearch/logs], work > [/opt/elasticsearch/work], plugins [/opt/elasticsearch/plugins] > [2014-02-07 00:49:44,472][INFO ][plugins ] [log-els01] > loaded [], sites [] > [2014-02-07 00:49:44,483][DEBUG][common.compress.lzf ] using > [UnsafeChunkDecoder] decoder > [2014-02-07 00:49:44,489][DEBUG][env ] [log-els01] > using node location [[/var/data/elasticsearch/logstash-cluster/nodes/0]], > local_node_id [0] > [2014-02-07 00:49:45,587][DEBUG][threadpool ] [log-els01] > creating thread_pool [generic], type [cached], keep_alive [30s] > [2014-02-07 00:49:45,595][DEBUG][threadpool ] [log-els01] > creating thread_pool [index], type [fixed], size [1], queue_size [200] > [2014-02-07 00:49:45,597][DEBUG][threadpool ] [log-els01] > creating thread_pool [bulk], type [fixed], size [1], queue_size [50] > [2014-02-07 00:49:45,598][DEBUG][threadpool ] [log-els01] > creating thread_pool [get], type [fixed], size [1], queue_size [1k] > [2014-02-07 00:49:45,598][DEBUG][threadpool ] [log-els01] > creating thread_pool [search], type [fixed], size [3], queue_size [1k] > [2014-02-07 00:49:45,598][DEBUG][threadpool ] [log-els01] > creating thread_pool [suggest], type [fixed], size [1], queue_size [1k] > [2014-02-07 00:49:45,598][DEBUG][threadpool ] [log-els01] > creating thread_pool [percolate], type [fixed], size [1], queue_size [1k] > [2014-02-07 00:49:45,598][DEBUG][threadpool ] [log-els01] > creating thread_pool [management], type [scaling], min [1], size [5], > keep_alive [5m] > [2014-02-07 00:49:45,599][DEBUG][threadpool ] [log-els01] > creating thread_pool [flush], type [scaling], min [1], size [1], keep_alive > [5m] > [2014-02-07 00:49:45,599][DEBUG][threadpool ] [log-els01] > creating thread_pool [merge], type [scaling], min [1], size [1], keep_alive > [5m] > [2014-02-07 00:49:45,599][DEBUG][threadpool ] [log-els01] > creating thread_pool [refresh], type [scaling], min [1], size [1], > keep_alive [5m] > [2014-02-07 00:49:45,599][DEBUG][threadpool ] [log-els01] > creating thread_pool [warmer], type [scaling], min [1], size [1], > keep_alive [5m] > [2014-02-07 00:49:45,599][DEBUG][threadpool ] [log-els01] > creating thread_pool [snapshot], type [scaling], min [1], size [1], > keep_alive [5m] > [2014-02-07 00:49:45,600][DEBUG][threadpool ] [log-els01] > creating thread_pool [optimize], type [fixed], size [1], queue_size [null] > [2014-02-07 00:49:45,638][DEBUG][transport.netty ] [log-els01] > using worker_count[2], port[9300-9400], bind_host[null], > publish_host[null], compress[false], connect_timeout[30s], > connections_per_node[2/3/6/1/1], receive_predictor[512kb->512kb] > [2014-02-07 00:49:45,664][DEBUG][discovery.zen.ping.multicast] [log-els01] > using group [224.2.2.4], with port [54328], ttl [3], and address [null] > [2014-02-07 00:49:45,667][DEBUG][discovery.zen.ping.unicast] [log-els01] > using initial hosts [], with concurrent_connects [10] > [2014-02-07 00:49:45,668][DEBUG][discovery.zen ] [log-els01] > using ping.timeout [3s], master_election.filter_client [true], > master_election.filter_data [false] > [2014-02-07 00:49:45,669][DEBUG][discovery.zen.elect ] [log-els01] > using minimum_master_nodes [2] > [2014-02-07 00:49:45,669][DEBUG][discovery.zen.fd ] [log-els01] > [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3] > [2014-02-07 00:49:45,678][DEBUG][discovery.zen.fd ] [log-els01] > [node ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3] > [2014-02-07 00:49:45,704][DEBUG][monitor.jvm ] [log-els01] > enabled [true], last_gc_enabled [false], interval [1s], gc_threshold > [{old=GcThreshold{name='old', warnThreshold=10000, infoThreshold=5000, > debugThreshold=2000}, default=GcThreshold{name='default', > warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}, > young=GcThreshold{name='young', warnThreshold=1000, infoThreshold=700, > debugThreshold=400}}] > [2014-02-07 00:49:46,212][DEBUG][monitor.os ] [log-els01] > Using probe [org.elasticsearch.monitor.os.SigarOsProbe@7139edf6] with > refresh_interval [1s] > [2014-02-07 00:49:46,217][DEBUG][monitor.process ] [log-els01] > Using probe [org.elasticsearch.monitor.process.SigarProcessProbe@50a744be] > with refresh_interval [1s] > [2014-02-07 00:49:46,223][DEBUG][monitor.jvm ] [log-els01] > Using refresh_interval [1s] > [2014-02-07 00:49:46,228][DEBUG][monitor.network ] [log-els01] > Using probe [org.elasticsearch.monitor.network.SigarNetworkProbe@4ae0a5dc] > with refresh_interval [5s] > [2014-02-07 00:49:46,235][DEBUG][monitor.network ] [log-els01] > net_infofailed to get Network Interface Info [log-els01: log-els01: Name or > service not known] > [2014-02-07 00:49:46,238][DEBUG][monitor.fs ] [log-els01] > Using probe [org.elasticsearch.monitor.fs.SigarFsProbe@5aa22f2c] with > refresh_interval [1s] > [2014-02-07 00:49:46,570][DEBUG][indices.store ] [log-els01] > using indices.store.throttle.type [MERGE], with > index.store.throttle.max_bytes_per_sec [20mb] > [2014-02-07 00:49:46,578][DEBUG][cache.memory ] [log-els01] > using bytebuffer cache with small_buffer_size [1kb], large_buffer_size > [1mb], small_cache_size [10mb], large_cache_size [500mb], direct [true] > [2014-02-07 00:49:46,583][DEBUG][script ] [log-els01] > using script cache with max_size [500], expire [null] > [2014-02-07 00:49:46,588][DEBUG][cluster.routing.allocation.decider] > [log-els01] using node_concurrent_recoveries [2], > node_initial_primaries_recoveries [4] > [2014-02-07 00:49:46,588][DEBUG][cluster.routing.allocation.decider] > [log-els01] using [cluster.routing.allocation.allow_rebalance] with > [indices_all_active] > [2014-02-07 00:49:46,588][DEBUG][cluster.routing.allocation.decider] > [log-els01] using [cluster_concurrent_rebalance] with [2] > [2014-02-07 00:49:46,591][DEBUG][gateway.local ] [log-els01] > using initial_shards [quorum], list_timeout [30s] > [2014-02-07 00:49:46,735][DEBUG][indices.recovery ] [log-els01] > using max_bytes_per_sec[20mb], concurrent_streams [3], file_chunk_size > [512kb], translog_size [512kb], translog_ops [1000], and compress [true] > [2014-02-07 00:49:46,801][DEBUG][http.netty ] [log-els01] > using max_chunk_size[8kb], max_header_size[8kb], > max_initial_line_length[4kb], max_content_length[100mb], > receive_predictor[512kb->512kb] > [2014-02-07 00:49:46,805][DEBUG][indices.memory ] [log-els01] > using index_buffer_size [408.7mb], with min_shard_index_buffer_size [4mb], > max_shard_index_buffer_size [512mb], shard_inactive_time [30m] > [2014-02-07 00:49:46,806][DEBUG][indices.cache.filter ] [log-els01] > using [node] weighted filter cache with size [20%], actual_size [817.5mb], > expire [null], clean_interval [1m] > [2014-02-07 00:49:46,807][DEBUG][indices.fielddata.cache ] [log-els01] > using size [-1] [-1b], expire [null] > [2014-02-07 00:49:46,819][DEBUG][gateway.local.state.meta ] [log-els01] > using gateway.local.auto_import_dangled [YES], with > gateway.local.dangling_timeout [2h] > [2014-02-07 00:49:46,819][DEBUG][gateway.local.state.meta ] [log-els01] > took 0s to load state > [2014-02-07 00:49:46,819][DEBUG][gateway.local.state.shards] [log-els01] > took 0s to load started shards state > [2014-02-07 00:49:46,824][DEBUG][bulk.udp ] [log-els01] > using enabled [false], host [null], port [9700-9800], bulk_actions [1000], > bulk_size [5mb], flush_interval [5s], concurrent_requests [4] > [2014-02-07 00:49:46,826][DEBUG][cluster.routing.allocation.decider] > [log-els01] using node_concurrent_recoveries [2], > node_initial_primaries_recoveries [4] > [2014-02-07 00:49:46,826][DEBUG][cluster.routing.allocation.decider] > [log-els01] using [cluster.routing.allocation.allow_rebalance] with > [indices_all_active] > [2014-02-07 00:49:46,826][DEBUG][cluster.routing.allocation.decider] > [log-els01] using [cluster_concurrent_rebalance] with [2] > [2014-02-07 00:49:46,827][DEBUG][cluster.routing.allocation.decider] > [log-els01] using node_concurrent_recoveries [2], > node_initial_primaries_recoveries [4] > [2014-02-07 00:49:46,827][DEBUG][cluster.routing.allocation.decider] > [log-els01] using [cluster.routing.allocation.allow_rebalance] with > [indices_all_active] > [2014-02-07 00:49:46,827][DEBUG][cluster.routing.allocation.decider] > [log-els01] using [cluster_concurrent_rebalance] with [2] > [2014-02-07 00:49:46,836][INFO ][node ] [log-els01] > initialized > [2014-02-07 00:49:46,836][INFO ][node ] [log-els01] > starting ... > [2014-02-07 00:49:46,850][DEBUG][netty.channel.socket.nio.SelectorUtil] > Using select timeout of 500 > [2014-02-07 00:49:46,850][DEBUG][netty.channel.socket.nio.SelectorUtil] > Epoll-bug workaround enabled = false > [2014-02-07 00:49:46,890][DEBUG][transport.netty ] [log-els01] > Bound to address [/0:0:0:0:0:0:0:0:9300] > [2014-02-07 00:49:46,893][INFO ][transport ] [log-els01] > bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/ > 172.22.123.224:9300]} > [2014-02-07 00:49:49,912][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:49:52,913][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:49:55,914][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:49:58,915][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:01,916][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:04,918][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:07,919][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:10,920][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:13,921][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:16,900][WARN ][discovery ] [log-els01] > waited for 30s and no initial state was set by the discovery > [2014-02-07 00:50:16,900][INFO ][discovery ] [log-els01] > logstash-cluster/ZFdVtQwxQLuPCArpBV6bew > [2014-02-07 00:50:16,900][DEBUG][gateway ] [log-els01] > can't wait on start for (possibly) reading state from gateway, will do it > asynchronously > [2014-02-07 00:50:16,907][INFO ][http ] [log-els01] > bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/ > 172.22.123.224:9200]} > [2014-02-07 00:50:16,908][INFO ][node ] [log-els01] > started > [2014-02-07 00:50:16,922][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:19,929][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:22,931][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:25,962][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:28,963][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:31,965][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:35,003][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:38,025][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:41,048][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:44,051][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > [2014-02-07 00:50:47,053][DEBUG][discovery.zen ] [log-els01] > filtered ping responses: (filter_client[true], filter_data[false]) {none} > > Am I missing something obvious? Can I do anything to get more debug > information? > > Thanks. > > -- > You received this message because you are subscribed to the Google Groups > "elasticsearch" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To view this discussion on the web visit > https://groups.google.com/d/msgid/elasticsearch/68ed18b2-805f-43ce-bb0b-c4259c7004ef%40googlegroups.com > . > For more options, visit https://groups.google.com/groups/opt_out. > -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAEM624ZJbf5oVTVTcUk5gDA3WsyjPLcw0UEsMo_jZB%3D83shsPw%40mail.gmail.com. For more options, visit https://groups.google.com/groups/opt_out.
