Here is /etc/default/elasticsearch

# Run Elasticsearch as this user ID and group ID
#ES_USER=elasticsearch
#ES_GROUP=elasticsearch
# Heap Size (defaults to 256m min, 1g max)
ES_HEAP_SIZE=512m

# Heap new generation
#ES_HEAP_NEWSIZE=

# max direct memory
#ES_DIRECT_SIZE=

# Maximum number of open files, defaults to 65535.
MAX_OPEN_FILES=65535

# Maximum locked memory size. Set to "unlimited" if you use the
# bootstrap.mlockall option in elasticsearch.yml. You must also set
# ES_HEAP_SIZE.
MAX_LOCKED_MEMORY=unlimited

# Maximum number of VMA (Virtual Memory Areas) a process can own
#MAX_MAP_COUNT=262144

# Elasticsearch log directory
#LOG_DIR=/var/log/elasticsearch

# Elasticsearch data directory
#DATA_DIR=/var/lib/elasticsearch

# Elasticsearch work directory
#WORK_DIR=/tmp/elasticsearch

# Elasticsearch configuration directory
#CONF_DIR=/etc/elasticsearch

# Elasticsearch configuration file (elasticsearch.yml)
#CONF_FILE=/etc/elasticsearch/elasticsearch.yml

# Additional Java OPTS
#ES_JAVA_OPTS=

# Configure restart on package upgrade (true, every other setting will lead to not restarting)
#RESTART_ON_UPGRADE=true

I also see the same setting in /etc/init.d/elasticsearch. Do you know which file takes priority? And what would a good size be?

On Tuesday, September 9, 2014 11:32:19 AM UTC-4, vineeth mohan wrote:
>
> Hello Joshua,
>
> I am not sure which variable you are referring to in the memory settings
> in the config file; please paste the comment and config.
> I usually change the config from the init.d script.
>
> The best approach would be to bulk index, say, 10,000 feeds in sync mode, wait
> until everything is indexed, and then proceed to the next batch.
> I am not sure about the Java API, but a long time ago I used to curl this
> stats API and see how many requests were rejected.
>
> Thanks
> Vineeth
>
> On Tue, Sep 9, 2014 at 8:58 PM, Joshua P <[email protected]> wrote:
>
>> You also said you wouldn't recommend indexing that much information at
>> once. How would you suggest breaking it up and what status should I look
>> for before doing another batch?
>> I have to come up with some process that is
>> repeatable and mostly automated.
>>
>> On Tuesday, September 9, 2014 11:12:59 AM UTC-4, Joshua P wrote:
>>>
>>> Thanks for the reply, Vineeth!
>>>
>>> What's a practical heap size? I've seen some people saying they set it
>>> to 30gb but this confuses me because in the /etc/default/elasticsearch
>>> file, the comment suggests the max is only 1gb?
>>>
>>> I'll look into the threadpool issue. Is there a Java API for monitoring
>>> Cluster Node health? Can you point me at an example or give me a link to
>>> that?
>>>
>>> Thanks!
>>>
>>> On Tuesday, September 9, 2014 10:52:35 AM UTC-4, vineeth mohan wrote:
>>>>
>>>> Hello Joshua,
>>>>
>>>> I have a feeling this has something to do with the threadpool.
>>>> There is a limit on the number of feeds that can be queued for indexing.
>>>>
>>>> Try increasing the size of the threadpool queue for index and bulk to a
>>>> large number.
>>>> Also, through the cluster node API on threadpool, you can see if any
>>>> request has failed.
>>>> Monitor this API for any failed requests due to large volume.
>>>>
>>>> Threadpool - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html
>>>> Threadpool stats - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html
>>>>
>>>> Having said that, I won't recommend bulk indexing that much information
>>>> at a time, and 512 MB is not going to help much.
>>>>
>>>> Thanks
>>>> Vineeth
>>>>
>>>> On Tue, Sep 9, 2014 at 7:48 PM, Joshua P <[email protected]> wrote:
>>>>
>>>>> Hi there!
>>>>>
>>>>> I'm trying to do a one-time index of about 800,000 records into an
>>>>> instance of elasticsearch. But I'm having a bit of trouble. It
>>>>> continually fails around 200,000 records. Looking at it in the
>>>>> Elasticsearch Head plugin, my index goes offline and becomes
>>>>> unrecoverable.
>>>>>
>>>>> For now, I have it running on a VM on my personal machine.
>>>>>
>>>>> VM Config:
>>>>> Ubuntu Server 14.04 64-Bit
>>>>> 8 GB RAM
>>>>> 2 Processors
>>>>> 32 GB SSD
>>>>>
>>>>> Java
>>>>> java version "1.7.0_65"
>>>>> OpenJDK Runtime Environment (IcedTea 2.5.1) (7u65-2.5.1-4ubuntu1~0.14.04.2)
>>>>> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
>>>>>
>>>>> Elasticsearch is using mostly the defaults. This is the output of:
>>>>> curl http://localhost:9200/_nodes/process?pretty
>>>>> {
>>>>>   "cluster_name" : "property_transaction_data",
>>>>>   "nodes" : {
>>>>>     "KlFkO_qgSOKmV_jjj5xeVw" : {
>>>>>       "name" : "Marvin Flumm",
>>>>>       "transport_address" : "inet[/192.168.133.131:9300]",
>>>>>       "host" : "ubuntu-es",
>>>>>       "ip" : "127.0.1.1",
>>>>>       "version" : "1.3.2",
>>>>>       "build" : "dee175d",
>>>>>       "http_address" : "inet[/192.168.133.131:9200]",
>>>>>       "process" : {
>>>>>         "refresh_interval_in_millis" : 1000,
>>>>>         "id" : 1092,
>>>>>         "max_file_descriptors" : 65535,
>>>>>         "mlockall" : true
>>>>>       }
>>>>>     }
>>>>>   }
>>>>> }
>>>>>
>>>>> I adjusted ES_HEAP_SIZE to 512mb.
>>>>>
>>>>> I'm using the following code to pull data from SQL Server and index
>>>>> it.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/elasticsearch/f94f96d4-8c3f-462f-bdcf-df717cbc6269%40googlegroups.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
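
For reference, the batching approach suggested in this thread (send about 10,000 records per bulk request, wait for the response, check the thread pool stats for rejections, then move on) might look roughly like the Python sketch below. It is an illustration only and not the code from the original post, which used the Java API. The index name "transactions", the type name "record", the "id" field, and all function names are hypothetical. The base URL is taken from the curl command quoted above, and the sketch assumes the _bulk and _nodes/stats/thread_pool REST endpoints behave as in the 1.x series.

```python
import json
import urllib.request
from itertools import islice


def batches(records, size=10000):
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_body(chunk, index, doc_type):
    """Build the newline-delimited body for the _bulk endpoint.

    Assumes every record is a dict with an "id" key (hypothetical schema).
    """
    lines = []
    for rec in chunk:
        action = {"index": {"_index": index, "_type": doc_type, "_id": rec["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(rec))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


def rejected_counts(node_stats):
    """Sum the 'rejected' counter per thread pool across all nodes."""
    totals = {}
    for node in node_stats.get("nodes", {}).values():
        for pool, stats in node.get("thread_pool", {}).items():
            totals[pool] = totals.get(pool, 0) + stats.get("rejected", 0)
    return totals


def http_json(url, body=None):
    """GET when body is None, otherwise POST the body; return parsed JSON."""
    data = body.encode("utf-8") if body is not None else None
    with urllib.request.urlopen(urllib.request.Request(url, data=data)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def index_all(records, base_url="http://localhost:9200",
              index="transactions", doc_type="record"):
    """Index records batch by batch, stopping on errors or new rejections."""
    stats_url = base_url + "/_nodes/stats/thread_pool"
    # The rejected counter is cumulative, so compare against a baseline.
    baseline = rejected_counts(http_json(stats_url)).get("bulk", 0)
    for chunk in batches(records):
        # The bulk call returns only after the batch has been processed.
        result = http_json(base_url + "/_bulk", bulk_body(chunk, index, doc_type))
        rejected = rejected_counts(http_json(stats_url)).get("bulk", 0)
        if result.get("errors") or rejected > baseline:
            raise RuntimeError("bulk batch failed or was rejected; stopping")
```

Calling index_all(rows) with an iterable of dicts would send the batches one at a time. Stopping on the first failed batch keeps the run repeatable, since the batch that failed can be retried after adjusting the heap or queue size.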
