Hello Joshua,

I have a feeling this has something to do with the thread pool.
There is a limit on the number of requests that can be queued for indexing.

Try increasing the queue size of the index and bulk thread pools to a
larger number.
You can also use the cluster nodes stats API to check the thread pools and
see whether any requests have been rejected.
Monitor this API for rejections caused by the large volume.

Threadpool -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html
Threadpool stats -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html
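For example (the values below are only placeholders to illustrate the
settings, not recommendations), you could raise the queue sizes in
elasticsearch.yml and restart the node:

    # elasticsearch.yml - larger queues for the index and bulk thread pools
    threadpool.index.queue_size: 500
    threadpool.bulk.queue_size: 500

Then watch the rejected counters while your load runs:

    curl 'http://localhost:9200/_nodes/stats/thread_pool?pretty'

If the "rejected" count for the bulk or index pool keeps climbing, requests
are arriving faster than the node can queue them.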

Having said that, I wouldn't recommend bulk indexing that much data in one
go, and a 512 MB heap is not going to help much.
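As a rough sketch (the batch size and heap value here are assumptions,
adjust them for your data): send the documents in smaller bulk requests, a
few thousand documents or a few MB each, and give the JVM more of the VM's
8 GB, e.g. in /etc/default/elasticsearch if you installed the DEB package:

    # roughly half the VM's RAM for the heap, leaving the rest to the OS page cache
    ES_HEAP_SIZE=4g

Restart the service after changing it.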

Thanks
          Vineeth

On Tue, Sep 9, 2014 at 7:48 PM, Joshua P <[email protected]> wrote:

> Hi there!
>
> I'm trying to do a one-time index of about 800,000 records into an
> instance of elasticsearch. But I'm having a bit of trouble. It continually
> fails around 200,000 records. Looking at it in the Elasticsearch Head
> plugin, my index goes offline and becomes unrecoverable.
>
> For now, I have it running on a VM on my personal machine.
>
> VM Config:
> Ubuntu Server 14.04 64-Bit
> 8 GB RAM
> 2 Processors
> 32 GB SSD
>
> Java
> java version "1.7.0_65"
> OpenJDK Runtime Environment (IcedTea 2.5.1) (7u65-2.5.1-4ubuntu1~0.14.04.2)
> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
>
> Elasticsearch is using mostly the defaults. This is the output of:
> curl http://localhost:9200/_nodes/process?pretty
> {
>   "cluster_name" : "property_transaction_data",
>   "nodes" : {
>     "KlFkO_qgSOKmV_jjj5xeVw" : {
>       "name" : "Marvin Flumm",
>       "transport_address" : "inet[/192.168.133.131:9300]",
>       "host" : "ubuntu-es",
>       "ip" : "127.0.1.1",
>       "version" : "1.3.2",
>       "build" : "dee175d",
>       "http_address" : "inet[/192.168.133.131:9200]",
>       "process" : {
>         "refresh_interval_in_millis" : 1000,
>         "id" : 1092,
>         "max_file_descriptors" : 65535,
>         "mlockall" : true
>       }
>     }
>   }
> }
>
> I adjusted ES_HEAP_SIZE to 512mb.
>
> I'm using the following code to pull data from SQL Server and index it.
>

