Hi,

I have a strange problem with ES. It runs on a cluster with 64 cores, so 
I don't think hardware capacity is the issue.

I want to index a lot of documents with elasticsearch-hadoop.
After some initial problems I now have everything in place and it seems to 
work fine.

So I wrote a simple Pig script which loads all the files (~500) and stores 
them into an ES index.
However, after ~22h the job failed because of connection problems between 
the nodes.
During that time there wasn't any heavy usage of network bandwidth or 
other resources.

After that I tried running the Pig script on one document at a time, so 
that I know what has been indexed and what is missing.
About 3 documents were indexed fine this way, then the jobs started to fail 
again with network problems, although there wasn't any significant load.

I observed that even after the indexing jobs had stopped, the index was 
still busy. The number of documents kept growing for quite some time, and 
the number of translog operations went up and down, staying mostly at 
about half a million.
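
For reference, this is roughly how those two numbers can be watched. It is 
a minimal Python sketch, assuming the index stats endpoint 
(GET /<index>/_stats) reports docs.count and translog.operations under 
_all.primaries. The URL and the index name "myindex" are placeholders, not 
my real setup.

```python
import json
import time
import urllib.request


def extract_counts(stats):
    """Return (document count, translog operations) from an index stats response.

    Assumes the layout {"_all": {"primaries": {"docs": ..., "translog": ...}}}.
    Missing sections are reported as 0.
    """
    primaries = stats["_all"]["primaries"]
    docs = primaries.get("docs", {}).get("count", 0)
    ops = primaries.get("translog", {}).get("operations", 0)
    return docs, ops


def fetch_stats(base_url, index):
    """Fetch the stats of one index over HTTP (placeholder URL and index name)."""
    with urllib.request.urlopen(f"{base_url}/{index}/_stats") as resp:
        return json.load(resp)


def watch(base_url, index, interval_s=10, rounds=6):
    """Print both counters a few times to see whether they are still moving."""
    for _ in range(rounds):
        docs, ops = extract_counts(fetch_stats(base_url, index))
        print(f"docs={docs} translog_ops={ops}")
        time.sleep(interval_s)
```

Calling watch("http://localhost:9200", "myindex") after the job has stopped 
would show whether the document count is still growing.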

To me this looks as if the index needs more time to index the documents 
than the Pig script needs to write them, and after a while a buffer 
somewhere gets too full.

Is this possible? I would expect that in this case elasticsearch-hadoop 
gets throttled.
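
To make clear what I mean by throttling, here is a toy Python sketch. It is 
only a model of my guess, not how elasticsearch-hadoop or ES work 
internally: a fast writer feeds a slower indexer through a bounded buffer, 
and the writer blocks whenever the buffer is full. All names and numbers 
are made up for illustration.

```python
import queue
import threading
import time


def run(num_docs, buffer_size, index_delay_s):
    """Toy model: a writer feeding a slower indexer through a bounded buffer.

    Returns (number of documents indexed, largest buffer fill level seen).
    """
    buf = queue.Queue(maxsize=buffer_size)
    indexed = []
    max_seen = 0

    def indexer():
        while True:
            doc = buf.get()
            if doc is None:  # sentinel: the writer is done
                return
            time.sleep(index_delay_s)  # indexing is slower than writing
            indexed.append(doc)

    worker = threading.Thread(target=indexer)
    worker.start()
    for doc in range(num_docs):
        buf.put(doc)  # blocks while the buffer is full, which throttles the writer
        max_seen = max(max_seen, buf.qsize())
    buf.put(None)
    worker.join()
    return len(indexed), max_seen
```

With a bounded buffer the writer is slowed down to the indexer's pace and 
nothing is lost. What I seem to observe looks more like an unbounded buffer 
that eventually overflows.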

The only documentation about the translog that I found is here: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-translog.html
which I find rather thin. I still don't know what the number of translog 
operations implies.

The linked page says I could increase the numbers when doing bulk 
indexing, but I don't understand how this would help.
Also, what's TPS?
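
As far as I understand it, "increasing the numbers" would mean a settings 
update along these lines. This is only a Python sketch of my reading: the 
setting name index.translog.flush_threshold_ops is taken from the linked 
page and may differ between ES versions, and the URL, the index name and 
the value 50000 are placeholders.

```python
import json
import urllib.request


def translog_settings_request(base_url, index, flush_threshold_ops):
    """Build (but do not send) a settings update request for one index.

    Assumption: the setting is index.translog.flush_threshold_ops, as I read
    it on the linked page. Check the name against your ES version.
    """
    body = {"index": {"translog": {"flush_threshold_ops": flush_threshold_ops}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder host, index name and value, for illustration only.
req = translog_settings_request("http://localhost:9200", "myindex", 50000)
```

Sending it would be urllib.request.urlopen(req). I have not done that yet, 
because I don't see why a higher threshold should help here.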

Best regards
Julius
