You should switch to bulk indexing instead of indexing individual documents. Also, consider switching off the refresh interval (set it to -1) for the duration of your bulk indexing.
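To make the batching idea concrete, here is a minimal sketch in plain Java. It only shows the buffering logic and does not touch the Elasticsearch client itself. The `Batcher` class, its `batchSize` parameter, and the `sink` callback are hypothetical names for illustration, not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper: buffers items and hands them to a sink in
 * fixed-size batches, so the caller makes one round trip per batch
 * instead of one per document.
 */
final class Batcher<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    Batcher(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Adds one item and flushes automatically once the buffer is full. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered; call once more after the last item. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Hand over a copy so the sink can keep the list after we clear ours.
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

In your case the sink would build a single bulk request with `client.prepareBulk()`, add one update per product, run it with `execute().actionGet()`, and check `hasFailures()` on the response. That assumes your client version accepts update requests inside a bulk request. A batch size of around 1000 is only a starting point to tune. If you disable refresh by setting `index.refresh_interval` to `-1` before the run, remember to restore the previous value afterwards.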
Cheers,
Ivan

On Mon, Aug 4, 2014 at 3:08 AM, Dennis de Boer wrote:
>
> Sure, along with some additional info.
>
> - I use the Java API within a Grails application
> - I use Elasticsearch version 0.90.5
>
> ### code to create a TransportClient
>
> The code is executed on server B, pointing to the ES instance on server A.
> I tried to increase the timeouts to check if this would help anything,
> which isn't the case.
>
> Settings settings = ImmutableSettings.settingsBuilder()
>     .put("cluster.name", "prodcluster")
>     .put("discovery.zen.ping.multicast.enabled", false)
>     .put("discovery.zen.ping.unicast.hosts", ["10.0.184.47"])
>     .put("client.transport.ping_timeout","25s")
>     .put("client.transport.nodes_sampler_interval","25s")
>     .build();
>
> Client transportClient = new TransportClient(settings)
>     .addTransportAddress(new InetSocketTransportAddress("10.0.184.47", 9300));
>
> ### then a lot of feed parsing magic happens, e.g. checking which products
> are new/updated and converting them into a structure which I can use for my
> application
>
> *Every* record is then inserted using this piece of code
>
> UpdateResponse response = client.prepareUpdate(indexName, documentTypeName, product.internal_id)
>     .setDoc(product)
>     .setUpsert(newProduct)
>     .setId(product.internal_id)
>     .execute()
>     .actionGet();
>
> This update function is called more than 100,000 times, once for every
> product in all the product feeds.
> Sometimes after ~2000 or ~3000 records I receive a NoNodeAvailableException.
>
> On Monday, August 4, 2014 at 10:31:02 UTC+2, Dennis de Boer wrote:
>
>> Hi all,
>>
>> Hope you can give me some pointers on this topic. I'm trying to figure
>> out what is going wrong in my setup/config but I cannot figure it out.
>>
>> I have two servers. Server A hosts a public website with the
>> elasticsearch index.
>> Server B retrieves XML product feeds, parses the feeds, and
>> adds/deletes/updates these products into the ES index on server A using a
>> TransportClient.
>>
>> The problem:
>> - this process takes place at 3:00 am (supposedly a quiet time)
>> - 9 out of 10 days I'll get a NoNodeAvailableException at a random point
>> during the indexing of the records.
>> - When I run the process during daytime (e.g. 10:00 am), everything works
>> fine.
>>
>> My guess:
>> In the access logs I see that a lot of bots are crawling my site around
>> 3:00 am. Since the exception occurs randomly and only when the site is
>> busy, it has to be a threading/connection problem.
>>
>> My question:
>> How can I debug this problem to figure out if it is a
>> threading/jvm/memory/connection problem? I want to see some actual proof
>> instead of guessing around.
>> Any debug settings or plugins I can try to monitor the ES nodes?
>>
>> Any tips or pointers are most appreciated.
>> Dennis
