Hey, can you create a github issue for this (and add as much info as possible)? And maybe try with elasticsearch 1.1 as well? Are your changes involving deletes as well?
Thanks!

--Alex

On Thu, Apr 10, 2014 at 5:01 PM, Morus Walter <[email protected]> wrote:

> Hi,
>
> I'm currently trying to build an elasticsearch index
> and am experiencing some trouble.
>
> Indexing is based on database data and basically has three
> steps:
> a) index all database content
> b) index the incremental changes that happened during step a
>    (until almost all are done)
> c) permanently index incremental changes
>
> b and c are different in that b is part of the generation of a new
> index, while c is permanent index maintenance.
>
> For two of the three indices this works fine.
> For the third, which is the biggest and most complicated
> (it uses child documents), step a works fine, but after a handful
> of updates in step b elasticsearch crashes the index and it
> becomes unusable. The same happens if I run step c, leaving out b.
>
> My indexer then dies from an HTTP timeout.
>
> My first thought was that there might be issues in the incremental
> indexers (b and/or c).
> However, if I run the same indexer against a small partial version
> of that index, everything works fine.
>
> The size of the full index is ~12 million documents and 7.2 GB.
> I also tried a smaller index with just 3 million documents and
> 1.4 GB in size; no luck.
>
> There are some indexing operations where I - perhaps naively - assumed
> elasticsearch would take care of the difficulties:
> * it is possible that documents are deleted that do not exist;
>   this seems to work fine, I get a 'not found' in these cases
> * it is possible that child documents are indexed where the parent
>   does not exist;
>   I do not see errors in that situation. I did not check whether the
>   documents are created; I would be fine with either rejecting them
>   or adding them. Index corruption is not so great, though.
> * it is possible that child documents are deleted where no document
>   with that parent id exists, whether or not there are child documents
>   with that parent
>
> I tried to minimize these cases, without any effect on the crashes.
> I cannot fully avoid them without searching first, which I have so far
> wanted to avoid.
> But the same conditions can occur in the case of incremental indexing
> on top of the small partial index (which has some 30k documents),
> and I see no problems there.
>
> I was mostly using ES 1.0.1 but finally tried 1.0.2 as well.
> I started with two instances (on two different
> servers) and one replica, but reduced that to one instance and no
> replica in order to take complications from replication out of the
> picture. This did not have any effect. So I added the 2nd instance
> again and now have two instances and no replica. The index has 6
> shards, three on each instance.
> Each instance has 8 GB of memory and 64k file descriptors configured.
> The machines have 16 GB of memory.
>
> The OS is Linux, Ubuntu 10.4. The JVM is java version "1.7.0_51",
> Java(TM) SE Runtime Environment (build 1.7.0_51-b13),
> Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode).
>
> The indexers are written in Ruby using the elasticsearch gem.
>
> The index corruption shows up as some shards being in the state
> UNASSIGNED.
> I had some luck with indexes being fixed on server restart, but
> in other cases (one replica; no replica but only one instance)
> the failure seemed to be unfixable.
>
> See below for the initial error messages.
> I do not see any errors in the response messages for the indexing
> and deletion requests.
> In the past the ES server went into a state producing huge amounts
> (>1 GB) of error messages; in my latest tests (with smaller indices)
> this did not happen (there is a difference in the number of replicas
> as well).
>
> Where else could I look to understand why the shard is failing?
> Any explanations, or at least guesses as to what might go wrong?
>
> best
> Morus
>
>
> PS:
> The initial error messages look like:
>
> [2014-04-10 15:58:14,463][WARN ][index.merge.scheduler ] [pjpp-production master] [candidates_v0004][5] failed to merge
> org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed
>     at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:252)
>     at org.apache.lucene.index.CompositeReader.getContext(CompositeReader.java:102)
>     at org.apache.lucene.index.CompositeReader.getContext(CompositeReader.java:56)
>     at org.apache.lucene.index.IndexReader.leaves(IndexReader.java:502)
>     at org.elasticsearch.index.search.child.DeleteByQueryWrappingFilter.contains(DeleteByQueryWrappingFilter.java:122)
>     at org.elasticsearch.index.search.child.DeleteByQueryWrappingFilter.getDocIdSet(DeleteByQueryWrappingFilter.java:81)
>     at org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(ApplyAcceptedDocsFilter.java:45)
>     at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:142)
>     at org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:533)
>     at org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:133)
>     at org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
>     at org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:546)
>     at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:284)
>     at org.apache.lucene.index.IndexWriter._mergeInit(IndexWriter.java:3844)
>     at org.apache.lucene.index.IndexWriter.mergeInit(IndexWriter.java:3806)
>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3659)
>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
>     at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
> [2014-04-10 15:58:14,464][WARN ][index.engine.internal ] [pjpp-production master] [candidates_v0004][5] failed engine
> org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed
>     at org.elasticsearch.index.merge.scheduler.ConcurrentMergeSchedulerProvider$CustomConcurrentMergeScheduler.handleMergeException(ConcurrentMergeSchedulerProvider.java:109)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed
>     at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:252)
>     at org.apache.lucene.index.CompositeReader.getContext(CompositeReader.java:102)
>     at org.apache.lucene.index.CompositeReader.getContext(CompositeReader.java:56)
>     at org.apache.lucene.index.IndexReader.leaves(IndexReader.java:502)
>     at org.elasticsearch.index.search.child.DeleteByQueryWrappingFilter.contains(DeleteByQueryWrappingFilter.java:122)
>     at org.elasticsearch.index.search.child.DeleteByQueryWrappingFilter.getDocIdSet(DeleteByQueryWrappingFilter.java:81)
>     at org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(ApplyAcceptedDocsFilter.java:45)
>     at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:142)
>     at org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:533)
>     at org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:133)
>     at org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
>     at org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:546)
>     at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:284)
>     at org.apache.lucene.index.IndexWriter._mergeInit(IndexWriter.java:3844)
>     at org.apache.lucene.index.IndexWriter.mergeInit(IndexWriter.java:3806)
>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3659)
>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
>     at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
>
> --
> You received this message because you are subscribed to the Google Groups "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/20140410170122.24d09521%40tucholsky.experteer.muc.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAGCwEM9FCYssPQvoX2gJ2U%2By6%3DOmuHrprT1_8WrqOom-fPhF%3DQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
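The edge cases the report describes (deleting documents that may already be gone, indexing child documents whose parent may be absent) come down to the bulk requests the Ruby indexer sends. As a rough sketch of the ES 1.x `_bulk` wire format those operations use — the index, type, id, and field names below are invented for illustration, not taken from the report:

```ruby
require "json"

# ES 1.x bulk payloads are newline-delimited JSON: one action/metadata
# line, optionally followed by a source line, with a trailing newline.
def bulk_body(ops)
  ops.flat_map { |action, meta, doc|
    lines = [JSON.generate(action => meta)]
    lines << JSON.generate(doc) if doc
    lines
  }.join("\n") + "\n"
end

body = bulk_body([
  # child document routed to a parent that may or may not exist
  [:index,  { _index: "candidates_v0004", _type: "child_doc",
              _id: "c1", _parent: "p1" }, { name: "example" }],
  # delete of a document that may already be gone ("not found" expected)
  [:delete, { _index: "candidates_v0004", _type: "candidate",
              _id: "gone" }, nil]
])
puts body
```

In practice the elasticsearch gem's `bulk` method builds this newline-delimited payload itself; the sketch just makes the wire format explicit, which can help when logging exactly what reaches the server before a shard fails.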
