I agree with the original poster that none of the existing solutions is ideal. Making it simpler and safer to roll out revised mappings would be a huge win for any use case that involves incremental revisions or refinements to an indexing strategy. A lossless solution would especially benefit the case where ES is used as the primary data source (an option we have been considering), since in that case you really can't afford to drop a record.
On Monday, February 24, 2014 9:20:56 AM UTC-5, JoeZ99 wrote:
>
> How about, while the scan is being done, let updates go to the old index
> but with an extra field? Once the alias points to the new index, it's just
> a query to fetch the fields with that new field from the old index and then
> reindex them into the new one. If the alias changing/new index creation is
> unsuccessful, then update old index to remove that new field.
>
> On Friday, February 21, 2014 3:11:52 AM UTC-5, Andrew Kane wrote:
>>
>> I tried to post a reply yesterday but it looks like it never made it.
>>
>> Thank you all for the quick replies. Here's a slightly better
>> explanation of where I believe the race condition occurs.
>>
>> When the scan/scroll starts, the alias is still pointing to the old
>> index, so updates go to the old index. Let's say you update Document 1. If
>> the scroll/scan has already passed Document 1, the new index never sees the
>> update. The three solutions you mentioned Nik are to either:
>>
>> 1. Keep track of updates manually [tedious]
>> 2. Pause the jobs that perform the updates [out of sync]
>> 3. Send updates to both indexes [also tedious]
>>
>> However, none of these seem ideal.
>>
>> - Andrew
>>
>> On Tuesday, February 18, 2014 8:41:18 PM UTC-8, Andrew Kane wrote:
>>>
>>> Hi,
>>>
>>> I've followed the documentation for zero-downtime mapping changes and it
>>> works great.
>>> http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/
>>>
>>> However, there is a (pretty big) race condition with this approach -
>>> while reindexing, changes may not make it to the new index. I've looked
>>> all over and haven't found a single solution to address this. The best
>>> attempt I've seen is to buffer updates, but this is tedious and still
>>> leaves a race condition (with a smaller window). My initial thoughts were
>>> to create a write alias that points to the old and new indices and use
>>> versioning. However, there is no way to write to multiple indices
>>> atomically.
>>>
>>> It seems like this issue should affect most Elasticsearch users (whether
>>> they realize it or not). Does anyone have a good solution to this?
>>>
>>> Thanks,
>>> Andrew
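
To make the quoted discussion concrete, here is a toy sketch of the race Andrew describes and of the marker-field catch-up JoeZ99 suggests. It is only an in-memory model under stated assumptions: plain dicts stand in for the old and new indices, the scan is modeled as a snapshot taken before the concurrent updates arrive, and the `reindex` function, the `catch_up` flag, and the `_touched_during_reindex` field name are all made up for illustration. Nothing here calls Elasticsearch.

```python
# Hypothetical marker field; not an Elasticsearch built-in.
MARKER = "_touched_during_reindex"


def reindex(old_index, new_index, updates_during_scan, catch_up=True):
    """Copy old_index into new_index while writers keep updating old_index."""
    # The scan sees the old index as it was when the scan started.
    snapshot = {doc_id: dict(doc) for doc_id, doc in old_index.items()}

    # The alias still points at the old index, so concurrent updates land
    # there. Each writer tags the document it touches with the marker field.
    for doc_id, doc in updates_during_scan:
        old_index[doc_id] = {**doc, MARKER: True}

    # The scan finishes: the new index holds only the snapshot state.
    new_index.update(snapshot)

    # (The alias would be switched to the new index at this point.)
    if not catch_up:
        return []

    # Catch-up pass: fetch the tagged documents from the old index,
    # strip the marker, and reindex them into the new index.
    touched = sorted(d for d, doc in old_index.items() if doc.get(MARKER))
    for doc_id in touched:
        doc = old_index[doc_id]
        new_index[doc_id] = {k: v for k, v in doc.items() if k != MARKER}
    return touched


def fresh_index():
    return {"1": {"title": "a"}, "2": {"title": "b"}}


update = [("1", {"title": "a-edited"})]

lossy = {}
reindex(fresh_index(), lossy, update, catch_up=False)
print(lossy["1"])  # {'title': 'a'}  (the update never reached the new index)

safe = {}
print(reindex(fresh_index(), safe, update))  # ['1']
print(safe["1"])  # {'title': 'a-edited'}
```

Even in this toy model the catch-up pass is not atomic with the alias switch: a write that reaches the new index after the switch could be overwritten by the older tagged copy from the old index, so some form of version check would probably still be needed.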
