I tried to reply earlier, but it seems Google lost that reply. My suggestion would be to create a v1_new index that has the same mappings as v1. When you are ready to migrate to v2:

1. Change indexing to go to v1_new.
2. Change searches to cover both v1 and v1_new (using an alias or the query string).
3. Copy v1 to v2.
4. Change indexing to go to v2, and searches to go to v2.
5. Copy v1_new to v2.

This lets you keep indexing while the copy runs, and the documents that arrived in the meantime are easy to identify because they are all in v1_new.
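If you go the alias way for the search side, the two search switches could look roughly like the sketch below. It only builds the request bodies for the POST /_aliases endpoint and sends nothing. The alias name "search_alias" and the helper function are made up for illustration, and I am assuming v1 is already behind that alias.

```python
import json


def alias_actions(alias, add=(), remove=()):
    """Build a request body for Elasticsearch's POST /_aliases endpoint.

    All actions in one request are applied atomically, so searches never
    hit a moment where the alias points at nothing.
    """
    actions = [{"remove": {"index": idx, "alias": alias}} for idx in remove]
    actions += [{"add": {"index": idx, "alias": alias}} for idx in add]
    return {"actions": actions}


# Hypothetical alias name; v1, v1_new and v2 are the indices discussed above.
SEARCH_ALIAS = "search_alias"

# While v1 is being copied: searches cover v1 and v1_new.
# Assumes v1 is already a member of the alias.
cover_both = alias_actions(SEARCH_ALIAS, add=["v1_new"])

# Once v1 has been copied: searches move to v2 in a single swap.
switch_to_v2 = alias_actions(SEARCH_ALIAS, add=["v2"], remove=["v1", "v1_new"])

if __name__ == "__main__":
    print(json.dumps(cover_both, indent=2))
    print(json.dumps(switch_to_v2, indent=2))
```

Each body would be sent as one request, so the application keeps searching the same alias name throughout and never needs to know which indices are behind it.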
If you are OK with only searching new documents for a while, you can start indexing to v2, switch search to v2, and then start the copy. If you are OK with only searching old documents for the duration of the transfer, start indexing to v2, do the copy, and then switch search to v2.

The last option is to leave indexing and search on v1, do the copy to v2, switch indexing and search to v2, do another copy from v1, and finally optimize. This has a lot of potential problems. The second copy re-indexes every document already in v2, which essentially leaves a deleted version of each one behind, so the optimize is needed to clean that up. Also, if your indexing is applying updates, and not just adding new documents, then the second copy from v1 might overwrite some of those updates, which is not good.

If it were me and I was not OK with the 2nd or 3rd option, I would definitely go with route 1.

On Wednesday, March 11, 2015 at 10:47:59 AM UTC-6, mzrth_7810 wrote:
>
> Hey everyone,
>
> I have a question about rebuilding an index. After reading the
> elasticsearch guide and various topics here I've found that the best
> practice for rebuilding an index without any downtime is by using aliases.
> However, there are certain steps and processes around that, which I seek
> advice for. First I'm going to take you through an example scenario, and
> then I'll have some questions.
>
> For example, you have "workshop_index_v1", with an alias "workshop". The
> "workshop_index_v1" has a type called "guitar" which has three properties
> with the following mapping:
>
> "identifier" : "string"
> "make" : "string"
> "model" : "string"
>
> Let's assume there is a lot of data in workshop_index_v1/guitar at the
> moment, which has been populated from a separate database.
>
> Now, I need to modify the mapping, because I've changed the source data, I
> would like to get rid of the "identifier" property, so my mapping becomes:
>
> "make" : "string"
> "model" : "string"
>
> As we all know elasticsearch does not allow you to remove a property in
> the mapping directly, you inevitably have to rebuild the index, which is
> fine in my case.
>
> So now a few things came to mind when I thought how to do this:
>
> - Create another index "workshop_index_v2", populate it with the data
>   in "workshop_index_v1" using scroll and scan with the bulk API and later
>   remove "workshop_index_v1" and add "workshop_index_v2" to the alias.
>   - This will not work because the incorrect mapping (or a field value in
>     the incorrect mapping) is already present in "workshop_index_v1", I do
>     not want to copy everything as is.
> - Create another index "workshop_index_v2", populate it with the data
>   from the original source
>   - This works
>
> One of the big issues here is, what happens to write requests while the
> new index is being rebuilt.
>
> As you can only write to one index, which one do you write to, the old one
> or the new one, or both?
>
> I feel that writing to the new one would work. I am a beginner when it
> comes to elasticsearch, any advice regarding any of this would be greatly
> appreciated.
>
> Best regards
