Weird, that was the post I made yesterday morning. It only just now hit the list after vanishing.
On Thu, Mar 12, 2015 at 10:21 AM, <[email protected]> wrote:
> I switched to using aliases about a year ago and I love it. I am able to
> rebuild in the background and make a clean cutover once the process
> completes.
>
> Here are a couple of thoughts for your situation.
>
> First create a second index that has the same format as your original.
> When you are ready to start creating your final index, stop indexing to
> your original and start indexing into this new index. Queries to both
> indexes can be accomplished using a new alias, or by modifying the requests
> to include both. Now you can transfer the bulk of your data from
> workshop_index_v1 to workshop_index_v2 while workshop_index_v1_new
> continues to collect the new documents. Once the initial scan and scroll
> completes, you can cut over to workshop_index_v2 and run a scan and scroll
> against the v1_new index, which should be relatively small and allow you to
> quickly transfer those into your v2 schema.
>
> The alternative is to run the scan and scroll twice against the v1 index:
> once to build the v2 index, at which point you cut to v2, and a second time
> to pick up any documents that were added after you started your initial
> scan and scroll. This is a less than ideal scenario: it will take longer and,
> unless you add steps to check whether documents already exist, it will
> result in an index with many deletes. If you have a timestamp in your
> documents, you might be able to make this reasonable. You will certainly
> want to optimize after you complete this process.
>
> The only downside to writing to the new one is deciding which one to query
> during the transition. If you write to the v2 index, queries to v1 will
> not show new data, while queries to v2 will only show new data until the
> migration progresses. Queries that span both may be complicated as the
> mappings are different; if that is not the case, then yes, this is the easy
> way.
> If you are ok with one of the caveats, then by all means this is the
> simplest route.
>
> Aaron
>
> On Wednesday, March 11, 2015 at 10:47:59 AM UTC-6, mzrth_7810 wrote:
>>
>> Hey everyone,
>>
>> I have a question about rebuilding an index. After reading the
>> Elasticsearch guide and various topics here I've found that the best
>> practice for rebuilding an index without any downtime is by using aliases.
>> However, there are certain steps and processes around that, which I seek
>> advice for. First I'm going to take you through an example scenario, and
>> then I'll have some questions.
>>
>> For example, you have "workshop_index_v1", with an alias "workshop". The
>> "workshop_index_v1" has a type called "guitar" which has three properties
>> with the following mapping:
>>
>> "identifier" : "string"
>> "make" : "string"
>> "model" : "string"
>>
>> Let's assume there is a lot of data in workshop_index_v1/guitar at the
>> moment, which has been populated from a separate database.
>>
>> Now, I need to modify the mapping, because I've changed the source data.
>> I would like to get rid of the "identifier" property, so my mapping becomes:
>>
>> "make" : "string"
>> "model" : "string"
>>
>> As we all know, Elasticsearch does not allow you to remove a property in
>> the mapping directly. You inevitably have to rebuild the index, which is
>> fine in my case.
>>
>> So now a few things came to mind when I thought about how to do this:
>>
>>    - Create another index "workshop_index_v2", populate it with the data
>>    in "workshop_index_v1" using scroll and scan with the bulk API and later
>>    remove "workshop_index_v1" and add "workshop_index_v2" to the alias.
>>       - This will not work because the incorrect mapping (or a field value
>>       in the incorrect mapping) is already present in "workshop_index_v1".
>>       I do not want to copy everything as is.
>>    - Create another index "workshop_index_v2", populate it with the data
>>    from the original source
>>       - This works
>>
>> One of the big issues here is what happens to write requests while the
>> new index is being rebuilt.
>>
>> As you can only write to one index, which one do you write to: the old
>> one, the new one, or both?
>>
>> I feel that writing to the new one would work. I am a beginner when it
>> comes to Elasticsearch, so any advice regarding any of this would be
>> greatly appreciated.
>>
>> Best regards
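
For anyone finding this thread later, here is a rough sketch of the two request bodies the quoted discussion revolves around: the bulk body used to copy scan and scroll hits into workshop_index_v2 without the "identifier" field, and the alias change that moves "workshop" from v1 to v2 in one request. It only builds the payloads (for POST /_bulk and POST /_aliases) and does not talk to a cluster. The helper names and the sample hit are made up for illustration, and it assumes dropping "identifier" is the only difference between the two mappings.

```python
import json

# Index, alias and type names are taken from the thread.
OLD_INDEX = "workshop_index_v1"
NEW_INDEX = "workshop_index_v2"
ALIAS = "workshop"
DOC_TYPE = "guitar"


def to_v2(source):
    """Return a copy of a v1 document without the removed "identifier" field."""
    return {key: value for key, value in source.items() if key != "identifier"}


def bulk_body(hits, index=NEW_INDEX, doc_type=DOC_TYPE):
    """Build a newline-delimited bulk body from scan/scroll hits.

    Each hit is expected to carry "_id" and "_source", as search hits do.
    Reusing the original _id means a second pass over the same document
    overwrites it in the new index instead of creating a duplicate.
    """
    lines = []
    for hit in hits:
        action = {"index": {"_index": index, "_type": doc_type, "_id": hit["_id"]}}
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(to_v2(hit["_source"]), sort_keys=True))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


def alias_cutover(alias=ALIAS, old=OLD_INDEX, new=NEW_INDEX):
    """Build the body for POST /_aliases that swaps the alias in one request."""
    return {
        "actions": [
            {"remove": {"index": old, "alias": alias}},
            {"add": {"index": new, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical hit, shaped like one entry of a scroll response.
    sample_hits = [
        {"_id": "1", "_source": {"identifier": "abc", "make": "Fender", "model": "Telecaster"}},
    ]
    print(bulk_body(sample_hits), end="")
    print(json.dumps(alias_cutover(), indent=2))
```

Because the remove and add actions go in the same request, searches against "workshop" should not see a moment where the alias points at neither index.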
