Weird, that was the post I made yesterday morning. It only just now hit the
list after vanishing.

On Thu, Mar 12, 2015 at 10:21 AM, <[email protected]> wrote:

> I switched to using aliases about a year ago and I love it.  I am able to
> rebuild in the background and make a clean cutover once the process
> completes.
>
> Here are a couple of thoughts for your situation.
>
> First, create a second index that has the same mapping as your original
> (call it workshop_index_v1_new).  When you are ready to start building your
> final index, stop indexing into your original and start indexing into this
> new index.  You can query both indexes through a new alias, or by modifying
> the requests to name both.  Now you can transfer the bulk of your data from
> workshop_index_v1 to workshop_index_v2 while workshop_index_v1_new
> continues to collect the new documents.  Once the initial scan and scroll
> completes, you can cut over to workshop_index_v2 and run a scan and scroll
> against the v1_new index, which should be relatively small and allow you to
> quickly transfer those documents into your v2 schema.
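
A minimal sketch of the alias changes described above, assuming the alias is
called "workshop" (as in the original question) and the interim index is named
workshop_index_v1_new (my reading of "the v1_new index"). It only builds the
JSON bodies you would POST to the _aliases endpoint; sending them is left to
whatever client you use.

```python
import json


def add_to_alias(alias, indices):
    """Build an _aliases body that adds every index in `indices` to `alias`."""
    return {"actions": [{"add": {"index": i, "alias": alias}} for i in indices]}


def cutover(alias, old_indices, new_index):
    """Build one _aliases body that swaps `alias` from the old indices to the new one.

    All actions in a single _aliases request are applied atomically, so
    searches never see the alias pointing at nothing.
    """
    actions = [{"remove": {"index": i, "alias": alias}} for i in old_indices]
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


if __name__ == "__main__":
    # During the rebuild: also search the interim index through the alias.
    print(json.dumps(add_to_alias("workshop", ["workshop_index_v1_new"])))
    # After the bulk copy finishes: point the alias at v2 only.
    print(json.dumps(cutover(
        "workshop",
        ["workshop_index_v1", "workshop_index_v1_new"],
        "workshop_index_v2",
    )))
```

Whether you drop the interim index from the alias at cutover or only after its
documents have been copied into v2 is a judgment call; the helper takes the
list of indices to remove so either order works.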
>
> The alternative is to run the scan and scroll twice against the v1 index:
> once to build the v2 index, at which point you cut over to v2, and a second
> time to pick up any documents that were added after you started your
> initial scan and scroll.  This is less than ideal: it will take longer and,
> unless you add a step to check whether documents already exist, it will
> leave the index with many deleted documents.  If you have a timestamp in
> your documents, you might be able to make this reasonable.  You will
> certainly want to optimize after you complete this process.
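
As a rough illustration of the timestamp idea, the second pass could be
limited to documents changed since the first scan began. The field name
"updated_at" is hypothetical (the mapping in the question has no timestamp),
and the function only builds the search body for the second scan and scroll.

```python
from datetime import datetime, timezone


def catch_up_query(field, started_at):
    """Search body matching documents whose `field` is at or after `started_at`."""
    return {"query": {"range": {field: {"gte": started_at.isoformat()}}}}


if __name__ == "__main__":
    # Record this moment just before starting the first scan and scroll.
    first_pass_started = datetime(2015, 3, 11, 16, 47, 59, tzinfo=timezone.utc)
    print(catch_up_query("updated_at", first_pass_started))
```

This only helps if every write to v1 sets or updates that field; documents
without it would be missed by the second pass.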
>
> The only downside to writing to the new one is deciding which one to query
> during the transition.  If you write to the v2 index, queries to v1 will
> not show new data, while queries to v2 will show only new data until the
> migration progresses.  Queries that span both may be complicated because
> the mappings are different; if they are not, then yes, this is the easy
> way.  If you are OK with one of those caveats, then by all means this is
> the simplest route.
>
> Aaron
>
> On Wednesday, March 11, 2015 at 10:47:59 AM UTC-6, mzrth_7810 wrote:
>>
>> Hey everyone,
>>
>> I have a question about rebuilding an index. After reading the
>> elasticsearch guide and various topics here I've found that the best
>> practice for rebuilding an index without any downtime is by using aliases.
>> However, there are certain steps and processes around that, which I seek
>> advice for. First I'm going to take you through an example scenario, and
>> then I'll have some questions.
>>
>> For example, you have "workshop_index_v1", with an alias "workshop". The
>> "workshop_index_v1" has a type called "guitar" which has three properties
>> with the following mapping:
>>
>> "identifier" : "string"
>> "make" : "string"
>> "model" : "string"
>>
>> Let's assume there is a lot of data in workshop_index_v1/guitar at the
>> moment, which has been populated from a separate database.
>>
>> Now I need to modify the mapping because I've changed the source data.
>> I would like to get rid of the "identifier" property, so my mapping becomes:
>>
>> "make" : "string"
>> "model" : "string"
>>
>> As we all know, elasticsearch does not allow you to remove a property from
>> the mapping directly; you inevitably have to rebuild the index, which is
>> fine in my case.
>>
>> So now a few things came to mind when I thought how to do this:
>>
>>    - Create another index "workshop_index_v2", populate it with the data
>>    in "workshop_index_v1" using scroll and scan with the bulk API and later
>>    remove "workshop_index_v1" and add "workshop_index_v2" to the alias.
>>       - This will not work because the incorrect mapping (or a field value
>>       in the incorrect mapping) is already present in "workshop_index_v1";
>>       I do not want to copy everything as is.
>>    - Create another index "workshop_index_v2", populate it with the data
>>    from the original source
>>       - This works
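
If the only change is dropping a field, the first option can still work as
long as the copy transforms each document rather than copying it as is. A
minimal sketch, assuming the hits come from a scan and scroll over
"workshop_index_v1" and that "identifier" is the only field to drop; it builds
the newline-delimited body for the bulk API and leaves the HTTP call to your
client.

```python
import json


def to_bulk_lines(hits, target_index, drop_fields=("identifier",)):
    """Turn search hits into a bulk API body that indexes them into `target_index`."""
    lines = []
    for hit in hits:
        # Copy the source without the fields that no longer exist in the new mapping.
        source = {k: v for k, v in hit["_source"].items() if k not in drop_fields}
        meta = {"_index": target_index, "_id": hit["_id"]}
        # Keep the type when the hit carries one (e.g. "guitar" in this thread).
        if "_type" in hit:
            meta["_type"] = hit["_type"]
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(source))
    # Every line of a bulk body, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = [{
        "_id": "1",
        "_type": "guitar",
        "_source": {"identifier": "abc", "make": "Fender", "model": "Stratocaster"},
    }]
    print(to_bulk_lines(sample, "workshop_index_v2"), end="")
```

Reusing the original _id means a repeated copy overwrites the same document
in the new index instead of creating a duplicate.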
>>
>> One of the big issues here is, what happens to write requests while the
>> new index is being rebuilt.
>>
>> As you can only write to one index, which one do you write to, the old
>> one or the new one, or both?
>>
>> I feel that writing to the new one would work. I am a beginner when it
>> comes to elasticsearch, so any advice regarding any of this would be
>> greatly appreciated.
>>
>> Best regards
>>
>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/U40jRfvA-ZM/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/c1a1f011-4d4f-4dba-b7f5-6899d4fe671e%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/c1a1f011-4d4f-4dba-b7f5-6899d4fe671e%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

