On Thu, Jun 12, 2014 at 9:31 AM, Ralph Meijer <[email protected]> wrote:

> On 2014-06-12 09:23, Adrien Grand wrote:
> > How many of these 100M documents have been indexed with Elasticsearch
> > 1.2.0?
>
> All of them, and I have like 40 of them, with their size a similar order
> of magnitude :-/
>

If all of them were indexed in 1.2.0 then it makes sense to just reindex
from _source as you suggested.


> > The tool would only send indexing requests for documents that have
> > been misrouted, so if you only see 8 indexing requests per second, which
> > is indeed low, that might mean that the bottleneck is searching for the
> > mis-routed documents.
>
> I couldn't find the source of the tool on GitHub, but maybe if I had
> some insight into how that search works, I could change some settings.
> Ideas welcome, of course. If it helps, I'm in #elasticsearch as ralphm.
>

It runs a SCAN search with a script filter (the one you added to your
config/scripts). Each matching document is then reindexed to the right
shard, either with a create operation (copy_if_missing), which leaves
alone any document that already has the same _id, or by overwriting such
a document (copy_overwrite).
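
To make the two modes concrete, here is a rough in-memory sketch of that
flow. It is not the tool's source: shard_for is a stand-in hash rather than
Elasticsearch's routing function, fix_misrouted is a hypothetical name, and
plain dicts play the role of shards. It also leaves the misrouted copy where
it was, since cleanup is a separate question.

```python
import zlib


def shard_for(doc_id, num_shards):
    # Stand-in routing function (assumption): NOT Elasticsearch's real hash.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def fix_misrouted(shards, mode="copy_if_missing"):
    """Copy each misrouted document to the shard its _id routes to.

    shards: list of dicts mapping _id -> _source, one dict per shard.
    Returns the number of indexing requests issued.
    """
    if mode not in ("copy_if_missing", "copy_overwrite"):
        raise ValueError("unknown mode: %s" % mode)
    num_shards = len(shards)

    # "Scan" phase: collect every document sitting on the wrong shard.
    # Collecting first avoids mutating the dicts while iterating over them.
    misrouted = [
        (doc_id, source, shard_for(doc_id, num_shards))
        for current, docs in enumerate(shards)
        for doc_id, source in docs.items()
        if shard_for(doc_id, num_shards) != current
    ]

    # Reindex phase: one indexing request per misrouted document.
    requests = 0
    for doc_id, source, target in misrouted:
        requests += 1
        if mode == "copy_if_missing" and doc_id in shards[target]:
            continue  # the create is rejected; the existing document is kept
        shards[target][doc_id] = source
    return requests
```

Only misrouted documents produce an indexing request here, so a low request
rate can simply mean the scan phase is finding few matches or finding them
slowly.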

-- 
Adrien Grand

