Well, I'm used to running demos where I can index around 8k to 10k docs per 
second on my laptop (SSD drives).
I think the biggest bottleneck you are likely to hit is reading your source 
documents, not writing them to Elasticsearch.

With a single index, I would probably reindex the 400,000 docs every day into 
a new, clean index and then switch the alias from the old index to the new one.

But it depends on your read rate, I guess.
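
The reindex-and-switch idea above can be sketched as a sequence of REST 
requests. This is only a rough outline under assumptions: the alias and index 
names ("products", "products_2014_11_03", "products_2014_11_04") and the helper 
name build_daily_reindex_plan are hypothetical, the "1s" refresh interval is 
assumed to be the value you want once loading is done, and the bulk load itself 
is left out. The code only builds the request descriptions and does not talk to 
a cluster. It also folds in the refresh-interval tip from earlier in the thread.

```python
import json


def build_daily_reindex_plan(alias, old_index, new_index):
    """Return the ordered (method, path, body) requests for one rebuild cycle.

    The names passed in are illustrative; nothing is sent to a cluster here.
    """
    return [
        # 1. Create the fresh index with periodic refresh disabled for the load.
        ("PUT", f"/{new_index}",
         {"settings": {"index": {"refresh_interval": "-1"}}}),
        # 2. (Bulk-index all documents into new_index at this point.)
        # 3. Re-enable periodic refresh (assumed value) and refresh once so
        #    the freshly loaded documents become searchable.
        ("PUT", f"/{new_index}/_settings",
         {"index": {"refresh_interval": "1s"}}),
        ("POST", f"/{new_index}/_refresh", None),
        # 4. Move the alias in a single _aliases call, so searches against the
        #    alias see either the old index or the new one, never neither.
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]}),
    ]


if __name__ == "__main__":
    plan = build_daily_reindex_plan(
        "products", "products_2014_11_03", "products_2014_11_04")
    for method, path, body in plan:
        print(method, path, "" if body is None else json.dumps(body))
```

Once the alias points at the new index, the old one can be deleted. Searches 
should always go through the alias so the switch is invisible to clients.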

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr 
<https://twitter.com/elasticsearchfr> | @scrutmydocs 
<https://twitter.com/scrutmydocs>



> On 3 Nov 2014, at 23:43, Ori P <[email protected]> wrote:
> 
> And if I may ask, do you have a suggestion on how to update the single index? 
> I need to replace on a daily basis a bulk of about 20,000 documents at once, 
> with as little performance and data availability implications as possible.
> 
> On Tuesday, November 4, 2014 12:21:51 AM UTC+2, David Pilato wrote:
> Hmmm. Sounds like I misread what you explained in 2.
> 
> I missed the fact you want to have one index per store. So let me change my 
> answer.
> If a single index with one shard can hold your 400,000 docs, which sounds 
> reasonable to me, then a single index will be faster than querying 20 
> indices.
> 
> My 2 cents
> 
> -- 
> David Pilato | Technical Advocate | Elasticsearch.com 
> <http://elasticsearch.com/>
> @dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr 
> <https://twitter.com/elasticsearchfr> | @scrutmydocs 
> <https://twitter.com/scrutmydocs>
> 
> 
> 
>> On 3 Nov 2014, at 23:01, Ori P <[email protected]> wrote:
>> 
>> Thanks for replying David.
>> 
>> I thought approach 2 might be problematic since the alias on multiple 
>> indices would cause a query to run on every index separately, which I 
>> thought might slow things down. Apparently I was wrong?
>> 
>> And thanks for the tip about the refresh interval :)
>> 
>> On Monday, November 3, 2014 11:54:38 PM UTC+2, David Pilato wrote:
>> I don't see any benefit of solution 1.
>> 
>> I would definitely do solution 2.
>> 
>> I don't really think you would see a difference search-time-wise. But in 
>> terms of I/O, 2 is better.
>> Also, you should set the refresh interval to -1 while indexing and call 
>> refresh after the bulk load.
>> 
>> HTH
>> 
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>> 
>> On 3 Nov 2014, at 21:31, Ori P <[email protected]> wrote:
>> 
>>> I would appreciate your suggestions in helping me design my elasticsearch 
>>> index.
>>> 
>>> I'm intending to index product feeds from about 20 online stores, each 
>>> store having no more than 20,000 products. Each product has about 15 basic 
>>> fields.
>>> Most of the searches would be done on specific product categories, and not 
>>> specific stores.
>>> 
>>> Each store feed is updated every few days (each store separately), by 
>>> receiving an XML file containing all the products in the store (no deltas). 
>>> Each update, I need to remove from my index all the existing products from 
>>> that store and add the new ones.
>>> 
>>> I thought of two possible approaches:
>>> 
>>> 1. Create a single index + an alias to that index. Once a new feed is 
>>> received, clone the existing index to a new index, remove from the new 
>>> index all the old products, add the new products and finally change the 
>>> alias to point to the new index.
>>> 
>>> 2. Create an index for each store, and an alias that points to all of the 
>>> indices. Once a new feed is received, just index it from scratch, remove 
>>> the old store index from the alias and add the new one.
>>> 
>>> I'm not sure which way will give me faster search results. Or maybe there 
>>> is an even better approach I didn't think of...
>>> 
>>> Thanks in advance,
>>> 
>>> Ori
>>> 
>>> -- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to [email protected] <>.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/34f2766d-cada-4ba9-a4fa-961c34aa2f8b%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/elasticsearch/34f2766d-cada-4ba9-a4fa-961c34aa2f8b%40googlegroups.com?utm_medium=email&utm_source=footer>.
>>> For more options, visit https://groups.google.com/d/optout 
>>> <https://groups.google.com/d/optout>.

