One thing to consider is whether your index really needs to stay perfectly 
in sync with your database. If a topic is viewed 1000 times over the course 
of 2 minutes, is it important that Elasticsearch update after every view? 
Instead, you could queue a reindex request after each update, but only 
process the queue once a minute and ignore duplicate requests for the same 
document. In this example you'd reindex the topic twice instead of 1000 
times, and your index would still be near-realtime. Whether that is 
acceptable depends on your business requirements.
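
To make the queue-and-deduplicate idea concrete, here is a minimal sketch in 
Python. The class name ReindexQueue, its methods, and the reindex callback 
are all hypothetical names chosen for illustration; the callback stands in 
for whatever code actually pushes documents to Elasticsearch.

```python
import threading
from typing import Callable, List, Set


class ReindexQueue:
    """Collects ids of changed documents; each id is reindexed once per flush."""

    def __init__(self, reindex: Callable[[List[str]], None]) -> None:
        # `reindex` is a hypothetical callback that sends the given ids
        # to the search index (for example via a bulk request).
        self._reindex = reindex
        self._pending: Set[str] = set()
        self._lock = threading.Lock()

    def mark_dirty(self, doc_id: str) -> None:
        """Record that a document changed. Duplicates collapse in the set."""
        with self._lock:
            self._pending.add(doc_id)

    def flush(self) -> int:
        """Reindex every pending id once and return how many were sent."""
        with self._lock:
            batch, self._pending = self._pending, set()
        if batch:
            self._reindex(sorted(batch))
        return len(batch)
```

The application would call mark_dirty() on every view and call flush() from 
a timer or scheduled job, say once a minute. A thousand views of the same 
topic between two flushes then cost a single reindex.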

If you absolutely need every update applied to your index ASAP, you'd have 
to use partial updates. If you don't need to search on viewCount at all, 
then an update to that field shouldn't trigger a reindex/update in the 
first place. We have a similar situation with a field that is 
incremented/decremented constantly, and we ignore it in our ingestion 
process. Search operates against the text fields and returns basic info 
(item descriptions, etc.) plus the URL of the canonical version of the 
resource in our API. Clients that care about the dynamic fields can then 
hit the API to get all fields, including the up-to-the-millisecond count 
fields.
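
As a rough sketch of both options, the Python below shows how an ingestion 
step might ignore viewCount, and what a partial-update body could look like 
if you do want the count in the index. The function names and the 
VOLATILE_FIELDS constant are hypothetical; the {"doc": ...} shape is the 
partial-document body accepted by the Elasticsearch Update API, and no 
client call is shown because that depends on your setup.

```python
from typing import Any, Dict

# Assumption for illustration: fields that change often but are never searched.
VOLATILE_FIELDS = frozenset({"viewCount"})


def search_document(topic: Dict[str, Any]) -> Dict[str, Any]:
    """Return the part of a topic that is worth sending to the index."""
    return {k: v for k, v in topic.items() if k not in VOLATILE_FIELDS}


def needs_reindex(old: Dict[str, Any], new: Dict[str, Any]) -> bool:
    """True only if a searchable field changed; viewCount bumps are ignored."""
    return search_document(old) != search_document(new)


def partial_update_body(view_count: int) -> Dict[str, Any]:
    """Body for a partial update that touches only the viewCount field."""
    return {"doc": {"viewCount": view_count}}
```

With the first two functions in the ingestion path, a view that only bumps 
viewCount never reaches Elasticsearch, and clients read the live count from 
the database or API instead.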

On Saturday, March 21, 2015 at 4:24:25 AM UTC-4, ooo_saturn7 wrote:
>
> I have a forum, and every topic has a viewCount field: how many times 
> the topic has been viewed by forum users. 
>
> I wanted all fields of a topic to be taken from ES 
> (id, date, title, content and viewCount). However, in this case ES must 
> reindex the entire document after every topic view. I asked a question 
> about partial updates on Stack Overflow: 
> http://stackoverflow.com/questions/28937946/partial-update-on-field-that-is-not-indexed
>
> It means that if a topic is viewed 1000 times, ES will index it 1000 times. 
> And if I have a lot of users, many documents will be indexed again and 
> again. This is the first strategy.
>
> The second strategy, I think, is to take some fields of a topic from the 
> index and some from the database. In this case I take viewCount from the 
> DB. However, then I can store all fields in the DB and use the index only 
> as an INDEX, to get the ids of topics.
>
> Are there better strategies? What is the best way to solve such a problem?
>
>
> -- 
> Александр Свиридов
>
