I'm starting a project to index log files, and I don't particularly want to 
wait until the log files roll over.  There will be files from hundreds of 
apps running across hundreds of machines (not all apps intersect with all 
machines, but you get the drift).  Some roll over very fast; some may take days.

The problem is that if I am constantly reindexing the same document 
(same id), am I losing all the old space (store and/or index), or is 
Elasticsearch/Lucene smart enough to say "here's a new version; we'll 
overwrite the old store/index entries where they are the same, point to 
this one, and add the new ones"?
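To make the pattern concrete, here's a rough sketch of what I mean. The index name, id scheme, and cluster URL are all placeholders; the point is just that every time a log file grows, the whole file gets re-PUT under the same _id, so Elasticsearch sees an update of an existing document rather than a new one:

```python
import json

def build_index_request(app, host, path, content):
    """Build the REST call for indexing one log file.

    PUT with an explicit id means "index this document under this _id",
    replacing whatever version was there before.
    """
    doc_id = f"{app}-{host}-{path.replace('/', '_')}"  # stable id per log file
    url = f"http://localhost:9200/logs/_doc/{doc_id}"  # placeholder cluster/index
    body = json.dumps({"app": app, "host": host, "path": path, "content": content})
    return "PUT", url, body

# First pass: index the file as it stands.
first = build_index_request("billing", "web01", "/var/log/billing.log",
                            "line 1\n")
# Later pass: the file grew, but the id is unchanged, so this request
# overwrites the previous version of the document instead of adding one.
second = build_index_request("billing", "web01", "/var/log/billing.log",
                             "line 1\nline 2\n")

assert first[1] == second[1]  # same URL, hence same _id: an update, not a new doc
```

So the question above is really: when that second PUT lands, does the space held by the first version get reused, or does it pile up?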

Certainly, there is a more sophisticated model that treats every line as a 
unique document/row so that this doesn't become an issue, but I'm not 
ready to spend that kind of dev and hardware effort on this issue.  (Our 
Elasticsearch solution is wrapped in a system that becomes really 
heavy-handed when indexing such small pieces.)

--Shannon Monasco

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.