Why are you putting this kind of business logic in ES? It belongs in your workflow. At the ES indexer level you will have no idea of the source of truth for the questionable content. Unless you're web crawling, in which case you're using the wrong search platform altogether, imo.
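
To make "it belongs in your workflow" concrete, here is a rough sketch of a filter that sits in the application before anything is sent to ES. It is only an illustration under assumptions: the size limit, the `source` field, the `noisy-batch-job` value, and all function names are made up, and the real rules would be whatever criteria you already use to spot the bad documents.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical limit; set it to whatever "too big" means for your cluster.
MAX_DOC_BYTES = 1_000_000

Doc = Dict[str, object]
Rule = Callable[[Doc], bool]  # a rule returns True when the document must be rejected


def too_big(doc: Doc) -> bool:
    """Reject documents whose JSON encoding exceeds MAX_DOC_BYTES."""
    return len(json.dumps(doc).encode("utf-8")) > MAX_DOC_BYTES


def from_blocked_source(doc: Doc) -> bool:
    """Reject documents from a source known to produce bad content.

    The "source" field and the blocked value are assumptions for this sketch.
    """
    return doc.get("source") in {"noisy-batch-job"}


REJECT_RULES: List[Rule] = [too_big, from_blocked_source]


def accepted(docs: Iterable[Doc], rules: Iterable[Rule] = REJECT_RULES) -> Iterator[Doc]:
    """Yield only the documents that no rule rejects."""
    active_rules = list(rules)
    for doc in docs:
        if not any(rule(doc) for rule in active_rules):
            yield doc
```

Whatever `accepted(...)` yields is what you then hand to your indexing code. Because the check runs in the workflow, it can still see where each document came from, which the indexer cannot.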
On Friday, December 12, 2014 5:11:05 PM UTC-5, Konstantin Erman wrote:
>
> I noticed that occasionally I need to shield my ES cluster from some
> documents, which are too many or too big or otherwise poison ES.
> Usually I can formulate pretty easy query or criteria to detect those
> documents and I'm looking for a way to block them from entering the index.
>
> Is there such pre-indexing filtering mechanism? May be Transforms can be
> used for that purpose?
>
> Thank you!
> Konstantin
