You can validate the data on the client side, in your model, before serializing it to JSON, or after a complete bulk index run.
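As a rough illustration of both options, here is a minimal Python sketch. It is not a built-in Elasticsearch feature: the field names (`email`, `sku`), the regex rules, and the helpers `validate_docs` and `missing_ids` are all hypothetical. The first helper checks field formats and duplicate values before anything is serialized to JSON. The second compares the IDs you meant to index with the IDs collected afterwards, for example from a scan/scroll over the index; fetching those IDs is left out and assumed to happen elsewhere.

```python
import json
import re

# Hypothetical rules: field name -> regex the whole value must match.
FIELD_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "sku": re.compile(r"[A-Z]{3}-\d{4}"),
}


def validate_docs(docs, unique_field="email"):
    """Split docs into (valid, errors) before they are serialized to JSON."""
    seen = set()
    valid, errors = [], []
    for doc in docs:
        # Structural check: every configured field must match its pattern.
        problems = [
            f"{field}: bad format"
            for field, pattern in FIELD_PATTERNS.items()
            if not pattern.fullmatch(str(doc.get(field, "")))
        ]
        # Integrity check: reject a value already seen in this batch.
        value = doc.get(unique_field)
        if value in seen:
            problems.append(f"{unique_field}: duplicate value")
        seen.add(value)
        if problems:
            errors.append((doc.get("id"), problems))
        else:
            valid.append(doc)
    return valid, errors


def missing_ids(expected_ids, indexed_ids):
    """Return the expected IDs that a post-bulk scan did not find."""
    return sorted(set(expected_ids) - set(indexed_ids))


if __name__ == "__main__":
    docs = [
        {"id": "1", "email": "a@example.com", "sku": "ABC-1234"},
        {"id": "2", "email": "a@example.com", "sku": "ABC-1235"},
        {"id": "3", "email": "not-an-email", "sku": "abc-1"},
    ]
    valid, errors = validate_docs(docs)
    # Only the valid docs are serialized and handed to the bulk indexer.
    payload = [json.dumps(doc) for doc in valid]
    print(payload)
    print(errors)
```

Note that the duplicate check only covers the current batch. Catching a value that already exists in the index would need a lookup against Elasticsearch itself, which this sketch does not attempt.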
There are reasons why Elasticsearch is schema-less. Schema-less means allowing any number of different fields (keys) and any content in those fields (values) without any logical constraints. In a distributed system, per-field commits, per-field transactions, or integrity checking can get very expensive. Because the index is inverted, and nodes can come and go, there is a significant penalty if you want document transaction safety and document integrity checks.

I validate data in ES with the help of a large scan/scroll over the docs after bulk indexing, searching for IDs to see whether they exist or not. This is different from integrity constraint checking techniques such as the rule-based methods known from RDBMSs.

Jörg

On Sat, Feb 15, 2014 at 10:40 PM, Thierry Templier <[email protected]> wrote:
> Hello,
>
> I wonder if there is a built-in way to validate data before indexing them.
> I see two kinds of validation:
>
> * Structural validation of fields based on a regular expression for
> example. Perhaps something can be configured in the mapping...
> * Integrity validation of document. For example preventing from indexing a
> document with a field value that already exists.
>
> In the case where there is no built-in support at the moment, is there a
> way to extend ElasticSearch to add such processing before indexing using
> the standard REST calls?
>
> Thanks very much for your help!
