Hi, I have an index with a basic analyzer. The data that I store is schemaless. The mapping, I would say, just stores the document.
I add documents to ES through an MR job. The data is loaded into ES as LinkedHashMapWritable. The number of key/values varies from 15 to 40 per document, and the key/values are not the same for all the documents. I thought ES schemaless document handling would manage this. But the _source data returned with get or search contains wrong data: lots of key/values that are not part of the document, and a few key/values of the document missing. Number of documents: only about 100k.

Below is the schema.

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/'

curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_close'

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_settings' -d '
{
  "index": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["trim", "lowercase"]
        }
      }
    }
  }
}'

curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_open'

curl -XDELETE 'qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/'

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/_mapping' -d '
{
  "dedup_out": {
    "_ttl": { "enabled": true, "default": "7d" },
    "_source": { "enabled": true },
    "_all": { "enabled": false },
    "norms": { "enabled": false },
    "ignore_above": 128
  }
}'

Let me know if you need any more details.

Thanks
Shobana
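The mismatch described above (extra key/values in _source, and some of the document's own key/values missing) can be made concrete with a small check. The following is only a minimal Python sketch and not part of the actual MR job: the function name `diff_source` and the sample field names are hypothetical, and it assumes the document that was sent and the _source that came back are both available as plain dictionaries.

```python
def diff_source(expected, returned):
    """Compare the document that was sent with the _source that came back.

    Both arguments are plain dicts (field name -> value).
    """
    expected_keys = set(expected)
    returned_keys = set(returned)
    return {
        # Fields that were sent but are absent from _source.
        "missing": sorted(expected_keys - returned_keys),
        # Fields in _source that were never part of the document.
        "unexpected": sorted(returned_keys - expected_keys),
        # Fields present on both sides whose values differ.
        "changed": sorted(
            k for k in expected_keys & returned_keys
            if expected[k] != returned[k]
        ),
    }


# Hypothetical sample data, for illustration only.
sent = {"a": "1", "b": "2", "c": "3"}
got = {"a": "1", "c": "x", "d": "4"}
report = diff_source(sent, got)
print(report)
```

Running a check like this on a handful of documents would show whether the unexpected fields in one document match the fields of other documents handled by the same job, which helps narrow down whether the mix-up happens before the data reaches ES.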
