Hi

I have an index with a basic analyzer. The data that I store is schemaless.
The mapping only says to store the document.

I add documents to ES through an MR job. The data, held in a
LinkedHashMapWritable, is loaded into ES.
The number of key/values varies from 15 to 40 per document, and the
key/values are not the same for all documents. I thought ES's
schemaless document handling would manage that.
But the _source data returned by a get or search contains wrong data: lots
of key/values that are not part of the document, and a few key/values of the
document are missing.
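
To pin down the symptom for a single document, the fields that were sent can be compared with the _source that comes back. Below is a minimal Python sketch of that check, not part of the actual job: the function name compare_source and the sample field names and values are made up for illustration, and it assumes the GET response body is JSON with the stored fields under "_source".

```python
import json


def compare_source(expected, get_response_body):
    """Compare the fields sent for one document with the _source ES returns.

    expected: dict of the key/values that were sent for the document.
    get_response_body: raw JSON text returned by a GET on that document.
    Returns (extra, missing, changed) as sorted lists of field names.
    """
    source = json.loads(get_response_body).get("_source", {})
    sent_keys = set(expected)
    stored_keys = set(source)
    extra = sorted(stored_keys - sent_keys)      # in _source but never sent
    missing = sorted(sent_keys - stored_keys)    # sent but absent from _source
    changed = sorted(
        key for key in sent_keys & stored_keys if expected[key] != source[key]
    )
    return extra, missing, changed


# Made-up sample data, only to show the shape of the check.
sent = {"user": "u1", "page": "home", "ts": "1400000000"}
response = json.dumps({
    "_index": "fpti",
    "_type": "dedup_out",
    "_id": "1",
    "_source": {"user": "u1", "ts": "1400000000", "browser": "ff"},
})

extra, missing, changed = compare_source(sent, response)
print("extra:", extra)
print("missing:", missing)
print("changed:", changed)
```

Running this over a handful of document ids would show whether the extra key/values look like fields that belong to other documents.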


Number of documents: only about 100k.

Below are the index settings and the mapping.

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/'

curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_close'


curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_settings' -d '
{
  "index": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["trim", "lowercase"]
        }
      }
    }
  }
}'

curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_open'


curl -XDELETE 'qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/'

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/_mapping' -d '
{
  "dedup_out" : {
    "_ttl" : { "enabled" : true, "default" : "7d" },
    "_source" : { "enabled" : true },
    "_all" : { "enabled" : false },
    "norms" : { "enabled" : false },
    "ignore_above" : 128
  }
}'

Let me know if you need any more details.

Thanks
Shobana
