Hi all,

I have a test index with 43 million documents. Each document has a string 
document value field (doc_value) of about 5-10 characters.

Mapping is:

{
  "myindex" : {
    "mappings" : {
      "num_type" : {
        "_type" : {
          "store" : true
        },
        "properties" : {
          "doc_value" : {
            "type" : "string",
            "doc_values_format" : "default"
          },
          "int1" : {
            "type" : "integer",
            "index" : "analyzed",
            "store" : true
          },
          "int2" : {
            ...

I need to retrieve only the document values, for queries that may return 
result sets of about 100,000 documents. I do not need ranking or anything 
else that would slow this down.

 

My understanding is that if the query is only a filter, no ranking is 
computed, so it should be faster.

Here is a small Python program to test it:


import elasticsearch

es = elasticsearch.Elasticsearch()

# Filter-only request: match_all narrowed by a term filter, returning only doc_value.
body = {
    "fields": ["doc_value"],
    "size": 1000,
    "query": {"filtered": {
        "query": {"match_all": {}},
        "filter": {"term": {"r_int3": 929}}
    }}
}

# Start the scan; this first response is only used for its scroll id.
results = es.search(index="myindex", doc_type="num_type", body=body,
                    scroll="10s", search_type="scan")

# Keep pulling pages until one comes back empty.
while True:
    results = es.scroll(scroll_id=results["_scroll_id"], scroll="10s")
    if len(results["hits"]["hits"]) <= 0:
        break
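
For reference, the loop above only pages through the results. Below is a rough sketch of how the values themselves could be collected. The helper name collect_field_values is made up for illustration, and I am assuming that each hit carries the requested fields as lists under a "fields" key and that the client still accepts search_type="scan", as in the version I am testing with:

```python
def collect_field_values(client, index, doc_type, body, field, scroll="10s"):
    """Run a scan/scroll search and return every value of `field` that comes back.

    `client` is anything exposing search() and scroll() the way the
    elasticsearch-py client does (hypothetical helper, not a library function).
    """
    # Open the scan; this first response is only used for its scroll id.
    page = client.search(index=index, doc_type=doc_type, body=body,
                         scroll=scroll, search_type="scan")
    values = []
    while True:
        page = client.scroll(scroll_id=page["_scroll_id"], scroll=scroll)
        hits = page["hits"]["hits"]
        if not hits:
            # An empty page means the scroll is exhausted.
            break
        for hit in hits:
            # Assumed response shape: requested fields arrive as lists under "fields".
            values.extend(hit.get("fields", {}).get(field, []))
    return values
```

With the client and request body from my program this would be called as collect_field_values(es, "myindex", "num_type", body, "doc_value").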

 

The query runs pretty slowly, and I see a huge number of accesses to the 
*.fdt (field data) file.

But I am asking for a document value field, so why does ES access the *.fdt 
file?

Thanks a lot in advance.

 

