I realize "limit" does not cap the response size. I'm actually fine with getting more than one result, and I'm not relying on limit to control the size.
I often use size in conjunction with limit. I do this when I don't really care how many items I get back, as long as the count is within a range. I add the limit to help decrease the load on the shards. That said, I need to understand what expectations I can have around limit. Is it completely non-deterministic? Or can I have reasonable expectations about it?

I will propose an example and describe my expectations.

Node setup:
- 1 index
- 1 mapping
- 5 shards
- 1,000,000 documents sharded across the 5 shards
- 1000 matching documents sharded across the 5 shards

Let's assume an even distribution of the matching documents: 200 documents per shard. I realize it is not realistic to get an exact distribution like this.

If I place a limit of 5 on the query, I expect 25 documents back. That is, I get 5 documents from each shard. I expect this because I have at least 5 matching documents per shard (in fact, many more than 5), and I expect the limit to return five documents from each shard. I realize there are lots of real-world circumstances that would cause the query to return fewer than 25 documents. Let's ignore those for the time being and keep the assumption that the distribution is even.

Now, if I place a limit of 1 on the query, I expect 5 documents back. Are these two expectations correct?

Now let's assume a worst-case scenario: all of the matching documents are on one shard. A limit of 5 should still return 5 documents. A limit of 1 should return 1 document. If these expectations are true, then my original scenario is valid and a limit of 1 should still return 1 document.

So are these expectations valid? Or is limit completely non-deterministic?

Size does work, but if I can improve performance with a limit, I would like to do so. It is possible that I have tens of thousands of matching documents, and limit could be an excellent short-circuit. Basically, I want each shard to stop searching as soon as it has found one document.
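To make the expectation concrete, here is a minimal sketch of the arithmetic I have in mind. It only models my assumption that each shard stops once it has collected `limit` matching documents, and that size then caps the merged result. It is not a description of what Elasticsearch actually does, and `expected_hits` is a hypothetical helper, not part of any client library.

```python
from typing import List, Optional


def expected_hits(matches_per_shard: List[int], limit: int,
                  size: Optional[int] = None) -> int:
    """Hits I would expect if every shard stopped after `limit` matches.

    Assumption (mine, not confirmed): the limit is applied independently
    on each shard, so a shard contributes min(limit, its matching docs).
    """
    total = sum(min(limit, matches) for matches in matches_per_shard)
    # size, if given, caps the merged result returned to the client.
    return total if size is None else min(total, size)


even = [200, 200, 200, 200, 200]   # 1000 matches spread evenly over 5 shards
skewed = [1000, 0, 0, 0, 0]        # worst case: all matches on one shard

print(expected_hits(even, limit=5))             # 25
print(expected_hits(even, limit=1))             # 5
print(expected_hits(skewed, limit=5))           # 5
print(expected_hits(skewed, limit=1))           # 1
print(expected_hits(even, limit=5, size=10))    # 10
```

If the real behavior matches this model, then a limit of 1 returns between 1 and 5 documents whenever at least one shard has a match, which is all I need.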
Also, I don't have the document _id, so I cannot make the HEAD call. Do these clarifications help?
