Performance issues when flagging a document in Elasticsearch

Dror Atariah Wed, 10 Dec 2014 07:22:45 -0800

Assume that I want to be able to flag documents in an index according to 
their attributes: isFoo and isBar [1]. As far as I understand, there are 
two approaches:

1) Use dedicated fields for the flags: If the document is a Foo then add a
field named isFoo. Similarly, for isBar.
2) Use a flags field that will be an array of strings. In this case, if the
document is Foo then "flags" will contain the string "isFoo".
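
To make the two layouts concrete, here is a minimal sketch of how the same
document would look under each approach. The helper names are hypothetical
and only serve the illustration; the field names are the ones used above.

```python
def doc_with_dedicated_fields(is_foo, is_bar):
    """Approach (1): a flag field is present only when the flag is set."""
    doc = {}
    if is_foo:
        doc["isFoo"] = True
    if is_bar:
        doc["isBar"] = True
    return doc


def doc_with_flags_array(is_foo, is_bar):
    """Approach (2): a single "flags" field lists the names of the set flags."""
    flags = []
    if is_foo:
        flags.append("isFoo")
    if is_bar:
        flags.append("isBar")
    # Leave the field out entirely when no flag is set, so that an
    # "exists" filter on "flags" means "has at least one flag".
    return {"flags": flags} if flags else {}
```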

What are the pros and cons in terms of space and runtime complexities?

Bear in mind the following example queries. Consider the case where one
wants to check the attributes of the documents in the index. In particular,
if I want to find the documents that are either Foo *or* Bar, I can either:
(a) In case (1): Use a Boolean "should" filter that wraps two "exists"
filters, checking whether either isFoo or isBar exists.
(b) In case (2): Use a single "exists" filter that checks for the existence
of the field "flags". (This works only as long as isFoo and isBar are the
only possible flags; otherwise a "terms" filter on "flags" is needed.)
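
The "or" filters described above could be sketched as plain Python dicts,
ready to be serialized to JSON. This assumes the 1.x filter DSL ("filtered"
query wrapping a filter); the variable and helper names are made up for the
example.

```python
import json


def exists(field):
    """Build an "exists" filter for the given field."""
    return {"exists": {"field": field}}


def as_search_body(filter_clause):
    """Wrap a filter in a "filtered" query (1.x-style request body)."""
    return {"query": {"filtered": {"filter": filter_clause}}}


# Case (1): Foo *or* Bar with dedicated fields.
foo_or_bar_dedicated = {"bool": {"should": [exists("isFoo"), exists("isBar")]}}

# Case (2): "has any flag at all". Equivalent to Foo *or* Bar only while
# isFoo and isBar are the only flags in use.
any_flag = exists("flags")

# Case (2), robust against additional flags: match either value explicitly.
foo_or_bar_flags = {"terms": {"flags": ["isFoo", "isBar"]}}

if __name__ == "__main__":
    for clause in (foo_or_bar_dedicated, any_flag, foo_or_bar_flags):
        print(json.dumps(as_search_body(clause), indent=2))
```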

A different case is if I want to find the documents that are both Foo
*and* Bar:
(a) In case (1): As before, but replace the "should" with a "must".
(b) In case (2): Wrap two "term" filters in a Boolean "must" filter.
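
A sketch of the two "and" filters, again as Python dicts in the 1.x filter
DSL. One assumption worth stating: a "term" filter does not analyze its
input, so for "isFoo" to match, the "flags" field would need to be mapped as
not_analyzed (with the default analyzer the indexed token would be
lowercased). The mapping fragment below is illustrative, not taken from the
post.

```python
def exists(field):
    """Build an "exists" filter for the given field."""
    return {"exists": {"field": field}}


def term(field, value):
    """Build a "term" filter matching one exact, unanalyzed value."""
    return {"term": {field: value}}


# Case (1): Foo *and* Bar with dedicated fields.
foo_and_bar_dedicated = {"bool": {"must": [exists("isFoo"), exists("isBar")]}}

# Case (2): Foo *and* Bar with the "flags" array; both values must be present.
foo_and_bar_flags = {
    "bool": {"must": [term("flags", "isFoo"), term("flags", "isBar")]}
}

# Assumed mapping for case (2), so that the stored tokens keep their case.
flags_mapping = {
    "properties": {"flags": {"type": "string", "index": "not_analyzed"}}
}
```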

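Lastly, consider finding the documents that are Foo but *not* Bar: in both
cases a "must" clause for Foo is combined with a "must_not" clause for Bar
(or, in case (1), with a "missing" filter on isBar).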
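
For the "Foo but not Bar" case, here is a sketch of the filters for both
layouts, together with a toy in-memory matcher that checks they select the
same documents. The matcher is purely illustrative: it only understands the
few filter types used here and is in no way how Elasticsearch evaluates
filters. All names are hypothetical.

```python
def matches(doc, flt):
    """Toy evaluation of exists/missing/term/bool filters on a dict."""
    (kind, body), = flt.items()
    if kind == "exists":
        return body["field"] in doc
    if kind == "missing":
        return body["field"] not in doc
    if kind == "term":
        (field, value), = body.items()
        return value in doc.get(field, [])
    if kind == "bool":
        return (all(matches(doc, c) for c in body.get("must", []))
                and not any(matches(doc, c) for c in body.get("must_not", [])))
    raise ValueError("unsupported filter: " + kind)


def select(docs, flt):
    """Return the positions of the documents that pass the filter."""
    return [i for i, doc in enumerate(docs) if matches(doc, flt)]


# Case (1): dedicated fields, with "must_not" or with a "missing" filter.
foo_not_bar_dedicated = {"bool": {
    "must": [{"exists": {"field": "isFoo"}}],
    "must_not": [{"exists": {"field": "isBar"}}],
}}
foo_not_bar_missing = {"bool": {
    "must": [{"exists": {"field": "isFoo"}}, {"missing": {"field": "isBar"}}],
}}

# Case (2): "flags" array.
foo_not_bar_flags = {"bool": {
    "must": [{"term": {"flags": "isFoo"}}],
    "must_not": [{"term": {"flags": "isBar"}}],
}}

# The same four documents in both layouts: Foo, Bar, Foo+Bar, unflagged.
docs_dedicated = [{"isFoo": True}, {"isBar": True},
                  {"isFoo": True, "isBar": True}, {}]
docs_flags = [{"flags": ["isFoo"]}, {"flags": ["isBar"]},
              {"flags": ["isFoo", "isBar"]}, {}]
```

In this toy setting, select(docs_dedicated, foo_not_bar_dedicated) and
select(docs_flags, foo_not_bar_flags) both pick only the first document.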

The bottom line: in case (1), all queries boil down to a mixture of
Boolean, "exists" and "missing" filters. In case (2), one has to match
strings in the array named "flags". My intuition is that method (1) is
faster. In terms of space complexity, I believe there is no difference.

I'm looking forward to your insights!
Dror

[1]: Obviously, there could be way more flags...
