Thanks for your answer. I probably posted too few details, so here is a better
description:
I'm using the postings highlighter, but I also checked plain and fvh; both were
remarkably slower in my case.
The text_content* fields are mapped through a dynamic template:
"dynamic_templates": [
  {
    "text_content": {
      "match": "text_content*",
      "match_mapping_type": "string",
      "mapping": {
        "type": "string",
        "analyzer": "polish",
        "index_options": "offsets"
      }
    }
  }
]
The polish analyzer is defined as follows (using this plugin:
https://github.com/monterail/elasticsearch-analysis-morfologik which
provides the morfologik_stem token filter):
"analyzer": {
  "polish": {
    "type": "custom",
    "tokenizer": "standard",
    "filter": [
      "lowercase",
      "morfologik_stem"
    ]
  }
}
Pure querying on text_content is always very fast; it takes <40ms.
The total time to execute the request increases as the number of
matched documents grows (more items added to the index).
So I've started thinking that the highlighter runs on all of the matched
documents, not only on the items requested by the current request (start and size
parameters). Is that correct? Is there some way to speed up such a case (forcing
it to highlight only the requested window of documents)?
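
One way to check where the time goes is to send the same request twice, once with and once without the highlight section, and compare the total request times. Below is a minimal Python sketch that only builds the two request bodies. The helper name build_search_body and its parameters are hypothetical, not part of any Elasticsearch client; the field names and the "lorem" phrase come from this thread, and sending the bodies is left to whatever client is in use.

```python
import json


def build_search_body(phrase, start=0, size=40, highlight=True):
    """Build a search request body like the one in this thread.

    Hypothetical helper for illustration only.
    """
    body = {
        "query": {"match": {"text_content": phrase}},
        "from": start,
        "size": size,
    }
    if highlight:
        # Highlight every per-page field, as in the original request.
        body["highlight"] = {"fields": {"text_content_*": {}}}
    return body


with_highlight = build_search_body("lorem")
without_highlight = build_search_body("lorem", highlight=False)

print(json.dumps(with_highlight, indent=2))
print(json.dumps(without_highlight, indent=2))
```

If the request without the highlight section stays around 40ms while the other one keeps growing, the extra time is being spent in highlighting rather than in the query itself.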
Karol
2015-01-19 1:31 GMT+01:00 Nikolas Everett <[email protected]>:
> Highlighting is complex and more hacky than you'd imagine at first glance.
> Each highlighter is different and we can't tell which one you are using
> without seeing your mapping. For the plain highlighter the cost is roughly
> proportional to the length of the highlighted field. So in your case it's
> the cost to reanalyze every one of those pages.
>
> You could return which page is matched pretty cheaply if you were willing
> to write a plugin. Especially if you just wanted to know the first page or
> something.
>
> You could try using explain if you searched for text_content_*. That'd
> tell you which field matched.
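
A rough Python sketch of the explain idea: the request searches the per-page fields through a wildcard and sets "explain": true, so each hit carries an _explanation object, and a small helper walks that tree looking for field names. The helper name matched_fields is hypothetical, and the leaf description format it assumes (for example "weight(text_content_3:lorem in 0)") is an assumption not confirmed in this thread; the exact wording may differ between Elasticsearch/Lucene versions.

```python
import re

# Request body: search all per-page fields and ask for score explanations.
explain_body = {
    "explain": True,
    "query": {
        "multi_match": {
            "query": "lorem",
            "fields": ["text_content_*"],
        }
    },
    "size": 40,
}

# ASSUMPTION: leaf descriptions look like "weight(text_content_3:lorem in 0) ..."
FIELD_PATTERN = re.compile(r"weight\((text_content_\d+):")


def matched_fields(explanation):
    """Collect per-page field names mentioned anywhere in an explanation tree."""
    found = set()
    match = FIELD_PATTERN.search(explanation.get("description", ""))
    if match:
        found.add(match.group(1))
    for child in explanation.get("details", []):
        found |= matched_fields(child)
    return found


# Hand-written example in the assumed format, not real server output.
sample = {
    "value": 0.5,
    "description": "max of:",
    "details": [
        {
            "value": 0.5,
            "description": "weight(text_content_3:lorem in 0) [PerFieldSimilarity], result of:",
            "details": [],
        },
        {
            "value": 0.2,
            "description": "weight(text_content_7:lorem in 0) [PerFieldSimilarity], result of:",
            "details": [],
        },
    ],
}

print(sorted(matched_fields(sample)))
```

The page number can then be read off the suffix of each collected field name, without running a highlighter at all.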
>
> Nik
> On Jan 18, 2015 6:21 PM, "Karol Sikora" <[email protected]> wrote:
>
>> Hi all,
>>
>> I have some specific requirements for highlighting. I need to search the
>> full content of an item for a phrase, and then show on which page the searched
>> phrase occurs. So I've created one field named text_content and fields
>> named text_content_{page_number} (text_content_1, text_content_2, etc.).
>> An example query is:
>> {
>>   "highlight": {
>>     "fields": {
>>       "text_content_*": {}
>>     }
>>   },
>>   "query": {
>>     "match": {
>>       "text_content": "lorem"
>>     }
>>   },
>>   "size": 40
>> }
>>
>> I've noticed that this query is fast, but only if I have a small number of
>> documents in the index. Querying for documents is always fast (<40ms), but
>> the highlight phase time grows as the number of documents in the index
>> grows.
>> I've started thinking that highlighting may be processed before applying
>> "size": 40, i.e. on all of the matched documents. Is that correct? How can I speed
>> up such a case?
>>
>> Regards,
>> Karol
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/b8354eb3-3a75-4999-a180-6493240eb0cc%40googlegroups.com
>> <https://groups.google.com/d/msgid/elasticsearch/b8354eb3-3a75-4999-a180-6493240eb0cc%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>