https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=18969

--- Comment #15 from Nick Clemens <n...@bywatersolutions.com> ---
(In reply to David Gustafsson from comment #14)
> Ok! I actually think that would be a bad idea, mainly for the following
> reasons:
> 
> 1) Elasticsearch uses a ranking function called Okapi BM25...If you put all
> values in one field, average field length and inverse document frequency
> will be averaged out based on all fields,

Ah, okay, I see this in the documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-all-field.html

Built-in or constructed, we pay a relevance price.
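
To make the relevance price concrete, here is a rough Python sketch of the
per-term BM25 score as I understand Lucene computes it (k1=1.2 and b=0.75 are
the usual defaults). All the numbers below are made up for illustration and
are not measured from a Koha index; the function name is hypothetical too.

```python
import math

def bm25_term_score(tf, doc_len, avg_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score one query term against one field of one document (BM25)."""
    # Inverse document frequency: rarer terms in this field weigh more.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalisation: long fields (relative to the average) score lower.
    norm = tf + k1 * (1 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1) / norm

# Hypothetical numbers: the term is rare in a short title field ...
title_score = bm25_term_score(tf=1, doc_len=5, avg_len=8,
                              n_docs=1000, doc_freq=20)
# ... but common once every field is poured into one long catch-all field.
all_score = bm25_term_score(tf=1, doc_len=300, avg_len=250,
                            n_docs=1000, doc_freq=400)

print(f"title field: {title_score:.2f}, catch-all field: {all_score:.2f}")
```

The same single match counts for much less in the catch-all field, because
both the document frequency and the field length are taken over everything.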


> 2) You will also not be able to use per field boosting, unless you add
> boosted fields to "fields" as well, but then you might as well skip the
> "_all_*" fields and pass along the full list of fields instead.

Well, it does seem to work to add only the boosted fields and boost those
above the all field, but again, the ranking is not as exact.
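
Roughly what I mean, as a Python sketch that only builds the request body.
The helper name, the "_all" field name and the boost values are assumptions
for illustration, not what Koha sends today; "fields" with "name^boost"
entries is the standard query_string syntax.

```python
def keyword_query(user_query, boosted_fields, catch_all="_all"):
    """Build a query_string body: the catch-all field plus boosted fields."""
    # "title^3" means matches in title count three times as much.
    fields = [catch_all] + [
        f"{name}^{boost}" for name, boost in boosted_fields.items()
    ]
    return {"query": {"query_string": {"query": user_query, "fields": fields}}}

# Hypothetical field names and boosts.
example = keyword_query("dune herbert", {"title": 3, "author": 2})
print(example)
```

Any field not listed by name still only contributes through the catch-all
field, which is why the result is less exact.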

> 3) The index will be about 3x as big, increasing memory usage. This might
> not be a huge issue, but it could be for us, for example, as we have several
> million biblios and already quite a large index.

Agreed, I think we would need to compare with and without the all field to see
the exact impact.


> 4) To utilize the full power of Elasticsearch one would want to be able to
> use different analyzers/normalizers and other useful mapping settings on a
> per field basis, and nice query string query options like
> "quote_field_suffix". With everything in one field, all data will be indexed
> using the same mapping settings, and features like quote_field_suffix will
> not work.

I don't think I actually follow you here. We still specify different analyzers
per field, but we also construct the _all field and use that for keyword
searching only; this is what we currently do. So we can search specific
fields, or use the _all field.



> I can actually see no benefits with using "all_*" fields, and no real
> downside by instead generating a proper "fields" containing all searchable
> fields. 
The only downside is listing all the fields individually, so there is a small
cost in query construction and query size, but nothing terrible, I would think.
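
For a feel of that cost, a small Python sketch that generates the full
"fields" list and measures the serialized query. The helper names, the field
names and the boosts are hypothetical, and the list of searchable fields is
assumed to come from wherever the search field mappings are stored.

```python
import json

def fields_with_boosts(searchable_fields, boosts=None):
    """Return every searchable field, adding ^boost where one is configured."""
    boosts = boosts or {}
    return [
        f"{name}^{boosts[name]}" if name in boosts else name
        for name in sorted(searchable_fields)
    ]

def full_field_query(user_query, searchable_fields, boosts=None):
    """Build a query_string body that lists each field instead of a catch-all."""
    return {
        "query": {
            "query_string": {
                "query": user_query,
                "fields": fields_with_boosts(searchable_fields, boosts),
            }
        }
    }

# Hypothetical field names; a real install would have many more.
searchable = ["title", "author", "subject", "isbn", "publisher"]
query = full_field_query("dune", searchable, boosts={"title": 3})
query_size = len(json.dumps(query))  # grows linearly with the field count
print(query_size, "bytes")
```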

> I began working on a patch today (one of the reasons was that we
> need per field boosting), and it's actually not a very complicated change.
> Might not be ready tomorrow, but at least some time in the beginning of next
> week.

Looking forward to it! :-) - have you seen bug 18316?
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=18316

_______________________________________________
Koha-bugs mailing list
Koha-bugs@lists.koha-community.org
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
