Re: [Koha-devel] Elasticsearch and facets

2018-02-21 Thread Tomas Cohen Arazi
What's the DB size, BTW? I have a couple of sites with 200k biblios and
everything seems to stay below 2 sec.

On Mon, 19 Feb 2018 at 9:01, Claes Eriksson wrote:

> First some ranting about Zebra and then a question about ElasticSearch.
>
> With Zebra there is a global system preference called maxRecordsForFacets.
> By default, the facet list is built from only the first 20 results shown.
> The manual also warns that increasing the number will impact response
> time. Building a facet list out of 20 records may be enough for some
> people, but if you think twice about it you realise that it is quite
> misleading, especially since the user does not know that the facet list
> is built on a very limited selection of records. With the sort set to
> Author and a search for relativity theory, the facet list may not show
> Einstein if your library is specialised in physics and has a fairly
> large collection.
>
> I have high hopes that ElasticSearch will handle facets in a better way.
> Ideally the whole index could be used for producing, e.g., the author
> facet, with minimal impact on system performance. I understand that there
> is discussion about making facets configurable (bug 18235), but I cannot
> see any trace of the improvements I would like to see. I may be wrong,
> since I am quite new to working with Koha.
>
> Regards
> Claes Eriksson, VTI, Sweden
>
> ___
> Koha-devel mailing list
> Koha-devel@lists.koha-community.org
> http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
> website : http://www.koha-community.org/
> git : http://git.koha-community.org/
> bugs : http://bugs.koha-community.org/

-- 
Tomás Cohen Arazi
Theke Solutions (https://theke.io )
✆ +54 9351 3513384
GPG: B2F3C15F

Re: [Koha-devel] Elasticsearch and facets

2018-02-20 Thread Jonathan Druart
Hello Claes,

About Zebra first: maxRecordsForFacets is only used if facets are not built
by Zebra (use_zebra_facets=0).
If facets are retrieved from Zebra, you may face performance issues depending
on the number of records in your catalogue (see bug 13665). You can then
adjust the "facetNumRecs" parameter in your Zebra config file
(default is 1000).
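
As a rough illustration of the rule above (not code from Koha itself), the
following Python sketch reads a koha-conf.xml-style document and reports
which setting bounds the facet sample. The XML layout around
use_zebra_facets, the function name, and the treatment of a missing element
as "disabled" are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Minimal stand-in for koha-conf.xml. Only the use_zebra_facets element name
# comes from the thread; the surrounding layout is an assumption.
SAMPLE_KOHA_CONF = """
<yazgfs>
  <config>
    <use_zebra_facets>1</use_zebra_facets>
  </config>
</yazgfs>
"""


def facet_limit_source(koha_conf_xml):
    """Return the name of the setting that limits the facet sample."""
    root = ET.fromstring(koha_conf_xml.strip())
    node = root.find(".//use_zebra_facets")
    # Assumption for this sketch: a missing element counts as disabled.
    enabled = node is not None and (node.text or "").strip() == "1"
    return "facetNumRecs" if enabled else "maxRecordsForFacets"


print(facet_limit_source(SAMPLE_KOHA_CONF))
```

With use_zebra_facets set to 1 the sketch points at facetNumRecs in the
Zebra config; otherwise it points at the maxRecordsForFacets system
preference.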

Regarding Elastic: Facets are not configurable yet, but will be with the
patches on bug 18235. It is a priority for me and that is why I proposed a
patch one year ago. You can easily test it and provide us feedback.

Regards,
Jonathan

On Tue, 20 Feb 2018 at 10:23 Claes Eriksson  wrote:

>
> 1. To begin with: “use_zebra_facets” was set to 1 in the koha-conf.xml
>    file, so that was not the problem.
> 2. Regarding Zebra’s native faceting, I found some traces in an old RFC
>    saying that it would be nice to replace "homegrown and inefficient
>    routines for faceting" with Zebra’s native faceting:
>    https://wiki.koha-community.org/wiki/C_%26_P_Search_Rewrite_RFC#.28e.29_Implement_search_engine_native_faceting
> 3. I have also checked some libraries around the world with Koha set
>    up by some of the larger consultants, and they all handle facets in this
>    way. I may add to my ranting that it becomes a serious problem when it
>    comes to subject facets. Narrowing down a search using subject facets
>    that only reflect the first 20 results does not help a student or
>    researcher. BTW, we tested setting the default to the first 50 results,
>    and response time increased by 5-8 seconds, which is not acceptable.
> 4. As I understand it, facets are hardcoded in Zebra and not at all
>    configurable. If I am wrong, I would be most grateful if you could point
>    me in the right direction. In ElasticSearch it is under discussion and
>    not solved, according to bug 18235?
>
>
>
> The best
>
> Claes
>
> 
> *Claes Eriksson*
> Information specialist
>
>
> Department of Communication, Marketing and Library
>
> *Swedish National Road and Transport Research Institute* (VTI)
> VTI, SE-581 95 Linkoeping, Sweden
> Tel: +46-13-20 40 00  Direct: +46-13-20 41 99
> E-mail: claes.eriks...@vti.se

Re: [Koha-devel] Elasticsearch and facets

2018-02-20 Thread Tomas Cohen Arazi
On Tue, 20 Feb 2018 at 10:23, Claes Eriksson wrote:
> Regarding Zebra’s native faceting, I found some traces in an old RFC saying
> that it would be nice to replace "homegrown and inefficient routines for
> faceting" with Zebra’s native faceting.

Zebra facets usage was implemented here:
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=11232

> I have also checked some libraries around the world with Koha set up by
> some of the larger consultants, and they all handle facets in this way. I
> may add to my ranting that it becomes a serious problem when it comes to
> subject facets. Narrowing down a search using subject facets that only
> reflect the first 20 results does not help a student or researcher. BTW,
> we tested setting the default to the first 50 results, and response time
> increased by 5-8 seconds, which is not acceptable.

If you are using Zebra's native facets implementation (use_zebra_facets) on
a recent Koha version (i.e. using DOM indexing), then your system is not
using maxRecordsForFacets, but this configuration:

https://gitlab.com/koha-community/Koha/blob/master/etc/zebradb/zebra-biblios-dom.cfg#L31

As you can see, it pulls the facets from the first 1000 records in the
result set. You could definitely try lowering it to something
reasonable (by your standards).
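
To see which value a given installation uses, a small script along these
lines could pull facetNumRecs out of the Zebra config text. This is only a
sketch: it assumes the simple "name: value" line format with "#" comments,
the helper name is made up, and the sample text is invented rather than
copied from zebra-biblios-dom.cfg.

```python
def read_zebra_setting(cfg_text, name):
    """Return the value of a 'name: value' line, or None if it is absent."""
    for raw_line in cfg_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.strip() == name:
            return value.strip()
    return None


# Illustrative excerpt only; 1000 is the value mentioned in this thread.
SAMPLE_CFG = """\
# facet sampling
facetNumRecs: 1000
"""

print(read_zebra_setting(SAMPLE_CFG, "facetNumRecs"))
```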

> As I understand it, facets are hardcoded in Zebra and not at all
> configurable. If I am wrong, I would be most grateful if you could point me
> in the right direction. In ElasticSearch it is under discussion and not
> solved, according to bug 18235?

Sorry for the mistake, I had my eyes on that bug a while ago and was sure
there was lots of interest in the feature, so assumed it was already in
master!

Good luck.


-- 
Tomás Cohen Arazi
Theke Solutions (https://theke.io )
✆ +54 9351 3513384
GPG: B2F3C15F

Re: [Koha-devel] Elasticsearch and facets

2018-02-20 Thread Claes Eriksson
  1.  To begin with: “use_zebra_facets” was set to 1 in the koha-conf.xml file,
so that was not the problem.
  2.  Regarding Zebra’s native faceting, I found some traces in an old RFC
saying that it would be nice to replace "homegrown and inefficient routines for
faceting" with Zebra’s native faceting:
https://wiki.koha-community.org/wiki/C_%26_P_Search_Rewrite_RFC#.28e.29_Implement_search_engine_native_faceting
  3.  I have also checked some libraries around the world with Koha set up by
some of the larger consultants, and they all handle facets in this way. I may
add to my ranting that it becomes a serious problem when it comes to subject
facets. Narrowing down a search using subject facets that only reflect the
first 20 results does not help a student or researcher. BTW, we tested setting
the default to the first 50 results, and response time increased by 5-8
seconds, which is not acceptable.
  4.  As I understand it, facets are hardcoded in Zebra and not at all
configurable. If I am wrong, I would be most grateful if you could point me in
the right direction. In ElasticSearch it is under discussion and not solved,
according to bug 18235?


The best
Claes

Claes Eriksson
Information specialist

Department of Communication, Marketing and Library
Swedish National Road and Transport Research Institute (VTI)
VTI, SE-581 95 Linkoeping, Sweden
Tel: +46-13-20 40 00  Direct: +46-13-20 41 99
E-mail: claes.eriks...@vti.se


Re: [Koha-devel] Elasticsearch and facets

2018-02-19 Thread Tomas Cohen Arazi
Facets are configurable, and goal is feature parity.

On Mon, 19 Feb 2018 at 8:59 p.m., David Cook wrote:

> Hi Claes,
>
>
>
> I think the maxRecordsForFacets system preference might be a bit legacy at
> this point. It was first used when we were building facets manually from
> retrieved records from Zebra (which was an awful awful no good process).
> However, I think most installations using Zebra these days use Zebra’s
> native faceting which uses the entire result set. At least that’s how
> things were a year or two ago. It may have changed since then. I’d have to
> double-check, and I have a lot on at the moment, so I’ll leave that for
> yourself or someone else to check. In any case, we’re all aware of the
> limitations around that system preference, which is why we made the move to
> Zebra’s native faceting. (One easy thing to check is whether
> “use_zebra_facets” is set to 1 in your koha-conf.xml file. If so, then I
> think “maxRecordsForFacets” shouldn’t have any effect.)
>
>
>
> I’m not very familiar with the ElasticSearch efforts, but I think it
> started out as a like-for-like drop-in replacement. Making facets
> configurable would be great, but I’m not sure anyone has this as a priority
> at the moment. I think for now the main goal of ElasticSearch may be
> stability?

-- 
Tomás Cohen Arazi
Theke Solutions (https://theke.io)
✆ +54 9351 3513384
GPG: B2F3C15F

Re: [Koha-devel] Elasticsearch and facets

2018-02-19 Thread David Cook
Hi Claes,

 

I think the maxRecordsForFacets system preference might be a bit legacy at
this point. It was first used when we were building facets manually from
retrieved records from Zebra (which was an awful awful no good process).
However, I think most installations using Zebra these days use Zebra's
native faceting which uses the entire result set. At least that's how things
were a year or two ago. It may have changed since then. I'd have to
double-check, and I have a lot on at the moment, so I'll leave that for
yourself or someone else to check. In any case, we're all aware of the
limitations around that system preference, which is why we made the move to
Zebra's native faceting. (One easy thing to check is whether
"use_zebra_facets" is set to 1 in your koha-conf.xml file. If so, then I
think "maxRecordsForFacets" shouldn't have any effect.)

 

I'm not very familiar with the ElasticSearch efforts, but I think it started
out as a like-for-like drop-in replacement. Making facets configurable would
be great, but I'm not sure anyone has this as a priority at the moment. I
think for now the main goal of ElasticSearch may be stability?

 

David Cook
Systems Librarian
Prosentient Systems
72/330 Wattle St
Ultimo, NSW 2007
Australia

Office: 02 9212 0899
Direct: 02 8005 0595

 


[Koha-devel] Elasticsearch and facets

2018-02-19 Thread Claes Eriksson
First some ranting about Zebra and then a question about ElasticSearch.

With Zebra there is a global system preference called maxRecordsForFacets. By
default, the facet list is built from only the first 20 results shown. The
manual also warns that increasing the number will impact response time.
Building a facet list out of 20 records may be enough for some people, but if
you think twice about it you realise that it is quite misleading, especially
since the user does not know that the facet list is built on a very limited
selection of records. With the sort set to Author and a search for relativity
theory, the facet list may not show Einstein if your library is specialised in
physics and has a fairly large collection.
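
The effect described above can be reproduced with a toy example. The
records, the author names, and the author_facets helper below are all
invented for illustration; the only thing taken from Koha is the idea of
counting facet values over the first N records of a sorted result set.

```python
from collections import Counter

# Hypothetical result set for "relativity theory", already sorted by author.
# Each tuple is (author, title); the data is made up for this sketch.
results = (
    [("Adams, A.", f"Intro to relativity {i}") for i in range(15)]
    + [("Born, M.", f"Relativity notes {i}") for i in range(10)]
    + [("Einstein, A.", f"Relativity paper {i}") for i in range(40)]
)


def author_facets(records, max_records=None):
    """Count authors over the first max_records records (all if None)."""
    sample = records if max_records is None else records[:max_records]
    return Counter(author for author, _title in sample)


limited = author_facets(results, max_records=20)  # like a 20-record limit
full = author_facets(results)                     # whole result set

print("first 20 records:", limited.most_common())
print("all records:     ", full.most_common())
```

In this made-up data Einstein wrote most of the matching records, yet he is
missing from the facet list built on the first 20, simply because the sort
order puts other authors first.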

I have high hopes that ElasticSearch will handle facets in a better way.
Ideally the whole index could be used for producing, e.g., the author facet,
with minimal impact on system performance. I understand that there is
discussion about making facets configurable (bug 18235), but I cannot see any
trace of the improvements I would like to see. I may be wrong, since I am
quite new to working with Koha.
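
For comparison, Elasticsearch computes aggregations over all documents
matching the query, not only over the page of hits returned. A minimal
sketch of such a request body, built as a plain Python dict, might look like
the following. The field names "title" and "author__facet" and the helper
name are assumptions for the example, and this is not necessarily how Koha
builds its queries.

```python
import json


def build_facet_query(search_text, facet_field="author__facet", max_buckets=10):
    """Return an Elasticsearch search body with a terms aggregation."""
    return {
        "size": 0,  # return no hits, only the aggregation buckets
        "query": {"match": {"title": search_text}},
        "aggs": {
            "author": {
                # One bucket per distinct value of the facet field.
                "terms": {"field": facet_field, "size": max_buckets}
            }
        },
    }


body = build_facet_query("relativity theory")
print(json.dumps(body, indent=2))
```

The "size" inside the terms aggregation caps how many facet values come
back, which is a different knob from the number of records sampled.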

Regards
Claes Eriksson, VTI, Sweden
