Re: [OPEN-ILS-DEV] request help optimizing very slow select query for ebooks

Blake Henderson Wed, 10 Jul 2019 12:38:57 -0700

Melissa,

I've found this magic sauce to do the trick and QUICK


select date_part('month',bre.create_date),count(*)
from
biblio.record_entry bre
where
not bre.deleted and
create_date between '2019-01-01' and '2019-07-01' and
lower(marc) ~ $$<datafield tag="856" ind1="4" ind2="0">$$ AND

(
        marc ~ $$tag="008">.......................[oqs]$$
        or
        marc ~ $$tag="006">......[oqs]$$
)
and
(
        marc ~ $$<leader>......[at]$$
)
and
(
        marc ~ $$<leader>.......[acdm]$$
)

group by 1

Tweak as needed

-Blake-
Conducting Magic
MOBIUS

On 7/10/2019 1:32 PM, Rogan Hamby wrote:

Hi Melissa,

Quick question, are you looking for all ebooks or a certain subsetindicate by the 'full text via%' matching? It's not clear to me ifthat's how you're identifying they are ebooks at all versus a subsetof them. If you're looking for all of them you can use the fact thatthe fixed field data will map to entries in theconfig.coded_values_map table to identify them. So, for example ifthe ebook search format is id 614 in that table then vlist entriesin metabib.record_attr_vector_list with a 614 as one of the valueswill be an ebook. That will certainly search faster than multiplejoins against metabib.real_full_rec.

On Wed, Jul 10, 2019 at 2:23 PM Melissa Belvadi <[email protected]<mailto:[email protected]>> wrote:


    We're on 3.0.3, with under 2 million total biblio recs, under 1
    million of those undeleted.
    The following query is taking over 10 hours to run at the command
    line or through pgadmin or dbeaver. For a system our size, that
    seems far too slow.
    We've checked other aspects of our server, e.g. other processes
    running on it and index integrity, and can't account for the long
    execution time.

    I'd appreciate any advice on how to either optimize this search or
    restructure it as we'll need to run this regularly.  As you can
    guess, there's another query that comes after this that uses the
    temp table, but that responds really fast once the temp table is
    finally made.
    The point of this query is to find all of the ebooks added to the
    system over a given time period, and pull together a bunch of data
    about those books.
    If there's some entirely different way to go about this, eg using
    SRU or some other tool, I'm totally open to that too.

    create temp table temp_newebooks (create_date, call_number,
    author, publisher, pubdate, elocation, title, id, isbn,
    contentnote, crecord) as

    WITH myconstants as (select to_date('20190101','YYYYMMDD') as
    startdate, to_date('20190701','YYYYMMDD') as enddate,
    cast('20160101' as varchar) as pubstart)

    select distinct bre.create_date, mfr.value, r.author, r.publisher,
    r.pubdate, mfr2.value, tfe.value, r.id <http://r.id>, r.isbn,
    mfr3.value, mfr3.record

    from biblio.record_entry bre

    join metabib.real_full_rec mfr on (mfr.record=bre.id <http://bre.id>)

    left join metabib.real_full_rec mfr3 on (mfr3.record=bre.id
    <http://bre.id>) and mfr3.tag = '520' and mfr3.subfield = 'a'

    join metabib.real_full_rec mfr2 on (mfr2.record=bre.id
    <http://bre.id>)

    join reporter.materialized_simple_record r on (r.id <http://r.id>
    = bre.id <http://bre.id>)

    join metabib.title_field_entry tfe on (tfe.source=bre.id
    <http://bre.id>)

    where

    not bre.deleted

    and substring(r.pubdate from 1 for 4) >= (select pubstart from
    myconstants)

    and tfe.field = 5

    and (mfr.tag like '050' or mfr.tag like '090') and mfr.subfield
    like 'a'

    and mfr2.tag like '856' and mfr2.subfield like 'y'

    and mfr2.value  ilike 'full text via%'

    and bre.create_date > (select startdate from myconstants) and
    bre.create_date < (select enddate from myconstants);



    Thank you!

    Melissa Belvadi
    Collections Librarian
    University of Prince Edward Island
    [email protected] <mailto:[email protected]> 902-566-0581
    my public calendar
    
<http://www.google.com/calendar/embed?src=mbelvadi%40upei.ca&ctz=America/Halifax&mode=week>
    Make an appointment <https://mbelvadi.youcanbook.me/> via YouCanBookMe

Re: [OPEN-ILS-DEV] request help optimizing very slow select query for ebooks

Reply via email to