Melissa,
I've found this magic sauce to do the trick and QUICK
select date_part('month',bre.create_date),count(*)
from
biblio.record_entry bre
where
not bre.deleted and
create_date between '2019-01-01' and '2019-07-01' and
lower(marc) ~ $$<datafield tag="856" ind1="4" ind2="0">$$ AND
(
marc ~ $$tag="008">.......................[oqs]$$
or
marc ~ $$tag="006">......[oqs]$$
)
and
(
marc ~ $$<leader>......[at]$$
)
and
(
marc ~ $$<leader>.......[acdm]$$
)
group by 1
Tweak as needed
-Blake-
Conducting Magic
MOBIUS
On 7/10/2019 1:32 PM, Rogan Hamby wrote:
Hi Melissa,
Quick question, are you looking for all ebooks or a certain subset
indicate by the 'full text via%' matching? It's not clear to me if
that's how you're identifying they are ebooks at all versus a subset
of them. If you're looking for all of them you can use the fact that
the fixed field data will map to entries in the
config.coded_values_map table to identify them. So, for example if
the ebook search format is id 614 in that table then vlist entries
in metabib.record_attr_vector_list with a 614 as one of the values
will be an ebook. That will certainly search faster than multiple
joins against metabib.real_full_rec.
On Wed, Jul 10, 2019 at 2:23 PM Melissa Belvadi <[email protected]
<mailto:[email protected]>> wrote:
We're on 3.0.3, with under 2 million total biblio recs, under 1
million of those undeleted.
The following query is taking over 10 hours to run at the command
line or through pgadmin or dbeaver. For a system our size, that
seems far too slow.
We've checked other aspects of our server, e.g. other processes
running on it and index integrity, and can't account for the long
execution time.
I'd appreciate any advice on how to either optimize this search or
restructure it as we'll need to run this regularly. As you can
guess, there's another query that comes after this that uses the
temp table, but that responds really fast once the temp table is
finally made.
The point of this query is to find all of the ebooks added to the
system over a given time period, and pull together a bunch of data
about those books.
If there's some entirely different way to go about this, eg using
SRU or some other tool, I'm totally open to that too.
create temp table temp_newebooks (create_date, call_number,
author, publisher, pubdate, elocation, title, id, isbn,
contentnote, crecord) as
WITH myconstants as (select to_date('20190101','YYYYMMDD') as
startdate, to_date('20190701','YYYYMMDD') as enddate,
cast('20160101' as varchar) as pubstart)
select distinct bre.create_date, mfr.value, r.author, r.publisher,
r.pubdate, mfr2.value, tfe.value, r.id <http://r.id>, r.isbn,
mfr3.value, mfr3.record
from biblio.record_entry bre
join metabib.real_full_rec mfr on (mfr.record=bre.id <http://bre.id>)
left join metabib.real_full_rec mfr3 on (mfr3.record=bre.id
<http://bre.id>) and mfr3.tag = '520' and mfr3.subfield = 'a'
join metabib.real_full_rec mfr2 on (mfr2.record=bre.id
<http://bre.id>)
join reporter.materialized_simple_record r on (r.id <http://r.id>
= bre.id <http://bre.id>)
join metabib.title_field_entry tfe on (tfe.source=bre.id
<http://bre.id>)
where
not bre.deleted
and substring(r.pubdate from 1 for 4) >= (select pubstart from
myconstants)
and tfe.field = 5
and (mfr.tag like '050' or mfr.tag like '090') and mfr.subfield
like 'a'
and mfr2.tag like '856' and mfr2.subfield like 'y'
and mfr2.value ilike 'full text via%'
and bre.create_date > (select startdate from myconstants) and
bre.create_date < (select enddate from myconstants);
Thank you!
Melissa Belvadi
Collections Librarian
University of Prince Edward Island
[email protected] <mailto:[email protected]> 902-566-0581
my public calendar
<http://www.google.com/calendar/embed?src=mbelvadi%40upei.ca&ctz=America/Halifax&mode=week>
Make an appointment <https://mbelvadi.youcanbook.me/> via YouCanBookMe