To explain my question, first some domain background. We have a search engine
where users can search for materials they can borrow at their local library.
Our top-level documents are *works*. An example of a work could be "Harry
Potter and the Philosopher's Stone". Examples of information stored at this
level could be the title, the author of the work, and a genre.
At the second level, we have *manifestations* (we call these "pids"). A work
might exist as a physical book, an ebook, an audiobook on CDs, and an online
audiobook, and there might be several editions of a book. Information stored
at this level includes material type, year of publication, and contributors
(for example narrators, or artists who illustrated a particular edition).
At the third level, we have *instances*. This includes information about the
physical copies: which library they are located in, which department, and
even the location within the department, as well as whether they are currently
on loan or on the shelf.
Each document has a `doc_type` (which is either work, pid, or instance). Works
have a list of pids, and pids have a list of instances associated with them.
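To make the structure concrete, here is a minimal sketch in Python of what one nested work might look like. The ids and values are made up for illustration, and `instances` is an assumed name for the list of third-level documents; the real documents carry more fields than shown.

```python
# Hypothetical example document; ids and values are made up for illustration.
work = {
    "id": "work-1",
    "doc_type": "work",
    "work.title": "Harry Potter and the Philosopher's Stone",
    "pids": [
        {
            "id": "pid-1",
            "doc_type": "pid",
            "pid.material_type": "book",
            # "instances" is an assumed field name for the third level.
            "instances": [
                {
                    "id": "instance-1",
                    "doc_type": "instance",
                    "instance.agency": 900004,
                    "instance.status": "onShelf",
                },
            ],
        },
        {
            "id": "pid-2",
            "doc_type": "pid",
            "pid.material_type": "ebook",
            "instances": [],
        },
    ],
}


def doc_types_by_level(doc):
    """Walk work -> pids -> instances and collect each doc_type in order."""
    types = [doc["doc_type"]]
    for pid in doc.get("pids", []):
        types.append(pid["doc_type"])
        for instance in pid.get("instances", []):
            types.append(instance["doc_type"])
    return types
```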
Our job is to formulate Solr queries on behalf of users that belong to their
local library, so that they can search for materials that are available to them.
Given a query, we want to return works, along with the manifestations that
match the query. A query can specify restrictions at all three levels; you
might be interested in the (physical) book from last year written by Jussi
Adler-Olsen, and it should be available at the local branch of the community
library.
The way we find the appropriate works is pretty much in place. We use the
`/query` endpoint of Solr, and we formulate a JSON object where
* the `query` field contains the restrictions at the work level, something like
`work.creator:'Jussi Adler-Olsen'`.
* To restrict to works whose manifestations/pids satisfy the restrictions at
that level, we use a "parent which" construction in the `filter` part of the
Solr query. Something like `{!parent
which='doc_type:work'}(pid.material_type:book AND pid.year:(2021))`.
* To restrict to works where we can find a physical copy at the local library,
we add another element to the `filter`. Something like `{!parent
which='doc_type:work'}(instance.agency:900004 AND
instance.status:\"onShelf\")`, where 900004 is the id of the local library.
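Put together, the request body described in the three bullets above could be assembled roughly as follows. This is only a sketch: `build_request` and its parameters are hypothetical names, and the query strings simply mirror the examples in the bullets rather than our exact production queries.

```python
import json

# Block-join prefix that maps matching child documents back to their work.
PARENT_OF_WORK = "{!parent which='doc_type:work'}"


def build_request(creator, material_type, year, agency):
    """Build a JSON request body for the /query endpoint (hypothetical helper)."""
    # Work-level restriction goes in the main query.
    work_query = f"work.creator:'{creator}'"
    # Pid-level restriction, lifted to the work level.
    pid_filter = (
        PARENT_OF_WORK
        + f"(pid.material_type:{material_type} AND pid.year:({year}))"
    )
    # Instance-level restriction, also lifted to the work level.
    instance_filter = (
        PARENT_OF_WORK
        + f'(instance.agency:{agency} AND instance.status:"onShelf")'
    )
    return {"query": work_query, "filter": [pid_filter, instance_filter]}


body = build_request("Jussi Adler-Olsen", "book", 2021, 900004)
print(json.dumps(body, indent=2))
```

Both filters resolve to works, which is why they narrow the set of works but say nothing about which pids are returned under each work.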
That seems to work well. We get the works we are interested in. The question I
have is, how do I restrict the manifestations we return? We use the field list
and a `childFilter` to restrict manifestations, something like this: `"fields":
"work.workid work.title work.creator, pids, id, pid.year, pid.material_type
[child childFilter='pid.material_type:bog' limit=-1]"`. That part of the
filtering also seems to work, but we get all the matching manifestations, from
all libraries. We want to restrict the result to those manifestations of which
the local library has a copy.
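To pin down the result we are after, here is the same restriction written as client-side filtering in Python. It is only meant to describe the desired semantics, not how we want to solve it; the sample data is made up, and `instances` is an assumed name for the list of third-level documents under a pid.

```python
def pids_held_locally(work, agency):
    """Keep only the pids that have at least one instance at the given agency."""
    return [
        pid
        for pid in work.get("pids", [])
        if any(
            instance.get("instance.agency") == agency
            for instance in pid.get("instances", [])  # assumed field name
        )
    ]


# Hypothetical work: one pid held by library 900004, one held elsewhere.
sample_work = {
    "id": "work-1",
    "doc_type": "work",
    "pids": [
        {"id": "pid-1", "instances": [{"instance.agency": 900004}]},
        {"id": "pid-2", "instances": [{"instance.agency": 900005}]},
    ],
}

local_pids = pids_held_locally(sample_work, 900004)
print([pid["id"] for pid in local_pids])
```

With the current field list we get back both pids for such a work; the goal is to get back only the first one, ideally from Solr directly.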
In other words, I guess we need to formulate a restriction in the `[child
childFilter=...]` part of the field list that restricts the second-level
documents based on information stored at the third level. I am not sure how to
do that. Can anyone help?
Thanks a lot in advance, and best regards.
/Noah
--
Noah Torp-Smith