I've noticed this as well: the `pf` parsing is naive with respect to more
complex query syntax. I'm curious what others have to say about this; if
nobody else weighs in, it might be a question for the dev@solr list.

Regardless of the above, I'd advise against the kind of implicit "mixed
query parsing" that you have currently in your `q` param. Assuming that the
`!parent` qparser does not affect scoring, I wonder if you'd be better off
placing it on its own in an `fq` -- i.e.: `fq={!parent [...]}`. It's also
worth noting that SOLR-11501 [1] should change the parsing of this kind of
nested query syntax as of version 7.2 (subject to `luceneMatchVersion`) --
so you'd do well to switch approaches regardless.
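
To make the `fq` suggestion concrete, here's a rough, untested-against-Solr
sketch in Python that builds a JSON request body like yours, but with the
block join moved out of `query` and into `filter`. The names `build_request`
and `PARENT_FILTER` are made up for illustration; the fields and params are
copied from your request.

```python
import json

# The block join from the original `query`, unchanged, now used as a filter.
PARENT_FILTER = "{!parent which='doc_type:work' v='pid.material_type:(\"bog\")'}"


def build_request(term: str) -> dict:
    """Build a JSON Request API body where edismax only sees the user's terms."""
    return {
        # Only the user's terms go to edismax, so `pf` has nothing else to pick up.
        "query": f"({term})",
        # Each entry is parsed as its own filter query and does not affect scoring.
        "filter": ["doc_type:work", PARENT_FILTER],
        "fields": "work.workid work.title, [child childFilter='pid.material_type:(\"bog\")']",
        "offset": 0,
        "limit": 1,
        "params": {
            "defType": "edismax",
            "qf": ["work.creator", "work.title", "pid.material_type"],
            "pf": "work.creator",
            "sort": "score desc",
            "debug": "query",
        },
    }


if __name__ == "__main__":
    # Write this to a file and POST it to /query with curl, as before.
    print(json.dumps(build_request("scherfig"), indent=4))
```

With `debug=query` still enabled, it should be easy to check whether the
phrase query on `work.creator` now contains only `scherfig`.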

If you still want to bundle these as a single query, I'd recommend
explicitly combining in a boolean query, e.g.:

defType=lucene&q={!bool should='{!edismax v=$qq}' filter=$myParentFilter}&qq=(scherfig)&myParentFilter={!parent which='doc_type:work' v='pid.material_type:(\"bog\")'}
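
For what it's worth, here's a small Python sketch of how those request
params could be assembled and URL-encoded. It assumes the boolean query
parser is registered under the name `bool`, and that `qf` and `pf` are
passed as ordinary request params so the nested edismax parser can pick
them up. The helper `build_params` and its `occur` argument are hypothetical
names of mine, not anything Solr defines.

```python
from urllib.parse import parse_qs, urlencode


def build_params(term: str, occur: str = "should") -> dict:
    """Build request params that combine edismax and the block join explicitly."""
    if occur not in ("must", "should"):
        raise ValueError("occur must be 'must' or 'should'")
    return {
        "defType": "lucene",
        # The scoring clause is delegated to edismax; the block join is a filter clause.
        "q": "{!bool %s='{!edismax v=$qq}' filter=$myParentFilter}" % occur,
        # The user's terms live in their own param, away from any local-params syntax.
        "qq": f"({term})",
        "myParentFilter": "{!parent which='doc_type:work' v='pid.material_type:(\"bog\")'}",
        "qf": "work.creator work.title pid.material_type",
        "pf": "work.creator",
    }


if __name__ == "__main__":
    # urlencode takes care of escaping `{`, `!`, `$`, quotes and spaces.
    print(urlencode(build_params("scherfig")))
```

One thing to double-check: as far as I know, in a Lucene boolean query that
has a `filter` clause, a lone `should` clause only contributes to the score
and is not required for a match. If the user's terms must match, calling
`build_params("scherfig", occur="must")` is probably closer to what you want.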

Either of these more explicit approaches should also let the `pf` param
construct its boosting phrase queries properly, since edismax then only
sees the user's terms (`scherfig`) and not the `{!parent ...}` syntax.

[1] https://issues.apache.org/jira/browse/SOLR-11501

On Mon, May 2, 2022 at 3:51 AM Noah Torp-Smith <n...@dbc.dk.invalid> wrote:

> Hello,
>
> This is the first time I reach out in this forum, so I apologize in
> advance if this is a known issue or if I have not spent enough time reading
> carefully through previous posts.
>
> I am working with a solr containing library data. We have a nested
> structure where a "work" can have child documents that represent different
> "pids" (or manifestations) for that work. The canonical example is the work
> "Harry Potter and the Philosopher's Stone" that can have different
> manifestations/pids representing for example an audiobook version, an
> e-book version, and of course, the physical book. Some information, like
> the title and the author is stored at the work level, and other
> information, like the materialType (book/audiobook/ebook) or the year is
> stored at the manifestation/pid level in child docs. I hope that makes
> sense. It is of course simplified, but it should convey what we are trying
> to do.
>
> I can provide the full schema of our solr if necessary, but there's a lot
> of info in there that I am not sure would convey much information. If need
> be, I will be happy to provide it, though. But I thought I'd try and
> describe a simplified version of the issue I am struggling with. There's a
> Danish author called Hans Scherfig and I want to search for physical books
> by him. I issue this query to our solr. As you can see, I have enabled
> debugging at the `query` level.
>
> ```json
> {
>     "query": "(scherfig)+{!parent which='doc_type:work' v='pid.material_type:(\"bog\")'}",
>     "filter": [
>         "doc_type:work"
>     ],
>     "fields": "work.workid work.title, [child childFilter='pid.material_type:(\"bog\")']",
>     "offset": 0,
>     "limit": 1,
>     "params": {
>         "defType": "edismax",
>         "qf": [
>             "work.creator",
>             "work.title",
>             "pid.material_type"
>         ],
>         "pf": "work.creator",
>         "sort": "score desc",
>         "debug": "query"
>     }
> }
> ```
>
> We send this to the /query endpoint of solr, like this (the core is called
> simple-search):
>
> ```
> curl -H "Content-Type: application/json" "http://search-solr/solr/simple-search/query" -d @scherfig-filter-test.json
> ```
>
> I am using the `parent which` construction, documented here, for example:
> https://solr.apache.org/guide/8_2/other-parsers.html (we are on solr
> 8.10.1). Looking at the debug output, I see this:
>
> ```
> (work.creator:\"scherfig parent which doc_type:work v pid.material_type\")
> ```
>
> which worries me slightly. It looks like "parent which" is part of what
> solr is looking for in the work.creator field?
>
> The "interesting" bit is that, if I remove the line with
> `"pf":"work.creator"`, then that part of the debug output is no longer
> there. Is there an issue with `pf` here? Or am I formatting my query
> wrongly?
>
> Thanks in advance for any insight you can provide.
>
> Best regards,
>
> /Noah
>
>
>
> --
>
> Noah Torp-Smith (n...@dbc.dk)
>
