SOLR and string comparison functions

2017-09-18 Thread Dariusz Wojtas
Hi,
I am working on an application that searches for entries that may be
queried by multiple parameters.
These parameters may be sent to SOLR in different sets, each parameter with
its own weight.

Values for the example below might be as follows:
firstName=John&
firstName.weight=0.2&
id=Aw34563456WWA&
id.weight=0.5&
fullName=John Adreew Jr. Doe and Partners&
fullName.weight=0.3


There is one very important requirement.
No matter how many parameters there are, the total result score cannot
exceed 1 (100%).
In every case I multiply the parameter weight by the result of the string comparison.
A field is used in the comparison only if its weight is greater than 0 (in fact
greater than 0.0001).

  {!func v=$global_search_function}
  sum(
    product($firstName.weight, strdist(literal($firstName), firstName, edit)),
    map($id.weight, 0.0001, 1000,
        product($id.weight, strdist(literal($id), id, edit)), 0),
    map($fullName.weight, 0.0001, 1000,
        product($fullName.weight, strdist(literal($fullName), fullName, ngram, 10)), 0)
  )
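
The arithmetic behind the "never above 1" requirement can be checked outside SOLR. The sketch below is plain Python, not SOLR code: it assumes every per-field similarity lies between 0 and 1 and that the weights sum to at most 1, and for simplicity it applies the map() style guard to every field (the query above leaves firstName unguarded). The function name and the sample similarity values are made up for illustration.

```python
def combined_score(weights, similarities, min_weight=0.0001, max_weight=1000):
    """Weighted sum of per-field similarities.

    A field contributes only when its weight lies in [min_weight, max_weight],
    which mirrors map(weight, 0.0001, 1000, product(...), 0) in the query.
    """
    total = 0.0
    for field, weight in weights.items():
        if min_weight <= weight <= max_weight:
            total += weight * similarities.get(field, 0.0)
    return total


# Weights taken from the example request; similarity values are hypothetical.
weights = {"firstName": 0.2, "id": 0.5, "fullName": 0.3}
perfect = {"firstName": 1.0, "id": 1.0, "fullName": 1.0}
partial = {"firstName": 1.0, "id": 0.5, "fullName": 0.0}

perfect_score = combined_score(weights, perfect)
partial_score = combined_score(weights, partial)
print(perfect_score, partial_score)
```

With weights that sum to 1, a perfect match on every field gives the maximum possible score, and any weaker match gives less.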

The question is about comparing fullName above.
Which function should I use so that the comparison on the fullName field works the
same way as:
   "John Adreew Jr. Doe and Partners"~10^0.3
?

What functions compare strings, other than strdist?
How do I create a function similar to the "John Andrew ..." example above?


Best regards,
Dariusz Wojtas


Rescoring from 0 - full

2017-09-20 Thread Dariusz Wojtas
Hi,
When I use boosting functionality, it is always about adding to or
multiplying the score calculated in the 'q' param.
I may use function queries inside 'q', but calling multiple nested
functions may hurt performance.
I thought that 'rerank' could help, but it is still about changing the
original score, not a full recalculation.

How can I take full control of the score in rerank? Is it possible?

Best regards,
Dariusz Wojtas


Re: SOLR and string comparison functions

2017-09-18 Thread Dariusz Wojtas
Hi Emir,

I am calculating a "normalized" score, as it will later be used by
automatic decisioning processes to decide whether the result found "matches
enough". For example I might create a rule that checks if the result score is
higher than 97% (matches); otherwise it is just noise.
I've been thinking about the reranking query parser, but was not able to
create a real-life working example, not even something that would show the
concept on just 2 fields and then rerank the result.
I'd be happy to see such an example.

I have found the answer to my original question; it seems to work:
   {!func v=$global_search_function}
   sum(
     product($firstName.weight, strdist(literal($firstName), firstName, edit)),
     map($id.weight, 0.0001, 1000,
         product($id.weight, strdist(literal($id), id, edit)), 0),
     map($fullName.weight, 0.0001, 1000,
         product($fullName.weight, query($fullName_filter)), 0)
   )
   {!edismax qf=fullName pf=fullName ps=10 v=$fullName}

Please see the fullName_filter definition (the edismax line) and its usage in the
query() call above.
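
To show how these pieces might fit together in a single request, here is a small Python sketch that only assembles and URL-encodes the parameters; it sends nothing to SOLR. It assumes the func line is the value of q, the sum(...) expression is the value of global_search_function, and the edismax line is the value of fullName_filter. The post does not say whether these live in solrconfig.xml or in the request, so treat that layout as a guess. Note also that query() returns the raw score of the sub-query, which SOLR does not normalise, so the fullName term is not guaranteed to stay within 0..1.

```python
from urllib.parse import urlencode, parse_qs

# The scoring expression from the post, written on one logical line.
score_function = (
    "sum("
    "product($firstName.weight,strdist(literal($firstName),firstName,edit)),"
    "map($id.weight,0.0001,1000,"
    "product($id.weight,strdist(literal($id),id,edit)),0),"
    "map($fullName.weight,0.0001,1000,"
    "product($fullName.weight,query($fullName_filter)),0)"
    ")"
)

# Assumed parameter layout; the values are the sample values from the first post.
params = {
    "q": "{!func v=$global_search_function}",
    "global_search_function": score_function,
    "fullName_filter": "{!edismax qf=fullName pf=fullName ps=10 v=$fullName}",
    "firstName": "John",
    "firstName.weight": "0.2",
    "id": "Aw34563456WWA",
    "id.weight": "0.5",
    "fullName": "John Adreew Jr. Doe and Partners",
    "fullName.weight": "0.3",
}

query_string = urlencode(params)
print(query_string)
```

Building the string in code makes it easier to spot unbalanced parentheses or a stray trailing comma before the request reaches the function parser.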

But now I am really worried about the performance, as there may be several
more filter fields that may affect the score.

Best regards,
Dariusz



On Tue, Sep 19, 2017 at 12:33 AM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Hi Darius,
> This seems to me like misuse/misunderstanding of Solr. As you probably
> noticed, Solr score is not normalised - you cannot compare scores of two
> queries and tell if one result match better query than the other. There are
> some techniques to achieve something close, but that is not that straight
> forward and might depend on your case.
> In your case, you are trying to use function to query score, and depending
> on your index size, it might not perform well. You should probably be
> better with custom scorer.
> Back to your question: What do you try to achieve? When do you consider
> two names to match? Or you expect to calculate score for each document in
> the index and return top scored ones? Such solution will not scale.
> IMO, it would be best if you rethink your requirement about score (or
> use the reranking query parser:
> https://cwiki.apache.org/confluence/display/solr/Query+Re-Ranking)
> and set up proper field analysis and the edismax query parser.
> Otherwise good luck if you have a large index.
>
> Regards,
> Emir
>


Re: Rescoring from 0 - full

2017-10-05 Thread Dariusz Wojtas
Hi,
Your answers have helped me a lot.
I've managed to use the LTRQParserPlugin and it does what I need: full
control over scoring with its re-ranking functionality.
I define my custom features and can pass custom params to them using the
"efi.*" syntax.
Is there something similar for defining weights in the model that uses these
features?
Can I have a single model, but pass feature weights in each request?
How do I pass my custom weights with each request in the example below?

{
  "store" : "myFeaturesStore",
  "name" : "myModel",
  "class" : "org.apache.solr.ltr.model.LinearModel",
  "features" : [
    { "name" : "scorePersonalId" },
    { "name" : "originalScore" }
  ],
  "params" : {
    "weights" : {
      "scorePersonalId" : 0.9,
      "originalScore" : 0.1
    }
  }
}
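
For readers unfamiliar with LinearModel, the score it produces is the weighted sum of the feature values. The Python sketch below reproduces that arithmetic for the model above; it is not SOLR code, the helper name is made up, and the feature values are hypothetical. It does not answer the question of per-request weights; it only shows what the fixed weights do.

```python
import json

# The model definition from the post.
model_json = """
{
  "store" : "myFeaturesStore",
  "name" : "myModel",
  "class" : "org.apache.solr.ltr.model.LinearModel",
  "features" : [
    { "name" : "scorePersonalId" },
    { "name" : "originalScore" }
  ],
  "params" : {
    "weights" : {
      "scorePersonalId" : 0.9,
      "originalScore" : 0.1
    }
  }
}
"""
model = json.loads(model_json)


def linear_score(model, feature_values):
    """Weighted sum over the features listed in the model."""
    weights = model["params"]["weights"]
    return sum(
        weights[feature["name"]] * feature_values[feature["name"]]
        for feature in model["features"]
    )


# Hypothetical feature values for one document.
score = linear_score(model, {"scorePersonalId": 1.0, "originalScore": 0.4})
print(score)
```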

I am using SOLR 6.6, soon switching to 7.0

Best regards,
Dariusz Wojtas


On Thu, Sep 21, 2017 at 5:18 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Sure, you can take full control of the scoring, just write a custom
> similarity.
>
> What's not at all clear is why you want to. RerankQParserPlugin will
> re-rank the top N documents by pushing them through a different query,
> can you make that work?
>
> Best,
> Erick
>
>
>
> On Thu, Sep 21, 2017 at 4:20 AM, Diego Ceccarelli (BLOOMBERG/ LONDON)
> <dceccarel...@bloomberg.net> wrote:
> > Hi Dariusz,
> > If you use *:* you'll rerank only the top N random documents; as Emir
> > said, that will probably not produce interesting results.
> > If you want to replace the original score, you can take a look at the
> > learning to rank module [1], which would allow you to reassign a
> > new score to the top N documents returned by your query and then reorder
> > them based on that (ignoring the original score, if you want).
> >
> > Cheers,
> > Diego
> >
> > [1] https://cwiki.apache.org/confluence/display/solr/Learning+To+Rank
> >
> > From: solr-user@lucene.apache.org At: 09/21/17 08:49:13
> > To: solr-user@lucene.apache.org
> > Subject: Re: Rescoring from 0 - full
> >
> > Hi Dariusz,
> > You could use fq for filtering (you can disable caching to avoid polluting
> > the filter cache) and q=*:*. That way you'll get score=1 for all docs and can
> > rerank. The issue with this approach is that you rerank the top N, and without
> > a score they wouldn't be ordered, so it is a no-go.
> > What you could do (I did not try it) is divide by the score when rescoring (not
> > sure whether the calculated score is accessible, but you could calculate it) to
> > eliminate the original score.
> >
> > HTH,
> > Emir
> >


LTR 'feature' and passing date parameters

2017-10-18 Thread Dariusz Wojtas
Hi,
I am using the LTR functionality (SOLR 7) and need to define a feature that
will check if a given request parameter of type date (e.g. '1998-11-23')
matches birthDate in the stored document. Date granularity should be at DAY
level.
Simply:
* if dates match - return 1
* otherwise (birthDate not set, or they do not match) - return 0

I have several features and run a model that gives me the final
score. I cannot find a way to calculate the value for the date-related
feature.

Currently I am having a problem even with passing the date param, e.g.
'1998-11-23', to the feature so that it is treated as a date.

My 'efi.' param for the date is defined as follows:
 efi.searchBirthDate=1998-11-23

In my feature I want to compare dates using the ms(x,y) function and check
if they are equal.
 ms(${searchBirthDate}, birthDate)

But I get an exception when calculating the feature:
Invalid Date String:'1998-11-23'

Any idea how to solve this problem?

Best regards,
Dariusz Wojtas


Re: LTR 'feature' and passing date parameters

2017-10-18 Thread Dariusz Wojtas
Thank you very much Binoy.
This worked perfectly.
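
For anyone landing on this thread later: the fix amounts to sending the efi value in SOLR's full date form (for example 1998-11-23T00:00:00Z) instead of a bare day. A minimal client-side sketch in Python follows; the helper name is made up, and it assumes birthDate is indexed at midnight UTC, which is what makes a day-level equality check via ms() meaningful.

```python
from datetime import date, datetime, timezone


def to_solr_date(day: str) -> str:
    """Convert 'YYYY-MM-DD' into SOLR's full date form at midnight UTC."""
    parsed = date.fromisoformat(day)
    midnight = datetime(parsed.year, parsed.month, parsed.day, tzinfo=timezone.utc)
    return midnight.strftime("%Y-%m-%dT%H:%M:%SZ")


# Value to pass as efi.searchBirthDate
solr_value = to_solr_date("1998-11-23")
print(solr_value)
```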

Best regards,
Dariusz Wojtas

On Wed, Oct 18, 2017 at 5:06 PM, Binoy Dalal <binoydala...@gmail.com> wrote:

> Dariusz,
> This problem is most probably occurring because solr does not store dates
> in the format you've specified. It's something like: 2017-10-08T12:23:00Z.
> You'll probably need to specify your date in your efi feature in the manner
> above to get it to work.
>
> You can find more details on dates here:
> https://lucene.apache.org/solr/guide/6_6/working-with-dates.html
>
> --
> Regards,
> Binoy Dalal
>


Re: Streaming expressions and fetch()

2018-06-18 Thread Dariusz Wojtas
Hi,
I think this might give a clue.
I tried to reproduce the issue with a collection called testCloud1.

fetch(testCloud1,
  search(testCloud1, q="*:*", fq="type:name", fl="parentId", sort="parentId asc"),
  fl="id,name",
  on="parentId=id")

The expression above produces 3 log entries presented below (just cut the
content before 'webapp' in each line to save space):

webapp=/solr path=/stream
params={expr=fetch(testCloud1,%0a++search(testCloud1,+q%3D"*:*",+fq%3D"type:name",+fl%3D"parentId",+sort%3D"parentId+asc"),%0a++fl%3D"id,name",%0a++on%3D"parentId%3Did")&_=1529178931117}
status=0 QTime=1
webapp=/solr path=/select
params={q=*:*=false=parentId=type:name=parentId+asc=json=2.2}
hits=1 status=0 QTime=1
webapp=/solr path=/select
params={q={!+df%3Did+q.op%3DOR+cache%3Dfalse+}+123=false=id,name,_version_=_version_+desc=50=json=2.2}
hits=0 status=0 QTime=1

If I use the parameters from the 3rd log entry in a URL:
http://10.0.75.1:8983/solr/testCloud1/select?q={!+df%3Did+q.op%3DOR+cache%3Dfalse+}+123=false=id,name,_version_=_version_+desc=50=json=2.2

then the result set is empty. It searches for an 'id' value of 123.
BUT if I remove the plus sign before the '123' and use a URL like this:
http://10.0.75.1:8983/solr/testCloud1/select?q={!+df%3Did+q.op%3DOR+cache%3Dfalse+}123=false=id,name,_version_=_version_+desc=50=json=2.2

THEN IT RETURNS A SINGLE ROW WITH THE EXPECTED VALUES.
Maybe this sheds some light? Maybe the problem is in the syntax of the enrichment query?

I have also tried fetch() with an inner query that returns more identifiers.
In the 3rd log entry the identifiers start with a plus sign and are
separated with pluses, as in the log entry below

q={!+df%3Did+q.op%3DOR+cache%3Dfalse+}+123+124=false=id,name,_version_=_version_+desc=50=json=2.2
No results returned, and the data is not enriched with additional
attributes.
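
One thing worth keeping in mind when reading these log lines: in a URL query string a plus sign decodes to a space, so the logged query has a space between the closing brace of the local params and the first identifier. The Python snippet below only demonstrates that decoding; whether the space is what makes the query return nothing is not established here.

```python
from urllib.parse import unquote_plus

# q values copied from the 3rd log entry, with and without the plus before 123.
logged = "{!+df%3Did+q.op%3DOR+cache%3Dfalse+}+123"
edited = "{!+df%3Did+q.op%3DOR+cache%3Dfalse+}123"

decoded_logged = unquote_plus(logged)
decoded_edited = unquote_plus(edited)

print(repr(decoded_logged))
print(repr(decoded_edited))
```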

Best regards,
Darek


On Mon, Jun 18, 2018 at 3:07 PM, Joel Bernstein  wrote:

> There is a test case working that is basically the same construct that you
> are having issues with. So, I think the next step is to try and reproduce
> the problem that you are seeing in a test case.
>
> If you have a small sample test dataset I can use to reproduce the error
> please create a jira ticket and I will work on the issue.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>


Streaming expressions and fetch()

2018-06-17 Thread Dariusz Wojtas
Hi,
I am trying to use streaming expressions with SOLR 7.3.1.
I have successfully used innerJoin, leftOuterJoin and several other
functions but failed to achieve expected results with the fetch() function.

The example below is simplified; in reality the base search() function uses
fuzzy matching and scoring, and works perfectly.
But I need to enrich the search results with an additional column from the
same collection.
The search() call queries nested documents and returns parentId (yes,
I know there is _root_, I tried it as well) plus some calculated custom values
that require aggregation calls such as rollup(). This part works perfectly.
But then I want to enrich the result set with attributes from the top-level
document, where "parentId=id".
All my attempts to fetch additional data have failed: the fetch() call
below always gives the same results as the search() call inside.

fetch(users,
  search(users, q="*:*", fq="type:name", fl="parentId", sort="parentId asc"),
  fl="id,name",
  on="parentId=id")

As I understand it, fetch() should retrieve only the records narrowed by the
"parentId" results.
If I call leftOuterJoin(), then I lose the benefit of such a nice narrowing
call.
Any clue what I am doing wrong with fetch()?
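
To make the expected behaviour concrete, here is a plain-Python model of what I expect from fetch(): for every tuple coming out of the inner stream, look up the document whose id equals the tuple's parentId and copy the requested fields onto the tuple. This is only an illustration of the intended result, not how SOLR implements it, and the sample documents are made up.

```python
def fetch(collection, stream, fl, on):
    """Enrich each tuple in `stream` with fields `fl` from `collection`.

    `on` has the form "streamField=collectionField", as in the expression.
    """
    left_key, right_key = on.split("=")
    by_key = {doc[right_key]: doc for doc in collection if right_key in doc}
    for tup in stream:
        enriched = dict(tup)
        match = by_key.get(tup.get(left_key))
        if match is not None:
            for field in fl.split(","):
                if field in match:
                    enriched[field] = match[field]
        yield enriched


# Hypothetical collection: one top-level document and one nested child.
users = [
    {"id": "123", "type": "entity", "name": "John Doe"},
    {"id": "123-1", "type": "name", "parentId": "123"},
]
inner_results = [{"parentId": "123"}]  # what the inner search() returns

result = list(fetch(users, inner_results, fl="id,name", on="parentId=id"))
print(result)
```

The output I am after is the inner tuple plus the id and name of its parent, whereas what I get from SOLR is the inner tuple unchanged.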

Best regards,
Darek


Re: Streaming expressions and fetch()

2018-06-20 Thread Dariusz Wojtas
I have filed a JIRA issue: SOLR-12505

Best regards,
Darek



LTR features and searching for field using multiple words

2017-10-20 Thread Dariusz Wojtas
Hi,

Recently I have been working with LTR features.
In some of these features I use the block join parent parser.
It works as expected until I pass a multi-word value into the query.
I have a parameter called 'fullAddressStreet' and it
- works when I pass the value 'something'
- does not work if I pass the value 'something street' or 'something 17B'.

It fails with:
  java.lang.RuntimeException: Error while creating weights in LTR:
  java.lang.RuntimeException: Exception from createWeight for SolrFeature
  [name=scoreFullAddressStreet,
params={q={!parent which='type:entity'
score='max'}keyword_address:${fullAddressStreet}}]
no field name specified in query and no default specified via 'df' param

I have tried different variants:
   {!parent which='type:entity' score='max'}keyword_address:${fullAddressStreet}
   {!parent which='type:entity' score='max' v='keyword_address:${fullAddressStreet}'}
   {!parent which='type:entity' score='max' df='keyword_address' v='keyword_address:${fullAddressStreet}'}

When used in an LTR feature, the whole query definition is surrounded with double
quotes.
All these variants work when I ask for a single word, and fail the same way with
multiple words.

The 'keyword_address' field definition is:


and the value is copied from other fields:




How do I correctly use this parser?
Character escaping? How?

Best regards,
Dariusz Wojtas


LTR feature and proximity search with Block Join Parent query Parser

2017-10-19 Thread Dariusz Wojtas
Hi,
I am working on features and my main document ('type:entity') has child
documents, some of them contain addresses ('type:entityAddress').

My feature definition:
{
  "store": "store_myStore",
  "name": "scoreAddressCity",
  "class": "org.apache.solr.ltr.feature.SolrFeature",
  "params": { "q": "+{!parent which='type:entity' score='max'}type:entityAddress +{!parent which='type:entity' score='max'}address.city:${searchedCity}" }
}

Here are two sample searches where I search for the city 'Warszawa'.
I am passing the searched city name as efi.searchedCity.
a) the address document contains the value 'Warszawa' in field 'address.city'
The resulting feature score is 1.98

b) the address document contains the value 'WarszawaRado' in field
'address.city'
The resulting score is 0.0

How can I return a score that reflects the similarity between 'Warszawa' and
'WarszawaRado' in search b)?
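
To illustrate the kind of score I have in mind, here is a Python sketch of an edit-distance similarity: the Levenshtein distance normalised by the length of the longer string. I am assuming this is roughly what strdist(..., edit) computes; the sketch says nothing about how to apply such a function to child documents through the block join parser.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """1.0 for identical strings, falling towards 0.0 as they diverge."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


similarity = edit_similarity("Warszawa", "WarszawaRado")
print(round(similarity, 4))
```

Four extra characters on a twelve-character string give a similarity of about 0.67, which is the sort of partial score I would like for case b) instead of 0.0.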

Best regards,
Dariusz Wojtas


Re: LTR features and searching for field using multiple words

2017-10-20 Thread Dariusz Wojtas
I have found a solution based on Yonik's post in this thread:
http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/95646
The answer was to surround the searched value with double quotes. In my
case they had to be escaped, because there are already quotes in the SOLR
feature definition.
The working feature definition is as follows:

{
  "store": "store_incidentDB",
  "name": "scoreFullAddressStreet",
  "class": "org.apache.solr.ltr.feature.SolrFeature",
  "params": { "q": "{!parent which='type:entity' score='max' v='keyword_address:\"${fullAddressStreet}\"'}" }
}
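
If the feature file is generated by a script, the escaping can be left to a JSON library rather than written by hand. A small Python sketch, which only builds the JSON text and does not upload anything:

```python
import json

# The query as SOLR should see it: double quotes around the efi value.
query = "{!parent which='type:entity' score='max' v='keyword_address:\"${fullAddressStreet}\"'}"

feature = {
    "store": "store_incidentDB",
    "name": "scoreFullAddressStreet",
    "class": "org.apache.solr.ltr.feature.SolrFeature",
    "params": {"q": query},
}

# json.dumps escapes the inner double quotes as \" automatically.
encoded = json.dumps(feature, indent=2)
print(encoded)
```

Parsing the generated text back yields the original query string, so the backslashes exist only in the JSON representation.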

Best regards,
Dariusz Wojtas


On Fri, Oct 20, 2017 at 11:49 AM, Dariusz Wojtas <dwoj...@gmail.com> wrote:

> Hi,
>
> Recently I work with LTR features.
> In some of these features I use the block join parent parser.
> It works as expected until I pass multi-word value into the query.
> I have a parameter called 'fullAddressStreet' and it
> - works when I pass value 'something'
> - does not work if I pass value 'something street' or 'something 17B'.
>
> It fails with:
>   java.lang.RuntimeException: Error while creating weights in LTR:
>   java.lang.RuntimeException: Exception from createWeight for SolrFeature
>   [name=scoreFullAddressStreet,
> params={q={!parent which='type:entity' score='max'}keyword_address:${
> fullAddressStreet}}]
> no field name specified in query and no default specified via 'df' param
>
> I have tried different variants:
>{!parent which='type:entity' score='max'}keyword_address:${
> fullAddressStreet}
>{!parent which='type:entity' score='max' v='keyword_address:${
> fullAddressStreet}'}
>{!parent which='type:entity' score='max' df='keyword_address'
> v='keyword_address:${fullAddressStreet}'}
>
> When using in LTR feature, all query definition is surrounded with double
> quotes.
> All these variants work when I ask for single word, fail the same way with
> multiple words.
>
> The 'keyword_address' field definition is:
>      stored="false" multiValued="true"/>
>
> and the value is copied from other fields:
> 
> 
> 
>
> How do I correctly use this parser?
> Character escaping? How?
>
> Best regards,
> Dariusz Wojtas
>


SOLR 7.2 and LTR

2017-12-27 Thread Dariusz Wojtas
Hi,

I am using SOLR 7.0 and use the ltr parser.
The configuration I use works nicely under SOLR 7.0.0.
I am trying to upgrade to 7.2.0 but whenever I want to use my handler, I
get an exception:
"rq parameter must be a RankQuery"

The exact response is:


org.apache.solr.common.SolrException
org.apache.solr.common.SolrException

rq parameter must be a RankQuery

Re: SOLR 7.2 and LTR

2017-12-29 Thread Dariusz Wojtas
I have declared the rerank query parser and executed it.
It works under 7.1, but does not work under 7.2, with the same config file copied over.
Under 7.2 I receive the same exception "rq parameter must be a RankQuery"
as for ltr.

And I am sure I've declared it correctly, because in 7.1 it even complained
if I failed to pass its rerank query. It worked if the query was passed.
With 7.2 it does not get to this point; it does not understand what rerank
is and throws the exception above.

Best regards,
Dariusz Wojtas


On Fri, Dec 29, 2017 at 10:57 AM, Diego Ceccarelli (BLOOMBERG/ LONDON) <
dceccarel...@bloomberg.net> wrote:

> Dariusz, does the rerank query work?
>
> From: solr-user@lucene.apache.org At: 12/28/17 22:25:28To:
> solr-user@lucene.apache.org
> Subject: Re: SOLR 7.2 and LTR
>
> Yes, this could be SOLR-11501.
> But from the description in the ticket I see no option to run LTR, unless I
> am missing something.
>
> I have the ltr queryParser registered. I believe it is declared correctly,
> works with 7.0.0.
> I have just double checked with different SOLR versions, copying exactly
> the same config directory to each 'server/solr' directory.
> * SOLR 7.0.0 - works
> * SOLR 7.1.0 - works
> * SOLR 7.2.0 - does not work, exception as previously described.
>
> I have tried to run it with
> *  'luceneMatchVersion' => 7.0.0, 7.1.0 and 7.2.0. It does not change
> anything.
> * *,_query_ defined in initParams
> * defType=ltr, but then the main query, which is of type edismax, cannot be
> instantiated because of NPE
>
> Any Hint how to use LTR with 7.2?
>
>
> Best regards,
> Dariusz Wojtas
>
>
> On Thu, Dec 28, 2017 at 6:11 PM, Christine Poerschke (BLOOMBERG/ LONDON) <
> cpoersc...@bloomberg.net> wrote:
>
> > From a (very) quick look it seems like the https://issues.apache.org/
> > jira/browse/SOLR-11501 upgrade notes might be relevant, potentially.
> >
> > From: solr-user@lucene.apache.org At: 12/28/17 15:18:22To:
> > solr-user@lucene.apache.org
> > Subject: Re: SOLR 7.2 and LTR
> >
> > Do you have the ltr qparser plugin registered into the solrconfig?
> >
> > Can you check what happens if instead of ltr you use the rerank query
> > plugin? does it work or you get the same error?
> > https://lucene.apache.org/solr/guide/6_6/query-re-ranking.html
> >
> >
> > From: solr-user@lucene.apache.org At: 12/28/17 13:58:26To:
> > solr-user@lucene.apache.org
> > Subject: Re: SOLR 7.2 and LTR
> >
> > Hello Diego,
> >
> > solr.log contains always the same single stacktrace in SOLR 7.2.
> > I've been trying to pass rq via solrconfig.xml and via HTTP form.
> > The /searchIncidents handler contains edismax query.
> > Works if I completely disable rq. When I add the rq param, even something
> > like:
> >{!ltr reRankDocs=25 model=incidentModel}
> > I get the exception.
> > The model is there, it's LinearModel model simplified to contain only
> > single feature 'originalScore', defined as in all available examples.
> > I just copy the same config directory under 'server\solr' to SOLR 7.0 and
> > it works.
> > I only skip the 'data' subfolder when copying, because of index differences.
> >
> > 2017-12-28 13:51:08.141 DEBUG (qtp205125520-18) [   x:entityindex]
> > o.a.s.c.S.Request [entityindex]  webapp=/solr path=/searchIncidents
> > params={personalId=1234567890=Test={!ltr+
> > reRankDocs%3D25+model%3DincidentModel}}
> > 2017-12-28 13:51:08.145 ERROR (qtp205125520-18) [   x:entityindex]
> > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: rq
> > parameter must be a RankQuery
> > at
> > org.apache.solr.handler.component.QueryComponent.
> > prepare(QueryComponent.java:183)
> > at
> > org.apache.solr.handler.component.SearchHandler.handleRequestBody(
> > SearchHandler.java:276)
> > at
> > org.apache.solr.handler.RequestHandlerBase.handleRequest(
> > RequestHandlerBase.java:177)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
> > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
> > at
> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > SolrDispatchFilter.java:382)
> > at
> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > SolrDispatchFilter.java:326)
> > at
> > org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> > doFilter(ServletHandler.java:1751)
> > at
> > org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHandler.java:582)
> > at
> > org.eclipse.jetty.server.ha

Re: Simple string comparison by SOLR function

2018-01-18 Thread Dariusz Wojtas
OK, I've found a working solution using strdist(). If the result is less than
1.0, the strings differ somehow.

{!func}if(lt(strdist(\"${gender}\",\"male\",edit),1),0,1)

This will let me go further with my logic.

Best regards,
Dariusz Wojtas




Simple string comparison by SOLR function

2018-01-18 Thread Dariusz Wojtas
Hi,

I am using LTR, and writing custom features.
What I want to do is a conditional calculation based on string comparison.
Something like:
  if ($gender == 'test') {
calculation 1 (for test just return 0)
  } else {
calculation 2 (for test just return 1)
  }

'gender' is a param passed in the user query; for test purposes below it is
given the value '*'.
Here are the different variants of what I have tried:

1){!func}if(\"${gender}\"==\"test\",0,1)
Exception: org.apache.solr.search.SyntaxError: Expected ',' at position 6
in 'if("*"=="test",0,1)'

2){!func}if('${gender}'=='test',0,1)
Exception: org.apache.solr.search.SyntaxError: Expected ',' at position 6
in 'if('*'=='test',0,1)'

3){!func}if(eq(${gender},'test'),0,1)
Exception: java.lang.UnsupportedOperationException
at
org.apache.lucene.queries.function.FunctionValues.doubleVal(FunctionValues.java:49)
at
org.apache.solr.search.function.SolrComparisonBoolFunction.compare(SolrComparisonBoolFunction.java:55)
at
org.apache.lucene.queries.function.valuesource.ComparisonBoolFunction$1.boolVal(ComparisonBoolFunction.java:61)


Any hint on how to achieve this?

Best regards,
Dariusz Wojtas


Re: strdist on nested doc field

2018-01-15 Thread Dariusz Wojtas
Hello Mikhail,

I've tried it and the query executes, but it does not treat strdist() as a
function to execute.
It looks like each part of the function - its name and parameters - is
treated as a keyword to search for against the default field.

If I try something different:

   q=+firstName:Adam +{!parent which=type:record v=$chq}&chq=+type:address
+{!func}strdist('Shakespeare',address.street, edit)

then I get an exception:
  org.apache.solr.search.SyntaxError: Missing end to unquoted value
starting at 37 str='strdist('Shakespeare',address.street,'

Best regards,
Dariusz Wojtas




On Tue, Jan 16, 2018 at 4:04 AM, Mikhail Khludnev <m...@apache.org> wrote:

> Hello, Dariusz.
>
> It should be something like
> q=+firstName:Adam +{!parent which=type:record
> v=$chq}=+type:address +strdist('Shakespeare',
> address.street, edit)
> post exception if it doesn't work.
>
> On Tue, Jan 16, 2018 at 1:39 AM, Dariusz Wojtas <dwoj...@gmail.com> wrote:
>
> > Hi,
> >
> > Is it possible to use the strdist() function to return distance on the
> > child document field?
> > Let's say I have:
> >
> > 
> >1
> >record
> > Adam
> > 
> >   A1
> >   address
> >   business
> >   Shakespeare
> > 
> > 
> >   A2
> >   address
> >   correspondence
> >   Baker Street
> > 
> > 
> >
> > What I want to do is to search for documents:
> >   type:record
> >   firstName:Adam
> >   and return max strdist('Shakespeare', address.street, edit) as the
> > resulting score?
> >
> > or
> >   type:record
> >   firstName:Adam
> >   and return max strdist('Shakespeare', address.street, edit) of
> > "address.type:business" as the resulting score?
> >
> > I am trying with the {!parent} mode and {!function}, various
> combinations.
> > But I do not get what I'd expect.
> >
> > Best regards,
> > Dariusz Wojtas
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


strdist on nested doc field

2018-01-15 Thread Dariusz Wojtas
Hi,

Is it possible to use the strdist() function to return distance on the
child document field?
Let's say I have:


   1
   record
Adam

  A1
  address
  business
  Shakespeare


  A2
  address
  correspondence
  Baker Street



What I want to do is to search for documents:
  type:record
  firstName:Adam
  and return max strdist('Shakespeare', address.street, edit) as the
resulting score?

or
  type:record
  firstName:Adam
  and return max strdist('Shakespeare', address.street, edit) of
"address.type:business" as the resulting score?

I am trying with the {!parent} mode and {!function}, various combinations.
But I do not get what I'd expect.

Best regards,
Dariusz Wojtas


Re: strdist on nested doc field

2018-01-16 Thread Dariusz Wojtas
Please see the answers below.

On Tue, Jan 16, 2018 at 8:40 AM, Mikhail Khludnev <m...@apache.org> wrote:

> On Tue, Jan 16, 2018 at 10:30 AM, Dariusz Wojtas <dwoj...@gmail.com>
> wrote:
>
> > Hello Mikhail,
> >
> > I've tried it and the query executes, but it does not treat strdist() as
> a
> > function to execute.
> > Looks like each part of the function - it's name and parameters - are
> > treated as keywords to search for against the default field.
> >
> Can you post the exact observations, rather than interpretation?
>

Yes,
Here is the output of the original query proposed, where I believe the strdist
call is not treated as a function. keyword1 is the default search field:


firstName:Adam AllParentsAware(ToParentBlockJoinQuery (type:address
keyword1:strdic (keyword1:shakespear keyword1:address.streeo
keyword1:edit)))

This is not interpreted as a function call.


> If I try something different:
> >
> >q=+firstName:Adam +{!parent which=type:record
> v=$chq}=+type:address
> > +{!func}strdist('Shakespeare',address.street, edit)
> >
> > then I get exception:
> >   org.apache.solr.search.SyntaxError: Missing end to unquoted value
> > starting at 37 str='strdist('Shakespeare',address.street,'
> >
> This particular query failed because of the space. Here is my pet peeve in
> Solr: the syntax {!foo} captures whole string if it's in beginning of the
> string, but in the middle of the string it captures only substring until
> the first space.
> So, after removing space it should work fine. Another potential issues are:
> single quotes (do they it ever supported?), and the dot in the fieldname
> (you never know).



Yes, it was about the space first, then about the dot in the name of the
nested field 'address.street'.
After many failed attempts with various escaping modes, I have executed it
with double parameter dereferencing, but have faced another issue:

q=+firstName:Adam + {!parent which='type:record' score='max' v=$chq1}
chq1=+type:address +{!func v='$chq2'}
chq2=strdist('Shakespeare',address.street,edit)
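
In case it helps to reproduce the request, here is a small Python sketch of how the three parameters above can be sent as separate request parameters. The query strings are copied from this message; the dict layout and variable names are mine. It only illustrates the encoding (a literal '+' has to be percent-encoded, otherwise it is decoded as a space) and does not fix the exception shown below.

```python
from urllib.parse import parse_qs, urlencode

# Parameters exactly as posted; chq1 and chq2 are dereferenced by Solr via
# v=$chq1 and v='$chq2'.
params = {
    "q": "+firstName:Adam + {!parent which='type:record' score='max' v=$chq1}",
    "chq1": "+type:address +{!func v='$chq2'}",
    "chq2": "strdist('Shakespeare',address.street,edit)",
}

# urlencode() percent-encodes '+' as %2B and writes spaces as '+', so the
# must-clause markers survive the trip to the server.
encoded = urlencode(params)

# Decoding the query string gives back the original values.
decoded = {key: values[0] for key, values in parse_qs(encoded).items()}
```

Keeping the function in its own parameter (chq2) also avoids the problem with spaces inside an inline {!func}, since the whole parameter value is handed to the function parser.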

java.lang.IllegalStateException: Child query must not match same docs with
parent filter. Combine them as must clauses (+) to find a problem doc.
docId=5, class org.apache.lucene.search.DisjunctionSumScorer at
org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinScorer.setScoreAndFreq(ToParentBlockJoinQuery.java:327)
at
org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinScorer.score(ToParentBlockJoinQuery.java:286)
at
org.apache.lucene.search.BooleanScorer$OrCollector.collect(BooleanScorer.java:142)
at
org.apache.lucene.search.Weight$DefaultBulkScorer.scoreRange(Weight.java:208)
at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:195)
at
org.apache.lucene.search.BooleanScorer$BulkScorerAndDoc.score(BooleanScorer.java:61)
at
org.apache.lucene.search.BooleanScorer.scoreWindowIntoBitSetAndReplay(BooleanScorer.java:213)
at
org.apache.lucene.search.BooleanScorer.scoreWindowMultipleScorers(BooleanScorer.java:260)
at
org.apache.lucene.search.BooleanScorer.scoreWindow(BooleanScorer.java:305)
at org.apache.lucene.search.BooleanScorer.score(BooleanScorer.java:317) at
org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39) at
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:658) at
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:462) at
org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:215)
at
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1591)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:575)
at


Any hint now?
My goal is to return the result of the strdist() function on a nested
document field.

Best regards,
Dariusz Wojtas


>
> > Best regards,
> > Dariusz Wojtas
> >
> >
> >
> >
> > On Tue, Jan 16, 2018 at 4:04 AM, Mikhail Khludnev <m...@apache.org>
> wrote:
> >
> > > Hello, Dariusz.
> > >
> > > It should be something like
> > > q=+firstName:Adam +{!parent which=type:record
> > > v=$chq}=+type:address +strdist('Shakespeare',
> > > address.street, edit)
> > > post exception if it doesn't work.
> > >
> > > On Tue, Jan 16, 2018 at 1:39 AM, Dariusz Wojtas <dwoj...@gmail.com>
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > Is it possible to use the strdist() function to return distance on
> the
> > > > child document field?
> > > > Let's say I have:
> > > >
> > > > 
> > > >1
> > > >record
> > > > Adam
> 

LTR and features operating on children doc data

2018-01-21 Thread Dariusz Wojtas
Dear Solr Masters,

I am using the LTR functionality with Solr, and it works beautifully.
I have a nice catch-all query at the beginning, then I am recalculating the
score with LTR.
And I have already learned some nice tricks, but there is something that I
still cannot do.
I need to create LTR model features that will operate on child document
properties.

Part of my doc structure:

b0001
record
male
Krzysztof Kowalski

d1e12
name
alias
Chris Kowalski


d1e18
name
spelling
Krzysiek Kowalski





What is pretty easy:
1. return strdist() between the given ${fullNameParam} and field 'fullName'
{
"store": "myStore",
"name": "scoreFullName",
"class": "org.apache.solr.ltr.feature.SolrFeature",
"params":{ "q":
"{!func}strdist(\"${fullNameParam}\",fullName,edit)" }
}

2. I may also execute conditional evaluation, but only on top level
document attributes.
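
For completeness, this is roughly how I generate such a feature definition. It is a sketch in Python; the helper name strdist_feature and its arguments are mine, and only the resulting JSON matters. json.dumps takes care of escaping the double quotes around the ${...} placeholder.

```python
import json


def strdist_feature(store, name, param, field, measure="edit"):
    """Build one SolrFeature definition that scores strdist(param, field)."""
    query = '{!func}strdist("${%s}",%s,%s)' % (param, field, measure)
    return {
        "store": store,
        "name": name,
        "class": "org.apache.solr.ltr.feature.SolrFeature",
        "params": {"q": query},
    }


# A feature store upload is a JSON array of feature definitions.
features = [strdist_feature("myStore", "scoreFullName", "fullNameParam", "fullName")]
payload = json.dumps(features, indent=2)
```

The payload is then uploaded to the feature store of the collection; the upload itself is not shown here.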


But what I do not know how to achieve is:
*  return max(strdist()) of the given $fullNameParam against field
'name_fullName' in all children documents (type='name')*

I need max() because there may be several child documents, and I only need
to find the top matching one.
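
To make the expected result precise, here is the calculation written in plain Python. This is not a Solr feature definition, only a statement of what I want the feature to return. It assumes the "edit" measure is a Levenshtein distance normalized to 0..1 with 1.0 meaning identical; the function names are made up.

```python
def edit_similarity(a: str, b: str) -> float:
    """Normalized Levenshtein similarity (assumed equivalent of strdist edit)."""
    if not a and not b:
        return 1.0
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return 1.0 - previous[-1] / max(len(a), len(b))


def max_child_similarity(full_name_param, child_names):
    """Best similarity between the query name and any child name_fullName."""
    return max((edit_similarity(full_name_param, name) for name in child_names),
               default=0.0)  # a record without children scores 0.0


# Child names from the example record above.
children = ["Chris Kowalski", "Krzysiek Kowalski"]
best = max_child_similarity("Chris Kowalski", children)
```

Here the first child is an exact match, so the record gets the top score no matter how far the other child is from the query.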
No, synonyms do not fit here. The example with names above is only a part
of my data, there are other cases.


Can this be done? How?
I am returning the top level documents, where type='record', but I do not
know how to get an evaluation result from the children, where there may be
many child documents.
I've tried parent and child block join, with no luck.

Best regards,
Dariusz Wojtas


Re: LTR and working with feature stores

2018-01-13 Thread Dariusz Wojtas
Oops,

Diego, I have just read your answer again.
Now I see that it is the [features] element that triggers calculation of all
store features.
That gives hope that the model only executes the features it needs ;)

Best regards,
Dariusz Wojtas

On Sat, Jan 13, 2018 at 11:12 PM, Dariusz Wojtas <dwoj...@gmail.com> wrote:

> Hi,
>
> Thanks for the response, I understand that all features from the given
> store are calculated, no matter if they are used or not.
> OK, spread features across different models.
> But what if different models share some features?
> Creating copies of feature definitions in different stores, one per model,
> is erroneous ...
> Having several models in one store, some of them use only part of these
> features - that seems 'expensive' ;)
>
> Simple syntax evolution would be very helpful, to give {!ltr} optional
> 'store' parameter. It could override the current features store, is
> specified.
>   {!ltr reRankDocs=25 store=storeA model=simpleModelA}
>
> And {!ltr} executes 'model based calculation', not 'store based
> calculation'. Model knows what featues are required.
> Why are all features executed?
>
> Best regards,
> Dariusz Wojtas
>
>
> On Sat, Jan 13, 2018 at 4:03 PM, Diego Ceccarelli <
> diego.ceccare...@gmail.com> wrote:
>
>> Hi Dariusz,
>>
>> On Jan 12, 2018 14:40, "Dariusz Wojtas" <dwoj...@gmail.com> wrote:
>>
>> Hi,
>>
>> I am working with the LTR rescoring.
>> Works beautifully, but I am curious about something.
>> How do I specify the feature store in a way different than using the
>> [features] syntax?
>> [features store=yourFeatureStore]
>>
>>
>>
>> What is the problem with this syntax? If the problem is the name of the
>> field, you can also call it by doing fl=title,authors,myfield=[features
>> store=yourFeatureStore]
>> I can't think of alternative ways..
>>
>>
>>
>> I have a range of models in my custom feature store, with plenty of
>> features implemented.
>> I have found that when I call LTR with model using only two features, Solr
>> still executes them all.
>>
>> My setup in solrconfig.xml
>> -
>> id,score,why_score:[explain style=nl],[features
>> store=store_incidentDB]
>> {!ltr reRankDocs=$reRankDocs model=simpleModelA}
>> --
>>
>> simpleModel above only uses LinearModel with 2 features.
>>
>>
>> What do I see in results?
>> In response I can see it has executed ALL features (there are values
>> calculated) in section:
>> 1)  -> response -> result -> doc -> HERE
>>
>> In addition, there is my model executed and only TWO features of the
>> executed model are presented in:
>>
>>
>> It is intended, the reason is that usually you want to execute your model
>> and at the same time log a *superset* of the features to train the next
>> model. If you want to compute only the features of the model you can
>> define
>> a featureStore that matches exactly the features that you have in the
>> model.
>>
>> 2)  -> response -> debug -> explain
>>
>> Why do I see all features being executed, if the specified model only
>> contains two features?
>>
>> I tried to reduce 'fl' to:
>>   id,score,why_score:[explain style=nl]
>> and id works as expected then:
>> 1. additional features are not executed (correct)
>> 2. my model works, only two features of the selected model (correct)
>>
>> And the final questions for this long email are:
>> 1. why does it execute all features when i specify 'store'?
>> 2. how do I specify the 'store', if I have more stores, but do not want to
>> execute all their features?
>>
>>
>> Just define a feature store that matches the features that you have in the
>> model. Please note that the featureStore that you specify in fl=
>> [features]
>> field **will not** affect the reranking (the model will compute only the
>> features that are specified in the model json file), you should ask for
>> the
>> [features] only if you want to log them.
>> Please do not hesitate to ask if something is not clear ;)
>>
>> Cheers,
>> Diego
>>
>>
>>
>> Best regards,
>> Dariusz Wojtas
>>
>
>


Re: LTR and working with feature stores

2018-01-13 Thread Dariusz Wojtas
Hi,

Thanks for the response, I understand that all features from the given
store are calculated, no matter if they are used or not.
OK, spread features across different models.
But what if different models share some features?
Creating copies of feature definitions in different stores, one per model,
is error-prone ...
Having several models in one store, where some of them use only part of these
features, seems 'expensive' ;)

A simple syntax evolution would be very helpful: give {!ltr} an optional
'store' parameter. It could override the current features store, if
specified.
  {!ltr reRankDocs=25 store=storeA model=simpleModelA}

And {!ltr} executes a 'model based calculation', not a 'store based
calculation'. The model knows what features are required.
Why are all features executed?

Best regards,
Dariusz Wojtas


On Sat, Jan 13, 2018 at 4:03 PM, Diego Ceccarelli <
diego.ceccare...@gmail.com> wrote:

> Hi Dariusz,
>
> On Jan 12, 2018 14:40, "Dariusz Wojtas" <dwoj...@gmail.com> wrote:
>
> Hi,
>
> I am working with the LTR rescoring.
> Works beautifully, but I am curious about something.
> How do I specify the feature store in a way different than using the
> [features] syntax?
> [features store=yourFeatureStore]
>
>
>
> What is the problem with this syntax? If the problem is the name of the
> field, you can also call it by doing fl=title,authors,myfield=[features
> store=yourFeatureStore]
> I can't think of alternative ways..
>
>
>
> I have a range of models in my custom feature store, with plenty of
> features implemented.
> I have found that when I call LTR with model using only two features, Solr
> still executes them all.
>
> My setup in solrconfig.xml
> -
> id,score,why_score:[explain style=nl],[features
> store=store_incidentDB]
> {!ltr reRankDocs=$reRankDocs model=simpleModelA}
> --
>
> simpleModel above only uses LinearModel with 2 features.
>
>
> What do I see in results?
> In response I can see it has executed ALL features (there are values
> calculated) in section:
> 1)  -> response -> result -> doc -> HERE
>
> In addition, there is my model executed and only TWO features of the
> executed model are presented in:
>
>
> It is intended, the reason is that usually you want to execute your model
> and at the same time log a *superset* of the features to train the next
> model. If you want to compute only the features of the model you can define
> a featureStore that matches exactly the features that you have in the
> model.
>
> 2)  -> response -> debug -> explain
>
> Why do I see all features being executed, if the specified model only
> contains two features?
>
> I tried to reduce 'fl' to:
>   id,score,why_score:[explain style=nl]
> and id works as expected then:
> 1. additional features are not executed (correct)
> 2. my model works, only two features of the selected model (correct)
>
> And the final questions for this long email are:
> 1. why does it execute all features when i specify 'store'?
> 2. how do I specify the 'store', if I have more stores, but do not want to
> execute all their features?
>
>
> Just define a feature store that matches the features that you have in the
> model. Please note that the featureStore that you specify in fl= [features]
> field **will not** affect the reranking (the model will compute only the
> features that are specified in the model json file), you should ask for the
> [features] only if you want to log them.
> Please do not hesitate to ask if something is not clear ;)
>
> Cheers,
> Diego
>
>
>
> Best regards,
> Dariusz Wojtas
>


LTR and working with feature stores

2018-01-12 Thread Dariusz Wojtas
Hi,

I am working with the LTR rescoring.
Works beautifully, but I am curious about something.
How do I specify the feature store in a way different than using the
[features] syntax?
[features store=yourFeatureStore]

I have a range of models in my custom feature store, with plenty of
features implemented.
I have found that when I call LTR with a model using only two features, Solr
still executes them all.

My setup in solrconfig.xml
-
id,score,why_score:[explain style=nl],[features
store=store_incidentDB]
{!ltr reRankDocs=$reRankDocs model=simpleModelA}
--

simpleModelA above uses only a LinearModel with 2 features.

What do I see in results?
In response I can see it has executed ALL features (there are values
calculated) in section:
1)  -> response -> result -> doc -> HERE

In addition, there is my model executed and only TWO features of the
executed model are presented in:
2)  -> response -> debug -> explain

Why do I see all features being executed, if the specified model only
contains two features?

I tried to reduce 'fl' to:
  id,score,why_score:[explain style=nl]
and it works as expected then:
1. additional features are not executed (correct)
2. my model works, only two features of the selected model (correct)
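
The two variants I compared can be sketched like this in Python. The helper ltr_params is hypothetical; the parameter values are the ones from my setup. As far as I can tell from my tests, it is the [features store=...] entry in 'fl' that makes Solr compute every feature of the store, while 'rq' alone only runs the model.

```python
from urllib.parse import urlencode


def ltr_params(model, rerank_docs, log_store=None):
    """Build fl and rq; add [features] only when a store name is given."""
    fields = ["id", "score", "why_score:[explain style=nl]"]
    if log_store is not None:
        # Assumption from my observations: this entry triggers calculation
        # of all features in the named store.
        fields.append("[features store=%s]" % log_store)
    return {
        "fl": ",".join(fields),
        "rq": "{!ltr reRankDocs=%d model=%s}" % (rerank_docs, model),
    }


with_logging = ltr_params("simpleModelA", 25, log_store="store_incidentDB")
rerank_only = ltr_params("simpleModelA", 25)
query_string = urlencode(rerank_only)
```

Both variants send the same 'rq', so the reranking itself should be the same in both cases.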

And the final questions for this long email are:
1. why does it execute all features when I specify 'store'?
2. how do I specify the 'store', if I have more stores, but do not want to
execute all their features?

Best regards,
Dariusz Wojtas


Re: SOLR 7.2 and LTR

2017-12-28 Thread Dariusz Wojtas
Hello Diego,

solr.log always contains the same single stack trace in SOLR 7.2.
I've been trying to pass rq via solrconfig.xml and via an HTTP form.
The /searchIncidents handler contains an edismax query.
It works if I completely disable rq. When I add the rq param, even something
like:
   {!ltr reRankDocs=25 model=incidentModel}
I get the exception.
The model is there; it's a LinearModel simplified to contain only a
single feature 'originalScore', defined as in all available examples.
If I just copy the same config directory under 'server\solr' to SOLR 7.0,
it works.
I only skip the 'data' subfolder when copying, because of index differences.

2017-12-28 13:51:08.141 DEBUG (qtp205125520-18) [   x:entityindex]
o.a.s.c.S.Request [entityindex]  webapp=/solr path=/searchIncidents
params={personalId=1234567890=Test={!ltr+reRankDocs%3D25+model%3DincidentModel}}
2017-12-28 13:51:08.145 ERROR (qtp205125520-18) [   x:entityindex]
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: rq
parameter must be a RankQuery
at
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:183)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:276)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Unknown Source)

Best regards,
Dariusz Wojtas



On Thu, Dec 28, 2017 at 1:03 PM, Diego Ceccarelli (BLOOMBERG/ LONDON) <
dceccarel...@bloomberg.net> wrote:

> Hello Dariusz,
>
> Can you look into the solr logs for a stack trace or ERROR logs?
>
>
>
> From: solr-user@lucene.apache.org At: 12/27/17 19:01:29To:
> solr-user@lucene.apache.org
> Subject: SOLR 7.2 and LTR
>
> Hi,
>
> I am using SOLR 7.0 and use the ltr parser.
> The configuration I use works nicely under SOLR 7.0.0.
> I am trying to upgrade to 7.2.0 but whenever I want to use my handler, I
> get an exception:
> "rq parameter must be a RankQuery"
>
> The exact response is:
> 
> 
> org.apache.solr.common.SolrException
> org.apache.solr.common.SolrException
> 
> rq parameter must be a RankQuery
> 

Re: SOLR 7.2 and LTR

2017-12-28 Thread Dariusz Wojtas
Yes, this could be SOLR-11501.
But from the description in the ticket I see no option to run LTR, unless I
am missing something.

I have the ltr queryParser registered. I believe it is declared correctly,
as it works with 7.0.0.
I have just double checked with different SOLR versions, copying exactly
the same config directory to each 'server/solr' directory.
* SOLR 7.0.0 - works
* SOLR 7.1.0 - works
* SOLR 7.2.0 - does not work, exception as previously described.

I have tried to run it with
*  'luceneMatchVersion' => 7.0.0, 7.1.0 and 7.2.0. It does not change
anything.
* *,_query_ defined in initParams
* defType=ltr, but then the main query, which is of type edismax, cannot be
instantiated because of NPE

Any hint on how to use LTR with 7.2?


Best regards,
Dariusz Wojtas


On Thu, Dec 28, 2017 at 6:11 PM, Christine Poerschke (BLOOMBERG/ LONDON) <
cpoersc...@bloomberg.net> wrote:

> From a (very) quick look it seems like the https://issues.apache.org/
> jira/browse/SOLR-11501 upgrade notes might be relevant, potentially.
>
> From: solr-user@lucene.apache.org At: 12/28/17 15:18:22To:
> solr-user@lucene.apache.org
> Subject: Re: SOLR 7.2 and LTR
>
> Do you have the ltr qparser plugin registered into the solrconfig?
>
> Can you check what happens if instead of ltr you use the rerank query
> plugin? does it work or you get the same error?
> https://lucene.apache.org/solr/guide/6_6/query-re-ranking.html
>
>
> From: solr-user@lucene.apache.org At: 12/28/17 13:58:26To:
> solr-user@lucene.apache.org
> Subject: Re: SOLR 7.2 and LTR
>
> Hello Diego,
>
> solr.log contains always the same single stacktrace in SOLR 7.2.
> I've been trying to pass rq via solrconfig.xml and via HTTP form.
> The /searchIncidents handler contains edismax query.
> Works if I completely disable rq. When I add the rq param, even something
> like:
>{!ltr reRankDocs=25 model=incidentModel}
> I get the exception.
> The model is there, it's LinearModel model simplified to contain only
> single feature 'originalScore', defined as in all available examples.
> I just copy the same config directory under 'server\solr' to SOLR 7.0 and
> it works.
> I only skip the 'data' subfolder because of index differences, wen copying.
>
> 2017-12-28 13:51:08.141 DEBUG (qtp205125520-18) [   x:entityindex]
> o.a.s.c.S.Request [entityindex]  webapp=/solr path=/searchIncidents
> params={personalId=1234567890=Test={!ltr+
> reRankDocs%3D25+model%3DincidentModel}}
> 2017-12-28 13:51:08.145 ERROR (qtp205125520-18) [   x:entityindex]
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: rq
> parameter must be a RankQuery
> at
> org.apache.solr.handler.component.QueryComponent.
> prepare(QueryComponent.java:183)
> at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(
> SearchHandler.java:276)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(
> RequestHandlerBase.java:177)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDispatchFilter.java:382)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDispatchFilter.java:326)
> at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> doFilter(ServletHandler.java:1751)
> at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(
> ScopedHandler.java:143)
> at
> org.eclipse.jetty.security.SecurityHandler.handle(
> SecurityHandler.java:548)
> at
> org.eclipse.jetty.server.session.SessionHandler.
> doHandle(SessionHandler.java:226)
> at
> org.eclipse.jetty.server.handler.ContextHandler.
> doHandle(ContextHandler.java:1180)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(
> ServletHandler.java:512)
> at
> org.eclipse.jetty.server.session.SessionHandler.
> doScope(SessionHandler.java:185)
> at
> org.eclipse.jetty.server.handler.ContextHandler.
> doScope(ContextHandler.java:1112)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(
> ScopedHandler.java:141)
> at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
> ContextHandlerCollection.java:213)
> at
> org.eclipse.jetty.server.handler.HandlerCollection.
> handle(HandlerCollection.java:119)
> at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> HandlerWrapper.java:134)
> at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
> RewriteHandler.java:335)
> at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> HandlerWrapper.java:134)
> a

Re: SOLR 7.2 and LTR

2018-01-02 Thread Dariusz Wojtas
I have created issue SOLR-11809 (
https://issues.apache.org/jira/browse/SOLR-11809) in JIRA and uploaded a
minimal working configuration that shows the problem. I hope this will make
it easier to verify and find some solution.

Best regards,
Dariusz Wojtas

On Fri, Dec 29, 2017 at 11:35 AM, Dariusz Wojtas <dwoj...@gmail.com> wrote:

> I have declared the rerank query parser and executed it.
> Works under 7.1, but does not work under 7.2. The same copied config file.
> Under 7.2 I receive the same exception "rq parameter must be a RankQuery"
> as for ltr.
>
> And I am sure I've declared it correctly, because in 7.1 it even
> complained if I missed to pass it's rerank query. Worked if the query was
> passed.
> With 7.2 it does not come to this point, it does not understand what
> rerank is and throws the exception above.
>
> Best regards,
> Dariusz Wojtas
>
>
> On Fri, Dec 29, 2017 at 10:57 AM, Diego Ceccarelli (BLOOMBERG/ LONDON) <
> dceccarel...@bloomberg.net> wrote:
>
>> Dariusz, does the rerank query work?
>>
>> From: solr-user@lucene.apache.org At: 12/28/17 22:25:28To:
>> solr-user@lucene.apache.org
>> Subject: Re: SOLR 7.2 and LTR
>>
>> Yes, this could be SOLR-11501.
>> But from the description in the ticket I see no option to run LTR, unless
>> I
>> am missing something.
>>
>> I have the ltr queryParser registered. I believe it is declared correctly,
>> works with 7.0.0.
>> I have just double checked with different SOLR versions, copying exactly
>> the same config directory to each 'server/solr' directory.
>> * SOLR 7.0.0 - works
>> * SOLR 7.1.0 - works
>> * SOLR 7.2.0 - does not work, exception as previously described.
>>
>> I have tried to run it with
>> *  'luceneMatchVersion' => 7.0.0, 7.1.0 and 7.2.0. It does not change
>> anything.
>> * *,_query_ defined in initParams
>> * defType=ltr, but then the main query, which is of type edismax, cannot
>> be
>> instantiated because of NPE
>>
>> Any Hint how to use LTR with 7.2?
>>
>>
>> Best regards,
>> Dariusz Wojtas
>>
>>
>> On Thu, Dec 28, 2017 at 6:11 PM, Christine Poerschke (BLOOMBERG/ LONDON) <
>> cpoersc...@bloomberg.net> wrote:
>>
>> > From a (very) quick look it seems like the https://issues.apache.org/
>> > jira/browse/SOLR-11501 upgrade notes might be relevant, potentially.
>> >
>> > From: solr-user@lucene.apache.org At: 12/28/17 15:18:22To:
>> > solr-user@lucene.apache.org
>> > Subject: Re: SOLR 7.2 and LTR
>> >
>> > Do you have the ltr qparser plugin registered into the solrconfig?
>> >
>> > Can you check what happens if instead of ltr you use the rerank query
>> > plugin? Does it work, or do you get the same error?
>> > https://lucene.apache.org/solr/guide/6_6/query-re-ranking.html
>> >
>> >
>> > From: solr-user@lucene.apache.org At: 12/28/17 13:58:26 To:
>> > solr-user@lucene.apache.org
>> > Subject: Re: SOLR 7.2 and LTR
>> >
>> > Hello Diego,
>> >
>> > solr.log always contains the same single stacktrace in SOLR 7.2.
>> > I've been trying to pass rq via solrconfig.xml and via HTTP form.
>> > The /searchIncidents handler contains edismax query.
>> > Works if I completely disable rq. When I add the rq param, even something
>> > like:
>> >{!ltr reRankDocs=25 model=incidentModel}
>> > I get the exception.
>> > The model is there; it's a LinearModel simplified to contain only a
>> > single feature, 'originalScore', defined as in all available examples.
>> > I just copy the same config directory under 'server\solr' to SOLR 7.0 and
>> > it works.
>> > I only skip the 'data' subfolder when copying, because of index differences.
>> >
>> > 2017-12-28 13:51:08.141 DEBUG (qtp205125520-18) [   x:entityindex]
>> > o.a.s.c.S.Request [entityindex]  webapp=/solr path=/searchIncidents
>> > params={personalId=1234567890=Test={!ltr+
>> > reRankDocs%3D25+model%3DincidentModel}}
>> > 2017-12-28 13:51:08.145 ERROR (qtp205125520-18) [   x:entityindex]
>> > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: rq
>> > parameter must be a RankQuery
>> > at
>> > org.apache.solr.handler.component.QueryComponent.
>> > prepare(QueryComponent.java:183)
>> > at
>> > org.apache.solr.handler.component.SearchHandler.handleRequestBody(
>> > SearchHandler.java:276)
>> > at
>> > org.apache.solr

Re: LTR and features operating on children doc data

2018-01-22 Thread Dariusz Wojtas
No answers so far, but I have found a workaround that fits my needs. Maybe
it will help somebody in the future.
The solution is transparent to the client system.

Use the XSLT response writer:
 a. create a stylesheet that understands the query response XML format
 b. let's assume the query returns 3 top level documents, with identifiers
1, 2, 3.
 c. iterate over them and prepare URI to call for processing children
documents.
 The result should be something like:

  /select
   ?q=_root_:(1 OR 2 OR 3)
   =type:name
   =
   =id,parentId:_root_,childScore:product(0.15,
strdist("",name_fullName,edit))

I aliased the magic _root_ field as 'parentId'.
The _root_ field had to be changed to stored="true" in the schema,
otherwise the value would not be returned (by default it is an index-only field).
I have also aliased the function result with the 'childScore' name.
You may need to escape special characters within the hand-crafted URI.
If so, use Saxon, which gives you access to the XSLT 2.0 encode-for-uri()
function; this lets you encode parts of the URI.

 d. use the document() XSLT function for an HTTP subrequest from within the
XSLT transformation.
 Happily, document() understands the 'http' protocol.
 All child docs are processed with a single sub-call within the main call!

 e. now process the main response with XSLT, update the score with the
results from the childDoc document.

 f. you may do whatever you want: add extra fields, update score, replace
code values with full text.
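
For illustration only, steps c and e above can be sketched outside XSLT. This
is a minimal Python sketch, not the stylesheet I actually use. The parameter
names 'fq' and 'fl' are assumptions (the archive dropped the names from the
URL above), and the helper names build_child_request and best_child_scores
are made up for this example.

```python
from urllib.parse import urlencode


def build_child_request(parent_ids, search_term, weight=0.15):
    """Build the /select sub-request that scores all children of the given parents.

    'fq' and 'fl' are assumed parameter names; search_term is inserted as-is,
    so a term containing double quotes would need escaping first.
    """
    score_fn = 'product(%s,strdist("%s",name_fullName,edit))' % (weight, search_term)
    params = [
        ("q", "_root_:(" + " OR ".join(parent_ids) + ")"),
        ("fq", "type:name"),
        ("fl", "id,parentId:_root_,childScore:" + score_fn),
    ]
    # urlencode plays the role of encode-for-uri() in the XSLT version
    return "/select?" + urlencode(params)


def best_child_scores(child_docs):
    """Step e: keep only the highest childScore seen for each parentId."""
    best = {}
    for doc in child_docs:
        pid = doc["parentId"]
        best[pid] = max(best.get(pid, 0.0), doc["childScore"])
    return best


url = build_child_request(["1", "2", "3"], "Chris Kowalski")
scores = best_child_scores([
    {"parentId": "1", "childScore": 0.05},
    {"parentId": "1", "childScore": 0.12},
    {"parentId": "2", "childScore": 0.03},
])
```

The max-per-parent dictionary is what the stylesheet then uses to update the
score of each top level document.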

Works nicely even with 50 top level documents, each having 1-10 children.
Sub-second response time to the client system, unnoticeable by the end user.

I really miss a way to call some aggregate function - in my case max() - on
a range of child documents from within an LTR feature.
But XSLT with Saxon has proven its value ;)
I could put it in the wiki, with some more details, if somebody told me how
to do it.


Best regards,
Dariusz Wojtas


On Sun, Jan 21, 2018 at 12:35 PM, Dariusz Wojtas <dwoj...@gmail.com> wrote:

> Dear Solr Masters,
>
> I am using the LTR functionality with Solr, and it works beautifully.
> I have a nice catch-all query at the beginning, then I am recalculating
> the score with LTR.
> And I have already learned some nice tricks, but there is something that I
> still cannot do.
> I need to create LTR model features that operate on child document
> properties.
>
> Part of my doc structure:
> 
> b0001
> record
> male
> Krzysztof Kowalski
> 
> d1e12
> name
> alias
> Chris Kowalski
> 
> 
> d1e18
> name
> spelling
> Krzysiek Kowalski
> 
> 
>
>
>
> What is pretty easy:
> 1. return strdist() between the given ${fullNameParam} and field 'fullName'
> {
> "store": "myStore",
> "name": "scoreFullName",
> "class": "org.apache.solr.ltr.feature.SolrFeature",
> "params":{ "q": "{!func}strdist(\"${fullNameParam}\",fullName,edit)"
> }
> }
>
> 2. I may also execute conditional evaluation, but only on top level
> document attributes.
>
>
> But what I do not know how to achieve is:
> *  return max(strdist()) of the given $fullNameParam against field
> 'name_fullName' in all children documents (type='name')*
>
> I need max() because there may be several children documents, I only need
> to find the top matching one.
> No, synonyms do not fit here. The example with names above is only a part
> of my data, there are other cases.
>
>
> Can this be done? How?
> I am returning the top level documents, where type='record', but I do not
> know how to evaluate their children, where there may be many
> child documents.
> I've tried parent and child block join, with no luck.
>
> Best regards,
> Dariusz Wojtas
>


Parameter expansion and functions

2018-03-13 Thread Dariusz Wojtas
Hi,

I am using SOLR 7.1.
Some time ago I read the article
http://yonik.com/solr-query-parameter-substitution/
It is very helpful; the trick with a dummy parameter is a useful workaround
in some cases:
  ${dummyParam:${localVariable}}

It is possible to use constructs like this in solrconfig.xml:

10
 ${dummyParam:${varA}} and some other content

It resolves as expected. This way I may calculate some parameters on the
fly.

But is there any chance to make this feature more complex?
Can I do something conditionally inside?
Can I use logical operators, or some other functions to construct local
parameters?
I know, one may say it may break security, etc.
It could be turned off by default and require local parameters to be
allowed explicitly by name. But can it be done?

My current need is to define the boosted query on the fly, but this feature
would have helped me in the past as well. Example:

I have static 'q' param:
{!edismax
  qf='idx_IDs^5 idx_names^3 idx_address^1'
  v=$searchedTerms}

Sometimes it is longer.
And I'd like to decide on the fly which fields should be included in the
query, as well as their weights.
It works if I want to assign a parametrized weight:
{!edismax
  qf='idx_IDs^5 idx_names^3
idx_address^${dummyParam:${myWeightParam}}'
  v=$searchedTerms}

But can I make it more dynamic? Conditionals? I tried with the {!switch
case default... v... } but it looks like this function expansion is
applied at some later stage, and it fails when I try to put it in the "q"
definition.
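
To make the goal concrete: this is the kind of logic I would like to express,
shown here as a client-side Python sketch rather than anything SOLR does
today. The helper names build_qf and build_q are hypothetical, and dropping
fields whose weight is 0 or missing is my own assumption about the desired
behaviour.

```python
def build_qf(weights):
    """Return an edismax qf string, skipping fields with no positive weight."""
    parts = []
    for field, weight in weights.items():
        if weight is None or weight <= 0:
            continue  # the field is left out of the query entirely
        parts.append("%s^%g" % (field, weight))
    return " ".join(parts)


def build_q(weights, terms_param="searchedTerms"):
    """Wrap the qf string in the same local-params query as the static 'q'."""
    return "{!edismax qf='%s' v=$%s}" % (build_qf(weights), terms_param)


q = build_q({"idx_IDs": 5, "idx_names": 3, "idx_address": 0})
```

Here idx_address has weight 0, so it does not appear in the generated query
at all.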


Best regards,
Dariusz Wojtas


Boosting with 0 factor

2018-03-13 Thread Dariusz Wojtas
Hi,

I have a question about boosting queries with ^0

I am using LTR. In the 1st step I want to narrow the query, but limit
'noise' results as much as possible. The 1st step is defined as follows:
{!edismax qf='keyword_id^10 keyword_nonid^2
keyword_lastNames^2 keyword_names^1' v=$searchedTerms}

And I have lots of such boosted fields, but want to narrow them dynamically.
I may pass boost factors as local params, with ${boostVal}. This part works.
If I set the boost to 0, as in the example below for field 'keyword_id', will
SOLR still execute the query on that field, or skip it because a boost factor
of 0 contributes nothing to the score?
In short: will it bring performance savings, or not?

{!edismax qf='keyword_id^0 keyword_nonid^2
keyword_lastNames^2 keyword_names^1' v=$searchedTerms}


Best regards,
Dariusz Wojtas


Empty XML output from SOLR streaming expression

2018-09-27 Thread Dariusz Wojtas
Hi,

I am working with SOLR 7.4.0 and use streaming expressions.
This works nicely, the result is produced in JSON format.
But I need to have it in XML.

Simplest query to show the problem:
search(myCollection,
  zkHost="localhost:9983",
  qt="/select",
  q="*:*",
  fl="id",
  sort="id desc")

Docs say that I should add 'wt=xml' to the parameters, which I add to the URL
created by SOLR -> Admin -> Stream console, but the result is not what I
expected.

*Case 1 - Stream query, no 'wt' param added to URL.*
The result is in JSON format and contains 3 rows, each with an id attribute.
The URL, produced by SOLR Admin -> Stream console:

http://localhost:8983/solr/myCollection/stream?_=1538060875379=search(myCollection,%0A++zkHost%3D%22localhost:9983%22,%0A++qt%3D%22%2Fselect%22,%0A++q%3D%22*:*%22,%0A++fl%3D%22id%22,%0A++sort%3D%22id+desc%22)


*Case 2 - the same URL, with '&wt=xml' added at the end*
I get an empty XML result set.
I am not sure if this will be correctly passed to the mailing list, but
the result is literally only a root 'response' element. No
contents inside.


If I add '=true' then the root element in the XML response contains a
valid explanation block, but no resulting documents.
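
To reproduce the two cases without the Admin console, the URLs can be built
as in the sketch below. It assumes the expression is passed in the 'expr'
parameter of the /stream handler and leaves out the cache-busting '_'
parameter that the console adds; the helper name stream_url is made up for
this example.

```python
from urllib.parse import urlencode

# The streaming expression from the example above
EXPR = '''search(myCollection,
  zkHost="localhost:9983",
  qt="/select",
  q="*:*",
  fl="id",
  sort="id desc")'''


def stream_url(base, collection, expr, wt=None):
    """Build a /stream request URL, optionally asking for a response writer."""
    params = {"expr": expr}
    if wt is not None:
        params["wt"] = wt
    return "%s/%s/stream?%s" % (base, collection, urlencode(params))


# Case 1: no 'wt' parameter; Case 2: the same URL with wt=xml appended
json_url = stream_url("http://localhost:8983/solr", "myCollection", EXPR)
xml_url = stream_url("http://localhost:8983/solr", "myCollection", EXPR, wt="xml")
```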

Any hint how to bring XML response to life?

Best regards,
Darek


Accessing multiValued field from within custom function

2019-01-03 Thread Dariusz Wojtas
Hi,

I am using SOLR 7.5 in cloud mode.
I want to create a custom function similar to 'strdist' that works on
multivalued fields (multiValued=true) and finds the highest matching score.
Yes, I know the potential performance issues, but in my use case this would
bring a huge benefit.
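
The semantics I am after, sketched in plain Python outside SOLR. The
normalisation 1 - distance / length of the longer string is my assumption
about how the 'edit' measure of strdist behaves, and the function names are
made up for this example.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


def edit_similarity(a, b):
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def max_strdist(query, values):
    """Highest similarity between query and any value of a multivalued field."""
    return max((edit_similarity(query, v) for v in values), default=0.0)


best = max_strdist("Chris Kowalski",
                   ["Krzysztof Kowalski", "Chris Kowalski", "Krzysiek Kowalski"])
```

Inside SOLR the list of values would have to come from the multiValued field
of the current document, which is exactly the part I do not know how to do.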

There is not much information on how to work with multiValued fields, but I
have found a piece of code that might be useful. It's how SOLR standard
functions are registered:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/ValueSourceParser.java

The interesting part for me starts at line 424, where the 'field' function
is registered.
It optionally accepts a multivalue field for min/max calculation.
If the 2nd argument is 'min' or 'max' it tries to resolve the field as a
SchemaField:
  SchemaField f = fp.getReq().getSchema().getField(fieldName);

Now the questions are:
1. Is this the path I should follow? If not - are there any other ways?
2. How to retrieve all the actual *String* or *Text* values from a
multivalue field, not just a single value? Some kind of a table or set of
values. How?
3. Does cloud mode change anything here? In my case the whole index is on a
single machine, but there are several replicas.

Best regards,
Dariusz Wojtas


Re: Accessing multiValued field from within custom function

2019-02-07 Thread Dariusz Wojtas
Hi,

Any hints on this topic?
How can I access String / Text values from a multiValued field inside a
custom function?

Best regards,
Dariusz Wojtas

On Thu, Jan 3, 2019 at 6:18 PM Dariusz Wojtas wrote:

> Hi,
>
> I am using SOLR 7.5 in the cloud mode.
> I want to create a custom function similar to 'strdist' that works on
> multivalued fields (multiValued=true) and finds the highest matching score.
> Yes, I know the potential performance issues, but in my usecase this would
> bring a huge benefit.
>
> There is not much information on how to work with multiValued fields, but
> I have found a piece of code that might be useful. It's how SOLR standard
> functions are registered:
>
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/ValueSourceParser.java
>
> The interesting part for me starts in line 424, when the 'field' function
> is registered.
> It optionally accepts a multivalue field for min/max calculation.
> If the 2nd argument is 'min' or 'max' it tries to resolve the field as
> SchemaField.
>   SchemaField f = fp.getReq().getSchema().getField(fieldName);
>
> Now the questions are:
> 1. Is this the path I should follow? If not - are there any other ways?
> 2. How to retrieve all the actual *String *or *Text *values from a
> multivalue field, not just a single value? Some kind of a table or set of
> values. How?
> 3. Does cloud mode change anything here? In my case the whole index is on
> a single machine, but there are several replicas.
>
> Best regards,
> Dariusz Wojtas
>
>