This parameter refers to the Solr request, for example:
https://lucene.apache.org/solr/guide/7_0/result-grouping.html#grouping-by-query
Drupal should expose it in the API, I guess?
Cheers,
diego
From: solr-user@lucene.apache.org At: 12/02/19 14:47:06 To:
solr-user@lucene.apache.org
Hi Ashis,
Short answer: No, I don't think it's possible.
I'm also considering extending Solr to allow plugging in features from
outside, but it will take time because at the moment the features can
see only the document currently being processed, while to do that ideally you
want to process in one
Hi Kamal,
You can use a MinMaxNormalizer[1], and get min and max from historical data
for the original score. It won't guarantee that the value will **always** be
between 0..1 but it should happen in the majority of the cases. If the 0..1
constraint is not super strong I would rather use a
Another way to make queries faster is, if you can, identify a subset of
documents that are in general relevant for the users (most recent ones,
most browsed etc etc), index those documents into a separate collection and
then query the small collection and fall back to the full one if the small
one
If you want a 'global' IDF across different fields, maybe one solution is to
use a copyfield to copy all the fields in a common field (e.g., title, authors,
body, footer all copied into a copyfield called text), and then you should be
able to use it with a function query or by implementing your
Hi all,
I just noticed this and I just wanted to share with you:
Full-text search is everywhere nowadays and FOSDEM 2019 will have a dedicated
devroom for search on Sunday the 3rd of February.
We would like to invite submissions of presentations from developers,
researchers, and users of
relevance.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> >
> > On Thu, Sep 27, 2018 at 1:39 PM Diego Ceccarelli (BLOOMBERG/ LONDON) <
> > dceccarel...@bloomberg.net> wrote:
> >
> > > Yeah, I think Kmeans might be a way to implement the "
> threshold.
I would allow defining the strategy and selecting it from the request.
From: solr-user@lucene.apache.org At: 09/27/18 18:25:43 To: Diego Ceccarelli
(BLOOMBERG/ LONDON ) , solr-user@lucene.apache.org
Subject: Re: solr and diversification
I've thought about this problem a littl
Hi,
I'm considering writing a component for diversifying the results. I know that
diversification can be achieved by using grouping but I'm thinking about
something different and query biased.
The idea is to have something that gets applied after the normal retrieval and
selects the top k
Hi Akshay,
Did you run Solr with learning to rank enabled?
./bin/solr -e techproducts -Dsolr.ltr.enabled=true
If you don't pass -Dsolr.ltr.enabled=true, LTR will not be available.
Cheers,
Diego
From: solr-user@lucene.apache.org At: 07/16/18 09:00:39 To:
solr-user@lucene.apache.org
Subject: Re:
Hello ilayaraja,
I think it would be good to move this discussion to the Jira item:
https://issues.apache.org/jira/browse/SOLR-8776?attachmentOrder=asc
You can add your comments there; on that page I also explained how it works.
On the performance you are right: at the moment it is slow.
Hello,
I'm not 100% sure but I think that if you have multiple shards the number of
docs matched in each group is *not* guaranteed to be exact. Increasing the rows
will increase the amount of partial information that each shard sends to the
federator and make the number more precise.
For
Thanks ilayaraja,
I updated the PR today integrating your and Alan's comments. Now it works
also in distributed mode. Please let me know what you think :)
Cheers
Diego
On Wed, May 2, 2018, 17:46 ilayaraja wrote:
> Figured out that offset is used as part of the grouping
I just updated the PR to upstream - I still have to fix some things in
distributed mode, but unit tests in non-distributed mode work.
Hope this helps,
Diego
From: solr-user@lucene.apache.org At: 04/15/18 03:37:54 To:
solr-user@lucene.apache.org
Subject: Re: Learning to Rank (LTR) with
Patch has not been merged yet, it is available here:
https://github.com/apache/lucene-solr/pull/162
You can try to apply the patch on the current master and see if it fixes the
problem.
Please let us know if you have any questions.
Cheers,
Diego
From: solr-user@lucene.apache.org At: 04/05/18
I don't think you can define a docTransformer in the SolrConfig at the moment, I
agree it would be a cool feature.
Maybe one possibility could be to use the update request processors [1], and
precompute the fields at index time. It would be more expensive in disk space
and indexing time, but then it
Hi Rick,
I don't think the issue is BM25 vs TFIDF (the old similarity), it seems more
due to the "matching" logic.
You are asking to match:
"(Action AND Technical AND Temporaries AND t/a AND CTR AND Corporation)"
This (in theory) means that you want to retrieve **only** the documents that
A similar problem came up with learning to rank models, and was fixed by
https://issues.apache.org/jira/browse/SOLR-11250
Maybe it can be useful.
From: solr-user@lucene.apache.org At: 02/26/18 13:13:28 To:
solr-user@lucene.apache.org
Subject: FileDictionaryFactory:- pick source file from
Hi all,
We would like to perform a benchmark of
https://issues.apache.org/jira/browse/SOLR-11831
The patch improves the performance of grouped queries asking only for one
result per group (aka. group.limit=1).
I remember seeing a page showing a benchmark of the query performance on
Wikipedia,
ant -Dtests.slow=false
From: solr-user@lucene.apache.org At: 02/02/18 17:07:14 To:
solr-user@lucene.apache.org
Subject: skip slow tests?
Hi *,
Some (slow) tests in Solr are annotated with @Slow. Is there a way to run ant
test skipping them?
thanks,
Diego
Hi Luigi, I don't know that part of Lucene well, I would check blog posts and
the code to understand if you can use NumericDocValues (my gut says yes).
Also, I don't know if it is important, but please note that if you index all
the documents at the beginning your scores will be different -
Hi Luigi,
What about using an updatable DocValue [1] for the field x? You could
initially set it to -1,
and then update it for the docs in step j. Range queries should still work
and the update should be fast.
Cheers
[1]
I think it really depends on the particular use case. Sometimes the absolute
score is a good feature, sometimes not.
If you are using the default bm25, I think that increasing the number of terms
in the query will increase the average doc. score in the results. So maybe I
would normalize the
In theory it should be possible if you are indexing the positions of the tokens
in your field,
but I am not aware of any Solr query that allows you to weight the matches
based on the position. Does anyone know if it is possible?
From: solr-user@lucene.apache.org At: 01/29/18 11:25:36 To:
Hi Zahid, if you want to allow searching only if the query is shorter than a
certain number of terms / characters, I would probably do it before calling
Solr; otherwise you could write a QueryParserPlugin (see [1]) and check
that the query is sound before processing it.
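
For the first option (checking before calling Solr), here is a minimal sketch
of what the check could look like in the calling code. The function name and
the default limits are made up for illustration, and whitespace splitting is
only a rough approximation of how an analyzer would count terms:

```python
def is_query_allowed(query: str, max_terms: int = 10, max_chars: int = 200) -> bool:
    """Return True if the raw user query is short enough to be sent to Solr.

    Hypothetical helper: the limits are illustrative defaults, not Solr settings.
    """
    stripped = query.strip()
    # Reject empty queries and queries that exceed the character limit.
    if not stripped or len(stripped) > max_chars:
        return False
    # Whitespace tokenization is a rough proxy for the number of query terms.
    return len(stripped.split()) <= max_terms
```

The caller would run this on the user input and skip the Solr request (or
return an error to the user) when it returns False.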
See also:
And you want to show to the users only the Lucene documents that matched the
original query sent to Solr? (what if a lucene document matches only part of
the query?)
From: solr-user@lucene.apache.org At: 01/23/18 13:55:46 To: Diego Ceccarelli
(BLOOMBERG/ LONDON ) , solr-user
Rahul, can you provide more details on how you decide that the smaller lucene
objects are part of the same solr document?
From: solr-user@lucene.apache.org At: 01/23/18 09:59:17 To:
solr-user@lucene.apache.org
Subject: Re: Using lucene to post-process Solr query results
Hi Rahul,
Looks like
Hi Fiz,
It is not possible at the moment, you will have to log the queries (from Solr,
or before you send them) and use external tools to do that.
There is a jira item on that if you are interested:
https://issues.apache.org/jira/browse/SOLR-10359
Diego
From: solr-user@lucene.apache.org At:
Hi Dariusz,
On Jan 12, 2018 14:40, "Dariusz Wojtas" wrote:
Hi,
I am working with the LTR rescoring.
Works beautifully, but I am curious about something.
How do I specify the feature store in a way different than using the
[features] syntax?
[features
natives would be helpful. Do I read your
>> response as needing to go to 7.0 when you say upstream?
>>
>> Thank you,
>> Roopa
>>
>>
>> On Tue, Dec 19, 2017 at 1:37 PM, Diego Ceccarelli <
>> diego.ceccare...@gmail.com> wrote:
>>
>>
I'm assuming that you are writing the cosine similarity and you have two
vectors containing the pairs . The two vectors could have
different sizes because they only contain the terms that have tfidf != 0.
if you want to compute cosine similarity between the two lists you just have
Maybe I misunderstood the question, but why do you need to create the
full-size vectors? Can't you just compute the cosine using the sparse
vectors?
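
A minimal sketch of what I mean, assuming each sparse vector is represented
as a dict mapping a term to its non-zero tf-idf weight (the representation
and the function name are my assumptions, not something from your code):

```python
import math


def sparse_cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    # Iterate over the smaller vector: only shared terms contribute to the dot product.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    # An empty or all-zero vector has no direction; treat the similarity as 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Terms missing from one of the two vectors contribute zero to the dot product,
so the vectors never need to be expanded to the full vocabulary size.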
On Fri, Jan 5, 2018 at 10:09 PM, marco wrote:
> At the moment I have another problem: is there an efficient way to calculate
From: solr-user@lucene.apache.org At: 01/05/18 15:35:46 To:
solr-user@lucene.apache.org
Subject: Re: Personalized search parameters
In particular we have to retrieve the documents with a normal search
followed by a result reranking phase where we calculate the cosine
similarity between the
Why do you want the personalization to happen in Similarity?
Similarity will score all the docs matching your query, so it has to be really
fast. Unless your personalization is very easy (e.g., tf/idf computed in a
different way based on the user) I would not put it there.
Did you consider
> at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.
> executeProduceConsume(ExecuteProduceConsume.java:303)
> at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.
> produceConsume(ExecuteProduceConsume.java:148)
> at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(
> Exec
)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Unknown Source)
Best regards,
Dariusz Wojtas
On Thu, Dec 28, 2017 at 1:03 PM, Diego Ceccarelli (BLOOMBERG/ LONDON) <
dceccarel...@bloomberg.net> wrote:
> Hello Dariusz,
>
> Can you look into the solr l
Hello Dariusz,
Can you look into the solr logs for a stack trace or ERROR logs?
From: solr-user@lucene.apache.org At: 12/27/17 19:01:29 To:
solr-user@lucene.apache.org
Subject: SOLR 7.2 and LTR
Hi,
I am using SOLR 7.0 and use the ltr parser.
The configuration I use works nicely under SOLR
:
>>
>> Hi Diego,
>>
>> Thank you,
>>
>> I am interested in reranking the documents inside one of the groups.
>>
>> I will try the options you mentioned here.
>>
>> Thank you,
>> Roopa
>>
>> On Mon, Dec 11, 2017 at 6:
Instead of putting this into Solr, did you consider adding this logic
into the service that will call Solr?
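
A minimal sketch of that idea, assuming the service knows the role of the
caller and restricts the returned fields through Solr's fl parameter. The
role names, field names and function name below are hypothetical:

```python
# Hypothetical mapping from a user role to the fields that role may see.
FIELDS_BY_ROLE = {
    "public": ["id", "title"],
    "staff": ["id", "title", "body", "internal_notes"],
}


def build_solr_params(user_query: str, role: str) -> dict:
    """Build the request parameters, limiting fl to the fields allowed for the role."""
    # Unknown roles fall back to the most restrictive field list.
    allowed = FIELDS_BY_ROLE.get(role, FIELDS_BY_ROLE["public"])
    return {"q": user_query, "fl": ",".join(allowed)}
```

The resulting dict can be passed as query parameters to whatever HTTP client
the service already uses to call the select handler.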
On Tue, Dec 19, 2017 at 4:41 PM, Solrmails wrote:
> Thank you for your answer. I'd like to restrict the returned fields
> dynamically based on a permission
If you need to return only a subset of the fields for each request you can
set them as default in the solrconfig.xml.
On Dec 19, 2017 13:45, "Solrmails" wrote:
> I found a solution: I created a custom Search Handler and overridden
> 'handleRequestBody'. Then I modify
Hi Tomerg,
1. Did you consider using the collapse component?
https://lucene.apache.org/solr/guide/6_6/collapse-and-expand-results.html
it is compatible with rq.
2. If you implement group reranking as a separate component you will
end up with a lot of code duplicated from QueryComponent, you
<roop...@gmail.com> wrote:
> Hi Diego,
>
> Thank you, I will look into this and see how I could patch this.
>
> Thank you for your quick response,
> Roopa
>
>
> On Fri, Dec 8, 2017 at 5:44 PM, Diego Ceccarelli <
> diego.ceccare...@gmail.com> wrote:
>
Hi Roopa,
LTR is implemented using RankQuery, and at the moment grouping doesn't
support RankQuery.
I opened a Jira item some time ago
(https://issues.apache.org/jira/browse/SOLR-8776) and I would be happy
to receive feedback on that. You can find the code here
Hello, I have a use case where I need to dedupe documents in each group based
on a particular field:
example:
doc1 = { field_a=1 field_b=2 }
doc2 = { field_a=1 field_b=2 }
doc3 = { field_a=1 field_b=3 }
doc4 = { field_a=2 field_b=3 }
doc5 = { field_a=2 field_b=3 }
and I want to run "Group
Hello Ilay,
Answers in line:
On Sat, Nov 18, 2017 at 2:22 PM, ilay wrote:
>
> 1. Does LTR only support phrase matching (complete user query) from training
> data for extracting feature score:
> ex.
> efi.user_query='tv+stand' matches the title feature only if title contains
Hello isspek,
Unfortunately no, it would be nice to patch RankLib to output the model in
JSON.
FYI, I have a script to convert the XML into the JSON format:
https://github.com/bloomberg/lucene-solr/blob/ltr-demo-lucene-solr/py-solr-buzzwords/tree_model.py
Cheers,
Diego
From:
Hi all,
Yesterday Yahoo open sourced Vespa (i.e.: The open big data serving engine:
Store, search, rank and organize big data at user serving time.), looking at
the API they provide search.
I did a quick search on the code for lucene, getting only 5 results.
Does anyone know more about the
https://wiki.apache.org/solr/FAQ#How_can_I_delete_all_documents_from_my_index.3F
have a look also at the last post here: https://gist.github.com/nz/673027
I think there's a way to disallow delete by *:* in the solrconfig.xml but I
can't find it (I would take a look in the solrconfig just in
Hi Dariusz,
If you use *:* you'll rerank only the top N random documents; as Emir said,
that will probably not produce interesting results.
If you want to replace the original score, you can take a look at the learning
to rank module [1], that would allow you to reassign a new score to the top
> {
> "name" : "FeatureA",
> "store" : "commonFeatureStore",
> "class" : "org.apache.solr.ltr.feature.SolrFeature",
> "params" : {
> "q" : "{!func}if(gt(ms(CutoffDate,NOW),0),exists
Hello Lawrence,
Which type did you use in the solr schema for your fields?
Cheers,
Diego
On Tue, Aug 29, 2017 at 5:34 PM, Elitzer, Lawrence <
lelit...@lgsinnovations.com> wrote:
> Hello!
>
>
>
> It seems I can correctly import (with DIH) UTF-8 characters such as J but
> I am unable to search
Hi Brian,
The plugin doesn't allow you to express multiple function queries in the
same feature. Maybe in this case you can express both the two queries in one
unique function query, using the if function.
Something like:
"fq":"if(gt(ms(NOW,mydatefield),0),query(PreCutOffZones:${zone}), query(
Hi,
Sorry for the delay, here are my replies:
1. I'm not yet a Spark user (but I'm working on that :))
2. I'm not sure I understand how you would use a feature that is not a float
in a model; in my experience all the learning to rank methods always train and
predict from a list of floats.
Hi All,
At the moment RankQueries [1] are not supported when you perform grouping:
if you perform a ReRankQuery and ask for the groups, reranking will be ignored
in the scoring.
In SOLR-8776, I added support for ReRankQueries in grouping and I opened a PR
on github [2].
ReRankQueries are
Hi Jeffery,
I submitted a patch to the README of the learning to rank example folder,
trying to explain better how to produce a training set given a log with
interaction data.
Patch is available here: https://issues.apache.org/jira/browse/SOLR-9929
And you can see the new version of the
Hi David,
I implemented bm25f for Europeana on Solr 4.x a couple of years ago,
you can find it here:
https://github.com/europeana/contrib/tree/master/bm25f-ranking
maybe I should contribute it back..
Please do not hesitate to contact me if you need help :)
Cheers,
Diego
From:
Dear Ali,
I'm not sure I understand what you are trying to do, please correct me if I
misunderstood:
given a document indexed into lucene you want to retrieve the top-k terms
with highest tf-idf right?
Could you please post your code somewhere? I don't understand what mlt
is :)
Cheers,
Diego
PM, Diego Ceccarelli
diego.ceccare...@gmail.com wrote:
Hi Everyone,
I need to use a RankQuery within a grouping [1].
I did some experiments with RerankQuery [2] and solr 4.10.2 and it seems that
if you group on a field, the reranking query is completely ignored
(on the cloud
Hi Everyone,
I need to use a RankQuery within a grouping [1].
I did some experiments with RerankQuery [2] and solr 4.10.2 and it seems that
if you group on a field, the reranking query is completely ignored
(on the cloud, and on a single instance).
I would expect to see the results in each group