My learnings from dealing with this problem
We faced a similar problem before and did the following:
1) Don't request totalGroupCount, and the response was fast, as computing
the group count is an expensive task. This helps if you can live without
groupCount; even without the exact count you can still approximate
pagination up to the total.
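A minimal sketch of that first point, using Lucene's GroupingSearch; the field name "dedupKey" and the toy index are assumptions for illustration, not from the original thread:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.search.*;
import org.apache.lucene.search.grouping.GroupingSearch;
import org.apache.lucene.search.grouping.TopGroups;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

public class GroupingSketch {
  public static void main(String[] args) throws Exception {
    // Build a tiny in-memory index: 10 docs spread over 3 group keys.
    Directory dir = new ByteBuffersDirectory();
    IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
    for (int i = 0; i < 10; i++) {
      Document doc = new Document();
      // Grouping on a string field requires sorted doc values.
      doc.add(new SortedDocValuesField("dedupKey", new BytesRef("group" + (i % 3))));
      doc.add(new StringField("id", Integer.toString(i), Field.Store.YES));
      w.addDocument(doc);
    }
    w.close();

    IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
    GroupingSearch grouping = new GroupingSearch("dedupKey");
    grouping.setGroupDocsLimit(1);  // keep one doc per group = deduplication
    grouping.setAllGroups(false);   // skip the expensive total group count
    TopGroups<BytesRef> groups =
        grouping.search(searcher, new MatchAllDocsQuery(), 0, 1000);
    System.out.println(groups.groups.length);
  }
}
```

With setAllGroups(false), groups.totalGroupCount stays null, but the top groups themselves come back quickly; here the 10 docs collapse to 3 groups.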
As Erick said, can you tell us a bit more about the use case?
There might be another way to achieve the same result.
What are these documents?
Why do you need 1000 docs per user?
From: java-user@lucene.apache.org
At: 10/09/20 14:25:02
To: java-user@lucene.apache.org
Subject: Re:
Hi,
it seems I do not raise a lot of interest here... anyway, I will try again
with a simpler question.
Is MultiFieldQueryParser usable in 8.6.0?
thanks
Original message
From: Stephane Passignat
Reply to: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject:
6_500_000 is the total count of groups in the entire collection. I only
return the top 1000 to users.
I use Lucene with documents that can share the same docvalue, and I
want to deduplicate these documents by that docvalue during search.
Also, I sort my documents by multiple fields and
This is going to be fairly painful. You need to keep a list 6.5M
items long, sorted.
Before diving in there, I’d really back up and ask what the use-case
is. Returning 6.5M docs to a user is useless, so are you doing
some kind of analytics maybe? In which case, and again
assuming you’re using
I have 12_000_000 documents, 6_500_000 groups
With sort: It takes around 1 sec without grouping, 2 sec with grouping and
12 sec with setAllGroups(true)
Without sort: It takes around 0.2 sec without grouping, 0.6 sec with
grouping and 10 sec with setAllGroups(true)
Thank you, Erick, I will look
At the Solr level, see CollapsingQParserPlugin:
https://lucene.apache.org/solr/guide/8_6/collapse-and-expand-results.html
You could perhaps steal some ideas from that if you
need this at the Lucene level.
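For reference, the Solr-level collapse is expressed as a filter query; a minimal sketch, assuming a doc-values field named dedupKey (a hypothetical name):

```
fq={!collapse field=dedupKey}
expand=true
```

The optional expand=true returns the collapsed duplicates in a separate section of the response.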
Best,
Erick
> On Oct 9, 2020, at 7:25 AM, Diego Ceccarelli (BLOOMBERG/ LONDON)
> wrote:
How many documents in the collection, how many groups, and how long is it
taking to do the grouping vs no grouping?
Also, if you remove the custom sort is it still slow?
From: java-user@lucene.apache.org
At: 10/09/20 12:27:25
To: Diego Ceccarelli (BLOOMBERG/ LONDON)
Yes, it is
Fri, 9 Oct 2020 at 14:25, Diego Ceccarelli (BLOOMBERG/ LONDON) <
dceccarel...@bloomberg.net>:
> Is the field that you are using to dedupe stored as a docvalue?
>
> From: java-user@lucene.apache.org
> At: 10/09/20 12:18:04
> To: java-user@lucene.apache.org
> Subject: Deduplication of
Is the field that you are using to dedupe stored as a docvalue?
From: java-user@lucene.apache.org
At: 10/09/20 12:18:04
To: java-user@lucene.apache.org
Subject: Deduplication of search results with custom sort
Hi,
I need to deduplicate search results by specific field and I have no idea
how to implement this properly.
I have tried grouping with setGroupDocsLimit(1), and it gives me the
expected results, but the performance is not very good.
I think that I need something like DiversifiedTopDocsCollector, but
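For what it's worth, a hedged sketch of DiversifiedTopDocsCollector (from the lucene-misc module), which caps the number of hits per key; the numeric doc-values field name "dedupKey" and the toy index are assumptions, and note that this collector requires the dedup key as NumericDocValues:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.index.*;
import org.apache.lucene.search.*;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class DiversifySketch {
  public static void main(String[] args) throws Exception {
    // Toy index: 10 docs, 3 distinct numeric dedup keys.
    Directory dir = new ByteBuffersDirectory();
    IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
    for (int i = 0; i < 10; i++) {
      Document doc = new Document();
      doc.add(new NumericDocValuesField("dedupKey", i % 3));
      w.addDocument(doc);
    }
    w.close();

    IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
    // Collect up to 1000 hits, but keep at most 1 hit per key.
    DiversifiedTopDocsCollector collector = new DiversifiedTopDocsCollector(1000, 1) {
      @Override
      protected NumericDocValues getKeys(LeafReaderContext context) {
        try {
          return DocValues.getNumeric(context.reader(), "dedupKey");
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      }
    };
    searcher.search(new MatchAllDocsQuery(), collector);
    System.out.println(collector.topDocs().scoreDocs.length);
  }
}
```

Here the 10 matches are deduplicated down to 3, one per key; whether this beats grouping with setGroupDocsLimit(1) at 12M docs would need measuring.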