## 3 December 2019, Apache Lucene™ 8.3.1 available
The Lucene PMC is pleased to announce the release of Apache Lucene 8.3.1.
Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for
nearly any application that
Background: I need to implement document indexing and search for
POIs (points of interest) in an LBS scenario. A POI has a name, an address, and
a location (LatLonPoint), and I want to combine a text query with a
geo-spatial 2D range filter.
The problem is, when I first build a native in-memory index which
IDF is a simple measure to calculate. So, if building a separate index for
each user is not an ideal solution, then I suggest you could try to
calculate these statistics upfront. Just maintain these statistics for each
user, then use them in the query process.
At search time, you use these
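
A minimal sketch of the "maintain these statistics for each user" idea, in plain Java with no Lucene dependency. The class name `PerUserIdf` and its methods are hypothetical, not part of any Lucene API. The formula `1 + ln((docCount + 1) / (docFreq + 1))` follows the classic TF-IDF definition used by Lucene's `ClassicSimilarity`; how the resulting value would be fed back into scoring is left open here and would depend on the application.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper that keeps term statistics separately for each user. */
final class PerUserIdf {
    // user -> number of documents indexed for that user
    private final Map<String, Integer> docCount = new HashMap<>();
    // user -> (term -> number of that user's documents containing the term)
    private final Map<String, Map<String, Integer>> docFreq = new HashMap<>();

    /** Records one document for a user; pass each distinct term once. */
    void addDocument(String user, Set<String> distinctTerms) {
        docCount.merge(user, 1, Integer::sum);
        Map<String, Integer> df = docFreq.computeIfAbsent(user, u -> new HashMap<>());
        for (String term : distinctTerms) {
            df.merge(term, 1, Integer::sum);
        }
    }

    /** Classic IDF computed only from the given user's documents. */
    double idf(String user, String term) {
        int n = docCount.getOrDefault(user, 0);
        int df = docFreq
                .getOrDefault(user, Collections.<String, Integer>emptyMap())
                .getOrDefault(term, 0);
        return 1.0 + Math.log((n + 1) / (double) (df + 1));
    }
}
```

With statistics kept this way, the same term can have a different IDF for each user sharing the index, which is the behaviour asked about in the thread.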
>
> it is enough to give each its own field.
>
I kind of over-simplified the problem at hand. Apologies.
DOC_TYPE is just one aspect of the problem. The other is that it is
actually a shared index with multiple users (100-3000 users per
index). There are many hundreds of such
Hi Ravi,
Can you give more details on how you store an entity in Lucene? What is a doc
type?
What fields do you have?
Cheers
From: java-user@lucene.apache.org At: 12/03/19 12:50:40 To:
java-user@lucene.apache.org
Subject: Multi-IDF for a single term possible?
Hello,
We are using TF-IDF
it is enough to give each its own field.
On Tue, Dec 3, 2019 at 7:57 AM Adrien Grand wrote:
> Is there any reason why you are not storing each DOC_TYPE in its own index?
>
> On Tue, Dec 3, 2019 at 1:50 PM Ravikumar Govindarajan
> wrote:
> >
> > Hello,
> >
> > We are using TF-IDF for scoring
Is there any reason why you are not storing each DOC_TYPE in its own index?
On Tue, Dec 3, 2019 at 1:50 PM Ravikumar Govindarajan
wrote:
>
> Hello,
>
> We are using TF-IDF for scoring (Yet to migrate to BM25). Different
> entities (DOC_TYPES) are crunched & stored together in a single index.
>
>
Hello,
We are using TF-IDF for scoring (Yet to migrate to BM25). Different
entities (DOC_TYPES) are crunched & stored together in a single index.
When it comes to IDF, I find that there is a single value computed across
documents & stored as part of TermStats, whereas our documents are not