Simple Similarity Implementation to Count the Number of Hits

2016-05-11 Thread Luís Filipe Nassif
Hi, In the past (Lucene 4) I tried to implement a simple Similarity that only counts the number of occurrences (term frequencies) in the documents, ignoring norms, doc frequencies, boosts... It worked for some queries, like term and wildcard queries, but not for others, like phrase and range

RE: Optimizing number of segments in lucene index (no writes/deletes, only reads)

2017-06-14 Thread Luís Filipe Nassif
In the past I tried IndexSearcher with an ExecutorService to parallelize searches over multiple segments on an SSD. That was with Lucene 4.9. Unfortunately, the searches became slower with various numbers of threads in the pool, and much slower with 1 thread. There was some overhead with

Re: Get original DocValues from ICUCollationDocValuesField

2017-04-30 Thread Luís Filipe Nassif
Schindler <u...@thetaphi.de>: > Hi, > > No. Collation keys are a one-way function. You need to index it into 2 > different fields, once for sorting as collation key and once for faceting > or display. > > Uwe > > > On 30 April 2017 at 22:29:23 CEST, "Lu
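
The "one-way function" point in the reply above can be illustrated with the JDK's own java.text.Collator. This is only a sketch: it does not use Lucene or ICU, and the class name CollationKeyDemo and the helper sameKey are hypothetical. With a case-insensitive (PRIMARY strength) collator, two different strings can map to the same collation key, so the original text cannot be recovered from the key alone.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

public class CollationKeyDemo {

    // Builds a collator that ignores case differences (PRIMARY strength).
    // Assumption: English rules from the JDK, not the ICU collator used by Lucene.
    static Collator primaryCollator() {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        return collator;
    }

    // Returns true when both strings produce collation keys that compare as equal.
    static boolean sameKey(String a, String b) {
        Collator collator = primaryCollator();
        CollationKey keyA = collator.getCollationKey(a);
        CollationKey keyB = collator.getCollationKey(b);
        return keyA.compareTo(keyB) == 0;
    }

    public static void main(String[] args) {
        // Case variants collapse to one key, so the key cannot tell them apart.
        System.out.println("lucene vs LUCENE share a key: " + sameKey("lucene", "LUCENE"));
        System.out.println("lucene vs solr share a key:   " + sameKey("lucene", "solr"));
    }
}
```

Because several originals can share one key, keeping the display value in a second field, as the reply suggests, is the only way to get the original string back.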

Re: Get original DocValues from ICUCollationDocValuesField

2017-04-30 Thread Luís Filipe Nassif
A related question: is it possible to do faceting on a SortedDocValuesField using collation rules? Or is faceting always case sensitive? Thanks in advance, Luis 2017-04-30 12:35 GMT-03:00 Luís Filipe Nassif <lfcnas...@gmail.com>: > Hi Lucene community! > > I can successfully g

Get original DocValues from ICUCollationDocValuesField

2017-04-30 Thread Luís Filipe Nassif
Hi Lucene community! I can successfully get original doc values from fields indexed with SortedDocValues with code like: BytesRef bref = atomicReader.getSortedDocValues(field).get(doc); String value = bref.utf8ToString(); But as I need to use locale sorting, I use ICUCollationDocValuesField for

Re: N-dimensional Point Indexing

2018-02-06 Thread Luís Filipe Nassif
Is it limited to 8 dimensions, as described at https://www.elastic.co/blog/lucene-points-6.0? 2018-02-06 15:35 GMT-02:00 Luís Filipe Nassif <lfcnas...@gmail.com>: > Sorry, I was looking at the wrong place. Should I use BinaryPoint ( > https://lucene.apache.org/core/6_0_0/cor

Re: N-dimensional Point Indexing

2018-02-06 Thread Luís Filipe Nassif
Sorry, I was looking at the wrong place. Should I use BinaryPoint (https://lucene.apache.org/core/6_0_0/core/org/apache/lucene/document/BinaryPoint.html)? 2018-02-06 14:17 GMT-02:00 Luís Filipe Nassif <lfcnas...@gmail.com>: > Hi all, > > Is Lucene able to index generic n-dim

N-dimensional Point Indexing

2018-02-06 Thread Luís Filipe Nassif
Hi all, Is Lucene able to index generic n-dimensional points for efficient similarity or nearest neighbors search? I have looked at the spatial package in the past, but it seems to be specific to geo points. The use case is to index image feature vectors to search for similar images in a corpus.
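
For reference, the use case in the question (finding the stored feature vector closest to a query vector) can be sketched as a brute-force scan in plain Java. This is a baseline under stated assumptions, not a Lucene API: the class NearestVector and its methods are hypothetical, vectors are assumed to have equal length, and squared Euclidean distance is an assumed similarity measure. An index-backed search would aim to avoid scanning every vector as this does.

```java
public class NearestVector {

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same number of dimensions");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Returns the index of the corpus vector closest to the query, or -1 for an empty corpus.
    static int nearest(float[][] corpus, float[] query) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < corpus.length; i++) {
            double distance = squaredDistance(corpus[i], query);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up 2-dimensional feature vectors, for illustration only.
        float[][] corpus = { {0f, 0f}, {1f, 1f}, {5f, 5f} };
        float[] query = {0.9f, 1.2f};
        System.out.println("nearest index: " + nearest(corpus, query));
    }
}
```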

Re: N-dimensional Point Indexing

2018-02-26 Thread Luís Filipe Nassif
Hi Lucene community, Is BinaryPoint limited to 8 dimensions? Thanks, Luis On Feb 6, 2018 at 16:07, "Luís Filipe Nassif" <lfcnas...@gmail.com> wrote: Is it limited to 8 dimensions, as described at https://www.elastic.co/blog/lucene-points-6.0? 2018-02-06 15:35 GMT-0

Re: N-dimensional Point Indexing

2018-02-26 Thread Luís Filipe Nassif
Thank you, Adrien. On Feb 26, 2018 at 21:19, "Adrien Grand" <jpou...@gmail.com> wrote: > Yes it is. > > On Tue, Feb 27, 2018 at 00:03, Luís Filipe Nassif <lfcnas...@gmail.com> wrote: > >> Hi Lucene community, >> >> Is Binary

Updating specific fields of huge docs

2019-02-13 Thread Luís Filipe Nassif
Hi all, As I understand it, Lucene 7 still deletes and re-adds docs when an update operation is done. When docs have dozens of fields, one of which is large text content (extracted by Tika), and I need to update some other small fields, what is the best approach to avoid reindexing that large text

Re: Updating specific fields of huge docs

2019-02-14 Thread Luís Filipe Nassif
, you might find some of the > streaming capabilities useful for join kinds of operations if other > join options don't work out or you just prefer the streaming > alternative. > > Best, > Erick > > On Wed, Feb 13, 2019 at 11:43 AM Luís Filipe Nassif > wrote: > > > >

How to change sorting *after* getting search results

2021-11-30 Thread Luís Filipe Nassif
Hi Lucene community, Our users can run very heavy searches and are able to change the sorting criteria multiple times after getting the results. We collect all of the results, which is important for our use case, and disable scoring if the result size is too large, to make the search faster. Currently
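
The idea of changing the sort order after the hits are already collected can be sketched in plain Java. This is an illustration under assumptions, not the approach used or recommended in the thread: the class ResortHits and the method sortBy are hypothetical, and the sketch assumes each sort field has been loaded once into an in-memory array indexed by doc ID, so re-sorting needs no new search.

```java
import java.util.Arrays;
import java.util.Comparator;

public class ResortHits {

    // Returns a new array of the collected doc IDs, ordered by sortValues[docId].
    // sortValues is a hypothetical per-field column held in memory, indexed by doc ID.
    // Ties are broken by ascending doc ID so the order is deterministic.
    static int[] sortBy(int[] docIds, long[] sortValues, boolean descending) {
        Comparator<Integer> byValue = Comparator.comparingLong(id -> sortValues[id]);
        if (descending) {
            byValue = byValue.reversed();
        }
        Comparator<Integer> order = byValue.thenComparing(Comparator.<Integer>naturalOrder());

        Integer[] boxed = Arrays.stream(docIds).boxed().toArray(Integer[]::new);
        Arrays.sort(boxed, order);
        return Arrays.stream(boxed).mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // Doc IDs collected by one (expensive) search; values are made up for illustration.
        int[] hits = {3, 0, 2};
        long[] fileSize = {50, 10, 40, 20}; // fileSize[docId]

        System.out.println(Arrays.toString(sortBy(hits, fileSize, false)));
        System.out.println(Arrays.toString(sortBy(hits, fileSize, true)));
    }
}
```

The search runs once; each change of sort criteria only reorders the already collected doc IDs against a different column.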

Re: How to change sorting *after* getting search results

2021-11-30 Thread Luís Filipe Nassif
th one of the various > Rescorers. Have you looked at those? > > On Tue, Nov 30, 2021, 9:15 AM Luís Filipe Nassif > wrote: > >> Hi Lucene community, >> >> Our users could do very heavy searches and they are able to change the >> sorting criteria multiple times after g