Would it make sense to create a separate Lucene module for ANN search?
we could then experiment with the different approaches and compare them
across the same benchmarks.
On Thu, 16 Jul 2020 at 23:14, Ali Akhtar wrote:
> I’m a bit of a layman in this area, but if we are talking about formats
hi Alex,
I worked on a similar problem directly on Lucene (within the Anserini
toolkit) using LSH fingerprints of tokenized feature vector values.
You can find code at [1] and some information on the Anserini documentation
page [2] and in a short preprint [3].
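To make the idea of "LSH fingerprints of tokenized feature vector values" concrete, here is a minimal plain-Python sketch. It is not the Anserini or Lucene code: the token format (`index_value`), the rounding precision, and the function names `tokenize_vector`, `minhash_fingerprint`, and `estimated_similarity` are all assumptions made for illustration, and MinHash is used here as one common LSH scheme over token sets.

```python
import hashlib

def tokenize_vector(vector, decimals=1):
    """Turn each feature into a text token 'index_value' (hypothetical format).

    Values are rounded so that nearby vectors tend to produce the same tokens.
    Adding 0.0 normalizes -0.0 to 0.0 before formatting.
    """
    return [f"{i}_{round(v, decimals) + 0.0:.{decimals}f}"
            for i, v in enumerate(vector)]

def _hash(token, seed):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.sha1(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def minhash_fingerprint(tokens, num_hashes=16):
    """MinHash signature: for each seeded hash, keep the minimum over all tokens."""
    return [min(_hash(t, seed) for t in tokens) for seed in range(num_hashes)]

def estimated_similarity(fp_a, fp_b):
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    matches = sum(1 for a, b in zip(fp_a, fp_b) if a == b)
    return matches / len(fp_a)

# Usage: two vectors that round to the same tokens get identical fingerprints.
fp1 = minhash_fingerprint(tokenize_vector([0.12, 0.48, 0.91]))
fp2 = minhash_fingerprint(tokenize_vector([0.11, 0.52, 0.93]))
print(estimated_similarity(fp1, fp2))
```

In a search engine the signature values would themselves be indexed as terms, so that candidate neighbours can be retrieved with an ordinary term query; that part is not shown here.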
As a side note my current thinking
PMC vote: option C (current)
On Wed, 17 Jun 2020 at 07:58, Ignacio Vera Sequeiros
wrote:
> PMC vote: option A
>
> On Wed, Jun 17, 2020 at 7:36 AM Jeroen Lauwers
> wrote:
>
> > A. Definitely.
> >
> > Sent from my phone
> >
> > > On 17 Jun 2020, at 03:46, Jason Gerlowski
> >
+1, some time ago I also used the decompounder mentioned by Dawid and was
satisfied back then.
Regards,
Tommaso
On Sat, 16 Sep 2017 at 09:29, Dawid Weiss
wrote:
> Hi Mike. Search lucene dev archives. I did write a decompounder with Daniel
> Naber. The
I think it'd be interesting to also investigate using TypeAttribute [1]
together with TypeTokenFilter [2].
Regards,
Tommaso
[1] :
https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/analysis/tokenattributes/TypeAttribute.html
[2] :
improved locality of "near" documents could be used to avoid loading some
segments during the retrieval phase for certain use cases (e.g. spatial
search).
On Wed, 16 Nov 2016 at 09:45, Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:
I think it might be helpful to handle POS tags as TypeAttributes so that
the input and output texts would be cleaner and you can still filter and
retrieve tokens by type (e.g. with TypeTokenFilter).
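As a rough illustration of keeping the POS tag next to the token instead of inside its text, here is a minimal plain-Python sketch. It does not use the Lucene API: `Token`, `tag_tokens`, `filter_by_type`, and the toy tag lookup are hypothetical stand-ins for a token's term, its type attribute, and a type-based filter that works in either keep or drop mode.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    term: str   # the token text, left untouched
    type: str   # the POS tag, stored alongside the text rather than inside it

def tag_tokens(text, pos_lookup, default_type="word"):
    """Split on whitespace and attach a type from a toy lookup table."""
    return [Token(w, pos_lookup.get(w.lower(), default_type))
            for w in text.split()]

def filter_by_type(tokens, types, keep=True):
    """Keep (keep=True) or drop (keep=False) the tokens whose type is in `types`."""
    types = set(types)
    return [t for t in tokens if (t.type in types) == keep]

# Usage: the text stays clean, and tokens can still be selected by POS tag.
lookup = {"dogs": "NNS", "chase": "VBP", "cats": "NNS"}  # hypothetical tags
tokens = tag_tokens("dogs chase cats", lookup)
print([t.term for t in filter_by_type(tokens, {"NNS"})])
```

The point of the design is that the term itself is never rewritten (no `dogs_NNS` style strings), so the indexed and returned text is unchanged while type-based filtering remains possible.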
My 2 cents,
Tommaso
On Wed, 19 Oct 2016 at 11:56, Niki Pavlopoulou
wrote
see simple one first. :-) Why don't we consider adding an Analyzer parameter
to assignClass()?
koji
(14/03/07 17:18), Tommaso Teofili wrote:
cool Koji, thanks a lot for sharing.
Some useful points / suggestions come out of it, let's see if we can follow
up :)
Regards,
Tommaso
2014-03
cool Koji, thanks a lot for sharing.
Some useful points / suggestions come out of it, let's see if we can follow
up :)
Regards,
Tommaso
2014-03-07 3:30 GMT+01:00 Koji Sekiguchi k...@r.email.ne.jp:
Hello,
I just posted an article on Comparing Document Classification Functions
of Lucene and
2013/5/29 Koji Sekiguchi k...@r.email.ne.jp
Hi Rajesh,
Thanks!
I'm planning to open an NLP tool kit for Lucene, and the tool kit will include
the following synonym library.
sounds nice, looking forward to it.
Tommaso
koji
(13/05/28 14:12), Rajesh Nikam wrote:
Hello Koji,
This
2013/1/15 VIGNESH S vigneshkln...@gmail.com
Hi All,
Thanks for your replies.
Actually, I am trying to classify email data into categories and also
detect spam. I have tried clustering, but it is not useful since we
cannot control the categories.
I am looking for a lightweight
Hi,
you can have a look at the (early stage) Lucene classification module on
trunk [1], see also a brief introduction given at last ApacheCon EU [2].
Hope this helps,
Tommaso
[1] :
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/classification/
[2] :
that's nice!
Tommaso
2012/11/19 Uwe Schindler u...@thetaphi.de
Lol!
Many thanks for this support!
Uwes
Otis Gospodnetic otis.gospodne...@gmail.com wrote:
Hi,
Quick announcement for Uwe Friends.
UweSays is now a super-duper-special query operator over on
Ok, that saves you from concurrency issues, but in my experience NFS is just
much slower than a local file system, so NFS can still be used, but with some
trade-off in performance.
My 2 cents,
Tommaso
2012/10/2 Jong Kim jong.luc...@gmail.com
The setup is I have a home-grown server process that has
2012/2/6 Ian Lea ian@gmail.com
Not sure if you got an answer to this or not. Don't recall seeing one
and gmail threading says not.
Is the use of payloads I've described appropriate?
Sounds OK to me, although I'm not sure why you can't store the
metadata as a Document Field.
Can I
[X] ASF Mirrors (linked in our release announcements or via the Lucene
website)
[X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
[] I/we build them from source via an SVN/Git checkout.
[] Other (someone in your company mirrors them internally or via a
downstream project)