Hi,
I googled it but could not find the jars for these classes. Can someone tell me
where to get the jars?
import org.apache.lucene.corpus.stats.IDFCalc;
import org.apache.lucene.corpus.stats.TFIDFPriorityQueue;
import org.apache.lucene.corpus.stats.TermIDF;
Thanks
On Thu, Feb 12, 2015 at 11:01 PM,
Based on reading the same comments you read, I'm pretty doubtful that
Codec.getDefault() is going to work. It seems to me that this
situation renders the FilterCodec a bit hard to use, at least given
the 'every release deprecates a codec' sort of pattern.
On Thu, Feb 12, 2015 at 3:20 AM, Uwe
Robert,
Let me lay out the scenario.
Hardware has .5T of Index is relatively small. Application profiling
shows a significant amount of time spent codec-ing.
Options as I see them:
1. Use DPF complete with the irritation of having to have this
spurious codec name in the on-disk format that has
WHOOPS.
First sentence was, until just before I clicked 'send',
Hardware has .5T of RAM. Index is relatively small (20g) ...
On Thu, Feb 12, 2015 at 4:51 PM, Benson Margulies ben...@basistech.com wrote:
Robert,
Let me lay out the scenario.
Hardware has .5T of Index is relatively small.
I think you can do it with 4 simple queries:
1) +flying +shooting
2) +flying +fighting
etc.
or BooleanQuery equivalents with MUST clauses. Use
org.apache.lucene.search.TotalHitCountCollector and it should be blazingly fast,
even if you have more than 100 docs.
--
Ian.
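To make the counting idea above concrete, here is a minimal plain-Java sketch. It is not Lucene code: it only mimics what a query with two required (MUST) terms counts, namely the number of documents containing both terms. The class and method names (CoOccurrenceCount, countBoth) and the sample documents are made up for illustration, and the whitespace tokenizer is an assumption standing in for a real analyzer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: counts documents containing both terms.
final class CoOccurrenceCount {

    // Split on whitespace and lowercase; a stand-in for a real analyzer.
    static Set<String> terms(String text) {
        String[] tokens = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        return new HashSet<>(Arrays.asList(tokens));
    }

    // Equivalent in spirit to counting hits for "+a +b": both terms are required.
    static int countBoth(List<String> docs, String a, String b) {
        String termA = a.toLowerCase(Locale.ROOT);
        String termB = b.toLowerCase(Locale.ROOT);
        int hits = 0;
        for (String doc : docs) {
            Set<String> docTerms = terms(doc);
            if (docTerms.contains(termA) && docTerms.contains(termB)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "flying and shooting at dawn",
                "flying while fighting",
                "shooting then flying again",
                "looking around");
        System.out.println(countBoth(docs, "Flying", "Shooting")); // prints 2
        System.out.println(countBoth(docs, "Flying", "fighting")); // prints 1
    }
}
```

In a real index the same numbers would come from running one required-terms query per keyword pair and reading the total hit count, without scoring or collecting the documents themselves.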
On Thu, Feb 12, 2015 at 5:42 PM,
Hi,
How about Codec.getDefault()? It does indeed not necessarily return the newest
one (if somebody changes the default using Codec.setDefault()), but for your
use case wrapping the current default one, it should be fine?
I have not tried this yet, but there might be a chicken-egg problem:
-
Honestly, I don't agree. I don't know what you are trying to do, but if
you want file format backwards compat working, then you need a
different FilterCodec to match each lucene codec.
Otherwise your codec is broken from a back compat standpoint. Wrapping
the latest is an antipattern here.
On
On Thu, Feb 12, 2015 at 8:51 AM, Benson Margulies ben...@basistech.com wrote:
On Thu, Feb 12, 2015 at 8:43 AM, Robert Muir rcm...@gmail.com wrote:
Honestly, I don't agree. I don't know what you are trying to do, but if
you want file format backwards compat working, then you need a
different
Hi,
FYI, this is the same issue that Locales have/had in ICU! If you try to render
an error message in Locale's constructors, this breaks with an NPE, because
the default Locale is not yet there... I think they implemented some fallback
that is guaranteed to be there.
But this would not help you,
On Thu, Feb 12, 2015 at 8:43 AM, Robert Muir rcm...@gmail.com wrote:
Honestly, I don't agree. I don't know what you are trying to do, but if
you want file format backwards compat working, then you need a
different FilterCodec to match each lucene codec.
Otherwise your codec is broken from a
Might also look at concordance code on LUCENE-5317 and here:
https://github.com/tballison/lucene-addons/tree/master/lucene-5317
Let me know if you have any questions.
-----Original Message-----
From: Maisnam Ns [mailto:maisnam...@gmail.com]
Sent: Thursday, February 12, 2015 11:57 AM
To:
On Thu, Feb 12, 2015 at 11:58 AM, McKinley, James T
james.mckin...@cengage.com wrote:
Hi Robert,
Thanks for responding to my message. Are you saying that you or others have
encountered problems running Lucene 4.8+ on the 64-bit Java SE 1.7 JVM with
G1 and was it on Windows or on Linux? If
I did something like this sometime back. The objective was to find patterns
surrounding some keywords of interest so I could find keywords similar to
the ones I was looking for, sort of like a poor man's word2vec. It uses
SpanQuery as Jigar said, and you can find the code here (I believe it was
Hi,
Can someone tell me whether this use case is possible with Lucene?
Use case: I have a string, say 'Japan', appearing in 10 documents, and I want
to get back some results which contain the two words before 'Japan' and the
two words after 'Japan', maybe something like this: ' Economy of Japan is
Hi Robert,
Thanks for responding to my message. Are you saying that you or others have
encountered problems running Lucene 4.8+ on the 64-bit Java SE 1.7 JVM with G1
and was it on Windows or on Linux? If so, where can I find out more? I only
looked into the one bug because that was the only
This concept is called Proximity Search in general.
In Lucene they are achieved using SpanQuery.
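As a rough illustration of the result being asked for (a keyword with two words on either side), here is a plain-Java sketch. It does not use SpanQuery or any Lucene API; it only shows the windowing logic on an in-memory string. The class and method names (KeywordContext, windows) are hypothetical, and it assumes whitespace-separated tokens, so a keyword with punctuation attached (for example "Japan,") would not match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: extracts words around each keyword hit.
final class KeywordContext {

    // Returns, for every occurrence of keyword, up to `width` words before and after it.
    static List<String> windows(String text, String keyword, int width) {
        String[] tokens = text.trim().split("\\s+");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equalsIgnoreCase(keyword)) {
                // Clamp the window to the bounds of the text.
                int from = Math.max(0, i - width);
                int to = Math.min(tokens.length, i + width + 1);
                result.add(String.join(" ", Arrays.copyOfRange(tokens, from, to)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "The economy of Japan is growing fast";
        // prints [economy of Japan is growing]
        System.out.println(windows(text, "Japan", 2));
    }
}
```

Against a real index, the positions of each match would come from the span or concordance code mentioned in this thread rather than from re-tokenizing the stored text like this.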
On Thu, Feb 12, 2015 at 10:10 PM, Maisnam Ns maisnam...@gmail.com wrote:
Hi,
Can someone tell me whether this use case is possible with Lucene?
Use case: I have a string, say 'Japan', appearing
Hi Shah,
Thanks for your reply. I will try to google SpanQuery; meanwhile, if you have
some links, can you please share them?
Thanks
On Thu, Feb 12, 2015 at 10:17 PM, Jigar Shah jigaronl...@gmail.com wrote:
This concept is called Proximity Search in general.
In Lucene they are achieved using
Hi Allison and Sujit,
Thanks so much for your links. I am so happy; these are exactly the
links that almost cover my use case.
Allison, sure will get back to you if I have some more questions.
Regards
NS
On Thu, Feb 12, 2015 at 10:49 PM, Sujit Pal sujit@comcast.net wrote:
I did
Hi,
Can someone help me with this use case?
Use case: Say there are 4 keywords, 'Flying', 'Shooting', 'fighting' and
'looking', to search for in 100 documents.
Consider that 'Flying' and 'Shooting' co-occur (together) in 70 documents,
whereas
'Flying' and 'fighting' co-occur in 14 documents