Hello all,
Is there any way to get all tokens from my index? Could anyone
please suggest how?
--
View this message in context:
http://www.nabble.com/get-all-tokens-from-index-tp25359411p25359411.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Hello all, is there any way to get all
tokens from my index? Could anyone
please suggest how?
The code below prints all terms of a field.
String path = "E:\\ThesaurusSolrHome\\data\\index";
String field = "contents";
IndexReader indexReader = IndexReader.open(path);
Hi
Thanks for your response. Here is the test case:
public class UnderwriterReferenceTest {
private Directory directory;
private Analyzer analyzer;
private IndexSearcher indexSearcher;
private IndexWriter indexWriter;
private Document layerDocumentA;
@Before
Thanks, Ahmet, I found the solution. Thanks a lot.
Ahmet Arslan wrote:
Hello all, is there any way to get all
tokens from my index? Could anyone
please suggest how?
The code below prints all terms of a field.
String path = "E:\\ThesaurusSolrHome\\data\\index";
String field =
Thanks for your input, Mark and Chris. I will take it all into account.
Chris
- Original Message -
From: Mark Miller markrmil...@gmail.com
Sent: Tue, 8/9/2009 8:06pm
To: java-user@lucene.apache.org
Subject: Re: New Stream closed exception with Java 6
Chris Hostetter wrote:
: I'm coming to
I've been testing 2.9 RC2 lately and comparing query performance to 2.3.2.
I'm seeing a huge increase in throughput (2x-10x) on an index that was built
with 2.3.2. The queries have a lot of BoostingTermQuerys and boolean clauses
containing a custom scorer. Using JProfiler, I observe that the
On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan peterlkee...@gmail.com wrote:
Using JProfiler, I observe that the improvement
is due to a huge reduction in the number of calls to TermDocs.next and
TermDocs.skipTo (about 65% fewer calls).
Indexes are searched per-segment now (i.e. MultiTermDocs
On Wed, Sep 9, 2009 at 9:17 AM, Yonik Seeley
yonik.see...@lucidimagination.com wrote:
On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan peterlkee...@gmail.com wrote:
Using JProfiler, I observe that the improvement
is due to a huge reduction in the number of calls to TermDocs.next and
TermDocs.skipTo
It's all in the analyzers. Depending upon which analyzer you use, many things
happen to the input stream. Casing is one example, but that's just
the simplest. Which is why it's so important to use the same analyzer
when indexing and querying unless you have a *very* good reason not to.
I'd really
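To illustrate the point about using the same analyzer at index and query time, here is a minimal sketch in plain Java. It does not use Lucene: the `analyze` helper is a hypothetical stand-in for an analyzer that splits on whitespace and optionally lowercases, and a `Set` stands in for the index.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Toy model of index-time vs. query-time analysis (not Lucene code). */
final class AnalyzerMismatchSketch {

    /** Hypothetical analyzer: whitespace tokenizer with optional lowercasing. */
    static Set<String> analyze(String text, boolean lowercase) {
        Set<String> tokens = new HashSet<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(lowercase ? token.toLowerCase(Locale.ROOT) : token);
            }
        }
        return tokens;
    }

    /** A query matches only if every analyzed query token was indexed. */
    static boolean matches(Set<String> indexed, String query, boolean lowercaseQuery) {
        return indexed.containsAll(analyze(query, lowercaseQuery));
    }

    public static void main(String[] args) {
        // Index time: the lowercasing analyzer is used.
        Set<String> indexed = analyze("Apache Lucene Index", true);

        // Same analysis at query time: the term is found.
        System.out.println(matches(indexed, "Lucene", true));
        // Different analysis at query time: "Lucene" != "lucene", no match.
        System.out.println(matches(indexed, "Lucene", false));
    }
}
```

The second lookup fails only because the query text was analyzed differently from the indexed text, which is the mismatch described above.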
IndexSearcher.search is calling my custom scorer's 'next' and 'doc' methods
64% fewer times. I see no 'advance' method in any of the hot spots. I am
getting the same number of hits from the custom scorer.
Has the BooleanScorer2 logic changed?
Peter
On Wed, Sep 9, 2009 at 9:17 AM, Yonik Seeley
How about the new in-order/out-of-order scoring stuff? It was an option
before, but I think it now uses what's best by default? And pairs with
the collector? I didn't follow any of that closely though.
- Mark
Peter Keegan wrote:
IndexSearcher.search is calling my custom scorer's 'next' and 'doc'
Right, BooleanQuery will now try to use BooleanScorer (does out of
order collection, which does not use skipTo/advance at all, I think)
when possible, instead of BooleanScorer2.
This only applies for boolean queries that have only SHOULD clauses,
and up to 32 MUST_NOT clauses (if there's even 1
Is it possible that skipTo is very costly with your custom scorer?
It's no more expensive than 'next'. The scorer's 'skipTo' and 'next' methods
call termdocs.skipTo or termdocs.next to get the next 'candidate' doc. This
just checks a BitVector to find the next non-deleted doc. But the scorer
http://svn.apache.org/viewvc?view=rev&revision=630698
This may be it. The scorer is sparse and usually in a conjunction with a
dense scorer.
Does the index format matter? I haven't yet built it with 2.9.
Peter
On Wed, Sep 9, 2009 at 10:17 AM, Yonik Seeley yo...@lucidimagination.com wrote:
On
Hello,
How is the DGaps value in a *.del file calculated?
For example, if there are 8000 bits and only bits 10, 12 and 32 are set,
DGaps would be used:
(VInt) 1, (byte) 20, (VInt) 3, (byte) 1
I do not understand why the DGaps values are 1 and 3.
Please tell
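One way to read the example: the bits are packed eight per byte, only the non-zero bytes are written, and each VInt is the distance in byte positions from the previously written byte (starting from 0). Bits 10 and 12 both fall in byte 1 (value 4 + 16 = 20), and bit 32 falls in byte 4 (value 1), so the gaps are 1 - 0 = 1 and 4 - 1 = 3. The sketch below reproduces that arithmetic in plain Java. It is not Lucene code, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Reproduces the d-gaps arithmetic from the example (not Lucene code). */
final class DGapsSketch {

    /** Returns alternating (gap, byte value) pairs, one pair per non-zero byte. */
    static List<Integer> encode(int sizeInBits, int... setBits) {
        byte[] bytes = new byte[(sizeInBits + 7) / 8];
        for (int bit : setBits) {
            // Bit n lives in byte n / 8, at position n % 8 within that byte.
            bytes[bit >> 3] |= 1 << (bit & 7);
        }
        List<Integer> out = new ArrayList<>();
        int last = 0; // index of the previously written non-zero byte
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] != 0) {
                out.add(i - last);        // the gap, written as a VInt
                out.add(bytes[i] & 0xFF); // the byte itself
                last = i;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 8000 bits with bits 10, 12 and 32 set.
        System.out.println(encode(8000, 10, 12, 32)); // prints [1, 20, 3, 1]
    }
}
```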
Hello Lucene users,
On behalf of the Lucene dev community (a growing community far larger
than just the committers) I would like to announce the third release
candidate for Lucene 2.9.
Please download and check it out – take it for a spin and kick
Has anyone done anything regarding the support of PayloadTermQuery in
MoreLikeThis?
I took a quick look at the code and it seems to be simply a matter of
swapping TermQuery with PayloadTermQuery. I guess a generic solution would
be to add an enable method to enable PayloadTermQuery, keeping
Looking for some help figuring out a problem with the IndexReader.isCurrent()
method and cached indexes.
We have a number of lucene indexes that we attempt to keep in memory after an
initial query is performed. In order to prevent the indexes from becoming
stale, we check for changes about
I'm using Lucene 2.3.2 and have a date field used for sorting, which is
MMDDHHMM. I get an exception when the FieldCache is being generated as follows:
java.lang.NumberFormatException: For input string: 190400-412317
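The exception itself can be reproduced outside Lucene. The sketch below assumes (this is not confirmed in the message) that the sort was treating the field as an integer, so each term was handed to something like `Integer.parseInt`. The value from the stack trace contains a hyphen in the middle, which no integer parser accepts. The second value, `09091230`, is a made-up example of a well-formed all-digit term.

```java
/** Checks whether a term would survive integer parsing (not Lucene code). */
final class SortKeyParseSketch {

    static boolean parsesAsInt(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The term reported in the exception: a hyphen mid-string is rejected.
        System.out.println(parsesAsInt("190400-412317")); // prints false
        // A hypothetical all-digit term parses fine.
        System.out.println(parsesAsInt("09091230"));      // prints true
    }
}
```

If that assumption holds, a single malformed term in the field is enough to make the whole cache build fail, so it may be worth checking how that term got into the index.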
Antony Bowesman wrote:
I'm using Lucene 2.3.2 and have a date field used for sorting, which
is MMDDHHMM. I get an exception when the FieldCache is being
generated as follows:
java.lang.NumberFormatException: For input string: 190400-412317