500 times the original data? Not true! :)
Otis
--- Xiaohong Yang (Sharon) [EMAIL PROTECTED] wrote:
Hi,
I agree that Google Mini is quite expensive. It might be similar to
the desktop version in quality. Does anyone know Google's ratio of index
to text? Is it true that Lucene's index is
I don't think there is a direct way to get the number of (unique) terms
in the index, so yes, I think you'll have to loop through TermEnum and
count.
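
A minimal sketch of that counting loop follows. It assumes the enumeration behaves like Lucene's TermEnum, where next() advances and returns false once the terms are exhausted. TermCursor and ListTermCursor are hypothetical stand-ins so the example compiles without Lucene on the classpath; against a real index the enumeration would come from IndexReader.terms().

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the part of the TermEnum contract used here:
// next() moves to the next term and returns false when none are left.
interface TermCursor {
    boolean next();
    void close();
}

// Hypothetical in-memory cursor so the sketch runs without an index.
final class ListTermCursor implements TermCursor {
    private final Iterator<String> it;

    ListTermCursor(List<String> terms) {
        this.it = terms.iterator();
    }

    public boolean next() {
        if (it.hasNext()) {
            it.next();
            return true;
        }
        return false;
    }

    public void close() {
        // Nothing to release for the in-memory stand-in.
    }
}

final class TermCounter {
    // Walks the whole enumeration once and returns how many terms it saw.
    static int countTerms(TermCursor terms) {
        int count = 0;
        try {
            while (terms.next()) {
                count++;
            }
        } finally {
            terms.close();
        }
        return count;
    }
}
```

Note that the enumeration covers every field, so the same text in two different fields is counted as two terms.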
Otis
--- Jonathan Lasko [EMAIL PROTECTED] wrote:
I'm looking for the total number of unique terms in the index. I see
that I can get a
Edwin,
--- Edwin Tang [EMAIL PROTECTED] wrote:
I have three indices really that I search via ParallelMultiSearcher.
All three
are being updated constantly. We would like to be able to perform a
search on
the indices and have the results reflect the latest documents
indexed. However,
that
Morus,
that description of 3 sets of index files is what I was imagining, too.
I'll have to test and add to the book errata, it seems.
Thanks for the info,
Otis
--- Morus Walter [EMAIL PROTECTED] wrote:
Otis Gospodnetic writes:
Hello,
Yes, that is how optimize works - copies all
Is Lucene-in-Action being sold anywhere in Singapore?
thanks!
Otis Gospodnetic [EMAIL PROTECTED] wrote: Gospodnetić
sounds like Gospodnetich and Eric is Erik :)
Otis
--- John Haxby wrote:
Otis Gospodnetic wrote:
I contacted both the US and UK Amazon sites and asked them
Adam,
Dawid posted some code that lets you use Carrot2 locally with Lucene,
without the componentized pipeline system described on the Carrot2 site.
Otis
--- Adam Saltiel [EMAIL PROTECTED] wrote:
David, Hi,
Would you be able to comment on the coincidentally recent thread RE: -
Grouping Search
If you are not married to Java:
http://search.cpan.org/~kilinrax/HTML-Strip-1.04/Strip.pm
Otis
--- sergiu gordea [EMAIL PROTECTED] wrote:
Karl Koch wrote:
I am in control of the html, which means it is well-formatted HTML. I
use
only HTML files which I have transformed from XML. No
Using different analyzers for indexing and searching is not
recommended.
Your numbers are not even in the index because you are using
StandardAnalyzer. Use Luke to look at your index.
Otis
--- Hetan Shah [EMAIL PROTECTED] wrote:
Hello,
How can one search for a document based on the query
Get and try Lucene 1.4.3. One of the older versions had a bug that was
not deleting old index files.
Otis
--- [EMAIL PROTECTED] wrote:
Hi,
When I run an optimize in our production environment, old index files are
left in the directory and are not deleted.
My understanding is that an
The QueryParser is analyzing your Field.Keyword (genre field) fields,
because it doesn't know that genre is a Keyword field and should not be
analyzed.
Check section 4.4. here:
http://www.lucenebook.com/search?query=queryparser+keyword
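
To illustrate why an analyzed query can miss a keyword term, here is a rough sketch. The analyze method is a hypothetical, much-simplified analyzer (lowercase, split on non-alphanumerics), not Lucene's actual implementation, and the genre values are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class KeywordMismatchSketch {
    // Hypothetical simplified analyzer: lowercases and splits on anything
    // that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // A keyword field is stored in the index as one untouched term.
    static String indexAsKeyword(String value) {
        return value;
    }

    // True only if some analyzed query token equals the stored term exactly.
    static boolean analyzedQueryMatches(String storedKeyword, String queryText) {
        return analyze(queryText).contains(indexAsKeyword(storedKeyword));
    }
}
```

With a stored genre of "Sci-Fi", the analyzed query yields the tokens "sci" and "fi", neither of which equals the stored term, so nothing matches. One way around this is to build a TermQuery for the keyword field directly instead of passing it through the parser.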
Otis
--- Mike Rose [EMAIL PROTECTED] wrote:
Perhaps
Hi,
lucene.apache.org seems to work now.
Here is the query syntax:
http://lucene.apache.org/queryparsersyntax.html
[] is used as [BEGIN-RANGE-STRING TO END-RANGE-STRING]
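
As a rough illustration of what the square-bracket form means, the sketch below checks whether a term falls inside an inclusive range using plain string comparison. RangeSketch and inInclusiveRange are hypothetical names for this example only; this is not the query parser's code.

```java
final class RangeSketch {
    // Inclusive on both ends, compared as strings (lexicographic order),
    // mirroring the [BEGIN-RANGE-STRING TO END-RANGE-STRING] form.
    static boolean inInclusiveRange(String term, String begin, String end) {
        return term.compareTo(begin) >= 0 && term.compareTo(end) <= 0;
    }
}
```

Because the comparison is on strings, values such as dates or numbers only behave as expected when they are written with a fixed width, for example 20050215 rather than 2005215.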
Otis
--- Jim Lynch [EMAIL PROTECTED] wrote:
First I'm getting a
The requested URL could not be retrieved
Hi Paul,
If I understand your setup correctly, it looks like you are running
multiple threads that each create an IndexWriter for the same directory.
That's a no-no.
This section (first hit) describes the various concurrency issues with
regard to adds, updates, optimization, and searches:
The most obvious answer is that the full-text indexing features of
RDBMSs are not as good (as fast) as Lucene. MySQL, PostgreSQL,
Oracle, MS SQL Server etc. all have full-text indexing/searching
features, but I always hear people complaining about the speed. A
person from a well-known online
Or you could just open a new IndexSearcher, forget the old one, and
have GC collect it when everyone is done with it.
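
One way to picture the swap is a small holder that hands out the current searcher and atomically replaces it. SearcherHolder is a hypothetical helper written for this sketch, and the type parameter T stands in for a searcher type such as IndexSearcher so the example does not depend on Lucene.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder; T stands in for a searcher type such as IndexSearcher.
final class SearcherHolder<T> {
    private final AtomicReference<T> current;

    SearcherHolder(T initial) {
        current = new AtomicReference<T>(initial);
    }

    // Callers grab the current searcher at the start of each search.
    T get() {
        return current.get();
    }

    // Publishes the fresh searcher and returns the one it replaced.
    // The old instance is not closed here; searches already using it
    // can finish, and it becomes unreachable once they drop it.
    T swap(T fresh) {
        return current.getAndSet(fresh);
    }
}
```

Searches that fetched the old instance before the swap keep using it until they finish, while every later call to get() sees the new one.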
Otis
--- Chris Lamprecht [EMAIL PROTECTED] wrote:
I should have mentioned, the reason for not doing this the obvious,
simple way (just close the Searcher and reopen it if a
Matt,
Erik and I have some code for this in Lucene in Action, but David
Spencer did this since the book was published:
http://www.lucenebook.com/blog/announcements/more_like_this.html
Otis
--- Matt Chaput [EMAIL PROTECTED] wrote:
Is there a simple, efficient way to compute similarity of
this leave open file handles? I had a problem where there
were lots of open file handles for deleted index files, because the
old searchers were not being closed.
On Fri, 18 Feb 2005 13:41:37 -0800 (PST), Otis Gospodnetic
[EMAIL PROTECTED] wrote:
Or you could just open a new IndexSearcher
Make sure you are not indexing your documents using the compound index
format (default in the newer versions of Lucene). Then you will see
the .frq file. Here is an example from one of Simpy's Lucene indices:
-rw-r--r-- 1 simpy simpy 629073 Feb 26 13:14 _1ao.frq
Otis
--
Ben,
You do need to use a separate instance of those 3 classes for each
index, yes. But this is really something like:
IndexWriter writer = new IndexWriter();
So it's the normal code-writing process; you don't really have to create
anything new, just use the existing Lucene API. As for locking,