Hello all,
I don't know if this is a somewhat naive question, but here we go:
Does Lucene support indexing by sections? Like having a text document with
three sections divided by XML tags indexed in a way we could do a search
by work and section. Does Lucene itself support this kind of indexing or
If it's only about the search, you could have section as just another field in
your index. You could simply search on work as well as section.
Otherwise, if you are looking at aggregating category hits, then look at
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200605.mbox/[EMAIL
Yes, your application can do this using Lucene. Lucene is a low level
search enabling library, it is up to your application to give meaning
to what you put in it.
One way of doing what you want is to give each section its own Field for
any given document.
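To make the "one Field per section" idea concrete, here is a toy, plain-Java sketch of a fielded index. It is only an illustration and not Lucene's API: the class SectionIndexSketch, its methods, and the section names "abstract" and "body" are all made up for this example. The point is just that a term is looked up together with the field (section) it was indexed under.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for a fielded index; this is NOT Lucene's API.
class SectionIndexSketch {
    // Key is "field:term"; value is the ids of documents containing that term in that field.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index one section of a document under its own field name.
    void addSection(String docId, String field, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            postings.computeIfAbsent(field + ":" + token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Return the ids of documents whose given field contains the term.
    Set<String> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term.toLowerCase(), new TreeSet<>());
    }

    public static void main(String[] args) {
        SectionIndexSketch index = new SectionIndexSketch();
        // Hypothetical documents, each split into named sections.
        index.addSection("doc1", "abstract", "Lucene is a search library");
        index.addSection("doc1", "body", "Fields give meaning to content");
        index.addSection("doc2", "body", "Lucene stores terms per field");

        // The same word gives different hits depending on the section searched.
        System.out.println(index.search("body", "lucene"));
        System.out.println(index.search("abstract", "lucene"));
    }
}
```

In a real application the sections would be added as separate Fields on one Lucene Document, and the search would name the field in the query.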
Cheers,
Grant
On Nov 13, 2007, at
We've run into a blocking problem with our use of Lucene: we get
OutOfMemoryError when performing a one-term search in our index. The
search, if completed, should give only a few thousand hits, but from
inspecting a heap dump it appears that many more documents in the index
get stored in Lucene
On Tuesday, 13 November 2007, Lars Clausen wrote:
Can it be right that memory usage depends on size of the index rather
than size of the result?
Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to
the JVM now?
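As a rough illustration of why the interval matters: as far as I understand it, a reader holds every N-th term of the term dictionary in memory, where N is the term index interval (128 by default, I believe). The sketch below is only back-of-the-envelope arithmetic; the class name and the term count are hypothetical and nothing here calls Lucene.

```java
// Back-of-the-envelope sketch; the term count below is hypothetical.
class TermIndexIntervalSketch {
    // Number of terms kept in RAM when every `interval`-th term is held in memory.
    static long termsHeldInMemory(long totalTerms, int interval) {
        return (totalTerms + interval - 1) / interval; // ceiling division
    }

    public static void main(String[] args) {
        long totalTerms = 50_000_000L; // hypothetical number of unique terms in the index
        for (int interval : new int[] {128, 512, 1024}) {
            System.out.println("interval " + interval + ": "
                    + termsHeldInMemory(totalTerms, interval) + " terms in memory");
        }
    }
}
```

A larger interval means fewer terms in memory but slower random access to terms, so it is a trade-off rather than a free win.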
Regards
Daniel
--
http://www.danielnaber.de
Hardy:
I'm certainly not an expert on ranking and scoring, but I've got to assume
that this approach influences scoring.
Another issue is how you indexed multiple values. If you took a hint from
the SynonymAnalyzer example in Lucene In Action, and indexed all the
substrings with an increment of
Hi,
On Tue, 2007-11-13 at 07:32 -0500, Grant Ingersoll wrote:
Yes, your application can do this using Lucene. Lucene is a low level
search enabling library, it is up to your application to give meaning
to what you put in it.
One way of doing what you want is to give each section its own
If you only have a maximum of a few sections, then indexing
as different fields should work fine. If you have a big upper limit
you might need to do something like index all the data in one field
with a special marker (e.g. $$$) between sections, then use
termdocs/termenum on the result set to
: Can it be right that memory usage depends on size of the index rather
: than size of the result?
:
: Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to
: the JVM now?
and in general: yes. Lucene is using memory so that *lots* of searches
can be fast ... if you
On Nov 13, 2007, at 7:21 AM, Cláudio Fernandes wrote:
Hello all,
I don't know if this is a somewhat naive question, but here we go:
Does Lucene support indexing by sections? Like having a text document with
three sections divided by XML tags indexed in a way we could do a search
by work and
On Nov 13, 2007, at 11:59 AM, Steven D. Majewski wrote:
Lucene is great at finding documents, but not quite as good at finding
things IN documents. The index contains pointers to the terms, but they are
pointers to a token in the parsed token stream, so to find a character index
into a
We have seen similar exceptions (with Lucene 2.2) when we were making the
following mistakes:
1) Not closing the old searchers and re-creating a new one for every
new search (we fixed it by closing the searcher every time; if you want,
you could use only one searcher instance as well)
2) Not having any jvm
vivek sar [EMAIL PROTECTED] wrote:
I think if the indexer is abruptly stopped while it's in progress,
index corruption can happen.
One correction here: as far as I know, the index should not become
corrupt if the JVM is kill -9'd or the JVM crashes. If that seems to be
happening then we need
Hi,
Are we close to releasing Lucene 2.3? Is it stable enough for production? I
thought it's supposed to be released in October.
Thanks,
--
View this message in context:
http://www.nabble.com/How%27s-2.3-doing--tf4802426.html#a13740560
Sent from the Lucene - Java Users mailing list archive at
testn wrote:
Hi,
Are we close to releasing Lucene 2.3? Is it stable enough for production? I
thought it's supposed to be released in October.
Thanks,
I think it's very close. There are a couple of outstanding issues: