I am glad to introduce a new project on SourceForge that is related to Lucene.
Lucene Server is a Java server application for easily creating and managing
Jakarta Lucene indexes. It is designed to help you integrate Lucene into
distributed environments.
The first release 0.1 is available for
Hi all,
first, here's how to reproduce the problem:
Go to http://www.denic.de/en/special/index.jsp and enter obscure
service in the search field. You'll get 132 hits. Now enter obscure
service* - and you only get 1 hit.
The above website is running Lucene 1.3rc3, but I was able to reproduce
Ulrich Mayring writes:
Hi all,
first, here's how to reproduce the problem:
Go to http://www.denic.de/en/special/index.jsp and enter obscure
service in the search field. You'll get 132 hits. Now enter obscure
service* - and you only get 1 hit.
The above website is running Lucene
Morus Walter wrote:
Your number/handle samples look ok to me if the default operator is AND.
But it's OR ;-)
Using AND explicitly I get different results and using OR explicitly I
get the same results as documented.
Note that wildcard expressions are not analyzed so if service is
stemmed to
Ulrich Mayring writes:
Will do, thank you very much. However, how do I get at the analyzed form
of my terms?
Instantiate the analyzer, create a token stream feeding your input,
loop over the tokens, output the results.
Morus
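To make the wildcard point concrete, here is a minimal, self-contained Java
sketch. It does not use Lucene's classes: `analyze` is a hypothetical toy
stand-in for an analyzer (lowercasing plus stripping one trailing "e"), and
`prefixMatches` is a hypothetical stand-in for a prefix query. It follows the
steps described above (feed the input through analysis, loop over the tokens,
output the results) and shows how an unanalyzed prefix can miss a stemmed term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class AnalyzedTermsSketch {

    // Toy analysis chain (an assumption, not Lucene's stemmer):
    // split on non-alphanumerics, lowercase, strip one trailing "e".
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield an empty leading element
            }
            String term = raw.toLowerCase(Locale.ROOT);
            if (term.length() > 3 && term.endsWith("e")) {
                term = term.substring(0, term.length() - 1);
            }
            tokens.add(term);
        }
        return tokens;
    }

    // Toy prefix query: true if any indexed term starts with the prefix.
    static boolean prefixMatches(List<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("Obscure Service");
        System.out.println(indexed); // [obscur, servic]

        // The raw wildcard prefix is not analyzed, so it misses the stemmed term.
        System.out.println(prefixMatches(indexed, "service")); // false

        // Running the prefix through the same analysis first finds it.
        String analyzedPrefix = analyze("service").get(0);
        System.out.println(prefixMatches(indexed, analyzedPrefix)); // true
    }
}
```

Printing the analyzed form of a term this way is usually enough to see which
prefix the index actually contains.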
Guys
Apologies
Am I doing something wrong, or is there a bug in Lucene on Linux when using
MultiSearcher with Sort?
Could somebody please reply ASAP?
Tested with both Lucene-1.4-final.jar and Lucene-1.4.1.jar
hits = multiSearcher.search(query,sortField);
The exception is raised on Linux only.
Dear all,
I saw a post about an attempt to integrate Carrot2 with Lucene. It was a
while ago, so I'm curious if any outcome has been achieved.
Anyway, as the project coordinator I can offer my help with such
integration; if you're looking for some ready-to-use code then there is
a clustering
hmm ok,
but how will I be able to set different boosts for fields if this value
is not stored?! I don't really understand why I can set a boost factor
when it is not stored and used.
what I want to do is weight my searchable index fields (type:
Field.UnStored) with different factors for
The boost is not thrown away, but rather combined with the length
normalization factor during indexing. So while your actual boost value
is not stored directly in the index, it is taken into consideration for
scoring appropriately.
Erik
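As a rough illustration of how a boost can be folded into the value kept at
indexing time, here is a minimal sketch. It assumes length normalization of
1/sqrt(number of terms in the field); that formula is an assumption about the
default similarity and is not stated in this thread. The class and method
names are hypothetical.

```java
final class FieldNormSketch {

    // Assumed length normalization: shorter fields get a larger factor.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // The boost is multiplied into the length normalization factor;
    // only the combined value is kept.
    static float norm(float boost, int numTerms) {
        return boost * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        System.out.println(norm(1.0f, 4));  // 0.5
        System.out.println(norm(2.0f, 4));  // 1.0
        System.out.println(norm(2.0f, 16)); // 0.5
    }
}
```

The first and last lines print the same value: once the factors are combined,
a boost of 2.0 on a 16-term field is indistinguishable from no boost on a
4-term field, which is why the original boost cannot be read back on its own.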
On Sep 23, 2004, at 8:17 AM, Bastian Grimm
Hi Dawid,
I would like to use Carrot2 with Lucene. Do you have any examples?
Thanks a lot,
William.
From: Dawid Weiss [EMAIL PROTECTED]
Reply-To: Lucene Users List [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Clustering lucene's results
Date: Thu, 23 Sep 2004 13:36:03 +0200
Dear all,
I saw a
Hi William,
No, I don't have examples because I never used Lucene directly. If you
provide me with a sample index and an API that executes a query on this
index (I need document titles, summaries, or snippets and an anchor
(identifier), which can be a URL).
Send me such a snippet and I'll try to
thanks for your reply, Erik.
so I am right that it's not possible to change the boost without
reindexing all files? that's not good... or is it enough to change the
boosts and optimize the index for the changes to take effect?
if not, will I be able to boost those fields in the searcher?
Karthik,
I have a kind of similar problem. Test the following: when you
create a field, don't use Field(String), instead use Field(String, int)
where int is a constant for the field's type. Maybe this could help.
-Original Message-
From: Karthik N S [mailto:[EMAIL PROTECTED]
The best way is to use IndexReader's getCurrentVersion() method to check
whether the index has changed. If it has, just get a new Searcher
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexReader.html#getCurrentVersion(java.lang.String)
Aviran
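The check-then-reopen pattern described above could look roughly like this.
The sketch is self-contained and deliberately avoids Lucene's classes: the
`LongSupplier` is a hypothetical stand-in for a call such as
getCurrentVersion(), the `Supplier` is a hypothetical stand-in for opening a
new Searcher, and the class name is made up for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class ReopenOnChangeSketch<S> {

    private final LongSupplier currentVersion; // stand-in for the index version check
    private final Supplier<S> openSearcher;    // stand-in for opening a new searcher
    private long seenVersion;
    private S searcher;

    ReopenOnChangeSketch(LongSupplier currentVersion, Supplier<S> openSearcher) {
        this.currentVersion = currentVersion;
        this.openSearcher = openSearcher;
        this.seenVersion = currentVersion.getAsLong();
        this.searcher = openSearcher.get();
    }

    // Returns the cached searcher, opening a new one only if the version changed.
    synchronized S get() {
        long version = currentVersion.getAsLong();
        if (version != seenVersion) {
            searcher = openSearcher.get();
            seenVersion = version;
        }
        return searcher;
    }

    public static void main(String[] args) {
        long[] version = {1L};
        int[] opens = {0};
        ReopenOnChangeSketch<String> holder = new ReopenOnChangeSketch<>(
                () -> version[0],
                () -> "searcher#" + (++opens[0]));

        System.out.println(holder.get()); // searcher#1
        version[0] = 2L;                  // simulate an index update
        System.out.println(holder.get()); // searcher#2
        System.out.println(holder.get()); // searcher#2 (no change, no reopen)
    }
}
```

Closing the old searcher is left out on purpose: searches still running on it
would need to finish first, and how to handle that depends on the application.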
-Original Message-
Erik Hatcher wrote:
Look at AnalysisDemo referred to here:
http://wiki.apache.org/jakarta-lucene/AnalysisParalysis
Keep in mind that phrase queries do not support wildcards - they are
analyzed and any wildcard characters are likely stripped and cause
tokens to split.
Ok, I did all that and
You can change field boosts without re-indexing.
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexReader.html#setNorm(int,%20java.lang.String,%20byte)
Doug
Bastian Grimm [Eastbeam GmbH] wrote:
thanks for your reply, Erik.
so I am right that it's not possible to change the
Ulrich Mayring wrote:
If the user searches for 007001 handle, the MultiFieldQueryParser,
which searches in the fields title and contents, changes that query to:
(title:007001 +title:handl) (contents:007001 +contents:handl)
Ok, I cleared this up, there was some invisible magic going on in the
Dawid Weiss wrote:
Hi William,
No, I don't have examples because I never used Lucene directly. If you
provide me with a sample index and an API that executes a query on this
index (I need document titles, summaries, or snippets and an anchor
(identifier), which can be a URL).
Hi Dawid :-)
I believe
Hi Fred,
We were originally attempting to use the demo HTML parser (Lucene 1.2), but as
you know, it's only a demo. I think it's threaded to save time, allowing
the calling thread to grab the title or top message even though it's not done
parsing the entire HTML document. That's just a
Hi guys,
So we started upgrading to 1.4 and we need to add some of our own custom code.
After compiling with ant, I noticed that the 1.4 ant script builds a jar
called lucene-1.5-rc1-dev.jar, not lucene-1.4-final.jar. I'm pretty sure I
did not download the wrong source. Is this just a wrong
If you obtained the 1.4.1 source distribution, then you're fine and
it's simply an issue with the properties. We keep the properties set to
the _next_ version of Lucene (or as a beta/rc version label) to prevent
the CVS HEAD codebase from building as a release label when it is very
likely not
Hi Andrzej :)
Yep, ok, I'll take a look at it. After I come back from abroad (next
week). I just wanted to save myself some time and have some already-written
code that fetches the information we need for clustering; you
know what I mean, I'm sure. But I'll start from scratch when I get back.
D.
Hi,
Does anyone know a good tool for converting MS PowerPoint
files (*.ppt) into plain text so we can use Lucene to index them?
I looked at Jakarta POI and only see that Word and Excel documents
can be processed. Some JavaDoc pages mention ppt, but the
status is not clear to me.
Thanks very much for
Hi Dawid,
The demos (under /src/demo) are very good. They have the basic usage
scenario.
Thanks Andrzej.
William.
Dawid Weiss wrote:
Hi William,
No, I don't have examples because I never used Lucene directly. If you
provide me with a sample index and an API that executes a query on this
index
[EMAIL PROTECTED] wrote:
We were originally attempting to use the demo HTML parser (Lucene 1.2), but as
you know, it's only a demo. I think it's threaded to save time, allowing
the calling thread to grab the title or top message even though it's not done
parsing the entire HTML document.
yeah... I know there have to be demos... I tried to be lazy, you know :)
Anyway, as I told Andrzej -- I'll take a look at it (with
pleasure) after I come back. I don't think the delay will matter much.
And if it does, ask Andrzej -- he has excellent experience with both
projects -- he's
I am working on extending Lucene to support documents with special islands
of an XML language, and I want to index the islands differently from the
text. My current plan is to break the document's contents into two Fields,
one with all the text and one with all the special islands, and use a
Greg Langmead wrote:
Am I right in saying that the design of Token's support for highlighting
really only supports having the entire document stored as one monolithic
contents Field?
No, I don't think so.
Has anyone tackled indexing multiple content Fields
before that could shed some light?
Do you
Doug Cutting wrote:
Do you need highlights from all fields? If so, then you can use:
TextFragment[] getBestTextFragments(TokenStream, ...);
with a TokenStream for each field, then select the highest scoring
fragments across all fields. Would that work for you?
Thanks for the reply.
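The "select the highest scoring fragments across all fields" step suggested
above might be sketched as follows. This is self-contained and does not call
the highlighter: `Fragment` is a hypothetical stand-in for a scored text
fragment, and the per-field lists are assumed to come from one
getBestTextFragments call per field.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class BestFragmentsSketch {

    // Hypothetical stand-in for a scored highlight fragment.
    static final class Fragment {
        final String field;
        final String text;
        final float score;

        Fragment(String field, String text, float score) {
            this.field = field;
            this.text = text;
            this.score = score;
        }
    }

    // Pools the fragments of every field and keeps the top `max` by score.
    static List<Fragment> bestAcrossFields(List<List<Fragment>> perField, int max) {
        List<Fragment> all = new ArrayList<>();
        for (List<Fragment> fragments : perField) {
            all.addAll(fragments);
        }
        all.sort(Comparator.comparingDouble((Fragment f) -> f.score).reversed());
        int keep = Math.max(0, Math.min(max, all.size()));
        return new ArrayList<>(all.subList(0, keep));
    }

    public static void main(String[] args) {
        List<Fragment> text = new ArrayList<>();
        text.add(new Fragment("text", "plain text hit", 0.9f));

        List<Fragment> islands = new ArrayList<>();
        islands.add(new Fragment("islands", "weak island hit", 0.4f));
        islands.add(new Fragment("islands", "strong island hit", 1.5f));

        List<List<Fragment>> perField = new ArrayList<>();
        perField.add(text);
        perField.add(islands);

        for (Fragment f : bestAcrossFields(perField, 2)) {
            System.out.println(f.field + ": " + f.text);
        }
        // islands: strong island hit
        // text: plain text hit
    }
}
```

Comparing scores directly across fields assumes they are on a comparable
scale; if the fields are analyzed very differently, that may not hold.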
29 matches