Hi Eric and Grant:
Thanks for the replies; this is certainly encouraging. As
suggested, I will post further discussions of this kind to the dev list.
Thanks
-John
On Tue, 20 Jul 2004 15:37:35 -0400, Grant Ingersoll [EMAIL PROTECTED] wrote:
It seems to me the answer to this is not necessarily
In general, yes.
By splitting up a large index into smaller indices, you are
linearizing the search time.
Furthermore, that allows you to make your search distributable.
-John
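For readers following the split-index idea, here is a minimal sketch of the merge step, assuming each smaller index has already been searched and returned its own scored hits. The names ShardMerge, Hit and mergeTopK are hypothetical and are not Lucene classes; Lucene ships its own MultiSearcher for this purpose, and scores from separately built indices may not be directly comparable without extra care.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ShardMerge {
    /** One hit from one sub-index. Hypothetical type, used only in this sketch. */
    public static final class Hit {
        public final String docId;
        public final float score;

        public Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Pools the hits of every sub-index and keeps the k best by descending score. */
    public static List<Hit> mergeTopK(List<List<Hit>> perIndexHits, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perIndexHits) {
            all.addAll(hits);
        }
        // Highest score first.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }
}
```

Because each sub-index can be searched independently before the merge, the per-index searches are the part that could be spread across threads or machines.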
On Wed, 21 Jul 2004 13:00:28 +1000, Anson Lau [EMAIL PROTECTED] wrote:
Hello guys,
What are some general techniques
Hi,
Is it possible to retrieve token offsets (Token.startOffset(),
Token.endOffset()) later, when a document is found and returned in the hit
collection? I need these values for highlighting. I've already looked at the
Highlighter in the sandbox, but it actually re-analyzes the original
document's field.
On Jul 21, 2004, at 6:59 AM, Stepan Mik wrote:
Is it possible to retrieve token offsets (Token.startOffset(),
Token.endOffset()) later, when a document is found and returned in the hit
collection?
No, offsets are not stored in the index. In fact, the only place
they are currently used is with the
Depending on what MySQL full-text search supports, you will probably lose some
of the advanced features you get for free from Lucene, such as proximity
search, wildcard search, search term and search field boosting, scoring of
the documents, etc.
After all, it depends on what you need to do. In our dev
Interestingly (and ironically) enough, the project I'm currently
working on requires full-text searching of Word and PDF resumes. SQL
Server is already the required database as well, so we are leveraging
the full-text indexing capabilities it has. There is a special trick
to drop a BLOB into
Are the package information and import paths ready to deploy on a Tomcat server? I tried
extracting Lucene on the server, but when I compile files, it just throws numerous
no-class-definition errors and errors relating to the package.
Ian
Has anyone tried splitting up an index into smaller chunks, without putting
the different indices on different physical disks/boxes? What sort of
performance gain do you get from it?
Anson
-Original Message-
From: John Wang [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 21, 2004
On Jul 21, 2004, at 8:10 AM, Ian McDonnell wrote:
Are the package information and import paths ready to deploy on a Tomcat
server? I tried extracting Lucene on the server, but when I compile
files, it just throws numerous no-class-definition errors and errors
relating to the package.
Huh? Lucene
You can create a new field which contains the full untokenized string and use
it as a sort field.
-Original Message-
From: Florian Sauvin [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 20, 2004 20:13 PM
To: Lucene Users List
Subject: Sorting on tokenized fields
I see in the Javadoc that
When rc3 came out, I modified the classes used for
Sorting to, in addition to Integer, Float and
String-based sort keys, use Long values. All I did
was add extra statements in 2 classes (SortField and
FieldSortedHitQueue) that made a special case for
longs, and created a LongSortedHitQueue
Since I had to implement sorting in Lucene 1.2, I had to write my own sorting,
using something similar to a Lucene contribution called SortField.
Yesterday I did some tests, trying to use Lucene 1.4 Sort objects, and I
realized that my old implementation works 40% faster than Lucene's
Well, when I extracted it, it created the org/apache/lucene directories in the
public_html directory. When I try to compile any of the source, it just throws numerous
errors. I've got the classpath set to web-inf/classes.
Have I extracted it to the wrong directory?
--- Erik Hatcher [EMAIL
Hi,
What is the best way to get Lucene to assign weightings to certain fields
from a database? For example, the 'name' field should be weighted higher
than the 'description' field.
Thanks,
John.
-
To unsubscribe, e-mail:
Also, another silly question: do I need to set up a WAR on the server?
--- Ian McDonnell [EMAIL PROTECTED] wrote:
Well, when I extracted it, it created the org/apache/lucene directories in the
public_html directory. When I try to compile any of the source, it just throws numerous
errors. I've got
Hi Ian,
Depending on what you want to do, you could also follow the installation
instructions on http://www.zilverline.org. It describes how to install
zilverline, but the same goes for the lucene war.
Hope this helps,
Michael Franken
Ian McDonnell wrote:
Also, another silly question: do I
I was looking at your instructions there, but couldn't really figure out what you mean.
Can I manually add the extracted directories onto the Tomcat server? If so, what should
my root directory be?
Say, for example, the extracted directories org/apache/lucene/
Should I have that as
On Jul 21, 2004, at 10:09 AM, Anson Lau wrote:
Apply a boost factor to fields when you do a Lucene search.
Or... set the boost on the Field during indexing.
Erik
Anson
-Original Message-
From: John Patterson [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 22, 2004 12:07 AM
To: [EMAIL
There is no need to extract Lucene's JAR file.
Your questions indicate that you have some Tomcat and Java web
application learning to do and this forum is not the most appropriate
place to ask. Lucene includes a web application demo that you could
try deploying by following the steps here:
Hi Ian,
You don't extract WAR files or JAR files. To deploy a web application
that comes as a WAR file, you just have to drop it into the
web server/servlet engine. So just copy lucene.war to
tomcatserver/webapps. That's it. I advise you to read some of the
documentation on the Tomcat website on
No, sorry, I didn't mean that I was trying to extract the jars at all.
I meant the extraction of the original Lucene source bundle. I have been developing in
Java for going on 5 years now, but am relatively new to web apps. I have some
experience in Tomcat from days as an undergrad and do
Thanks, that was what I was after!
- Original Message -
From: Erik Hatcher [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Wednesday, July 21, 2004 9:52 PM
Subject: Re: Weighting database fields
On Jul 21, 2004, at 10:09 AM, Anson Lau wrote:
Apply a boost factor to
Erik,
Is there any benefit to setting the boost during indexing rather than setting it
during the query?
I usually set it when doing a query because you can change the boost values
easily without having to re-index.
Thanks,
Anson
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
On Jul 21, 2004, at 11:19 AM, Ian McDonnell wrote:
No, sorry, I didn't mean that I was trying to extract the jars at all.
I meant the extraction of the original Lucene source bundle. I have
been developing in Java for going on 5 years now, but am relatively
new to web apps. I have some experience
On Jul 21, 2004, at 11:40 AM, Anson Lau wrote:
Is there any benefit to setting the boost during indexing rather than setting it
during the query?
It allows setting each document differently. For example,
TheServerSide is using field-level boosts at index time to control
ordering by date, such that newer
Lucene cannot parse those document formats that you mentioned. You
need 3rd party parsers to do that. For example, POI will parse Excel
and MS Word docs, PDFBox will parse PDF.
Otis
--- Natarajan.T [EMAIL PROTECTED] wrote:
Hi Guys,
I have a small query, ie. Lucene 1.4 APIs directly
I've done a bit more snooping around; it seems that in
FieldSortedHitQueue.getCachedComparator (line 153),
calls to look up a stored comparator in the cache
always return null. This occurs even for the built-in
sort types (I tested it on integers and my code for
longs). The comparators don't even
I switched the Comparators and FieldCache classes to
use java.util.HashMap instead of
java.util.WeakHashMap, and got the performance boost I
was looking for (test index of 100K documents; initial
search took 991 ms, all subsequent searches took
90 ms). Before, I was seeing an initial query of ~1 sec,
I think I found the problem.
FieldCacheImpl uses a WeakHashMap to store the cached objects, but since there
is no other reference to this cache, it is getting released.
Switching to HashMap solves it.
The only problem is that I don't see anywhere where the cached object will
get released if you open a
I just saw this post; I guess we both came to the same conclusion.
The only problem is that the cached object never gets released, and a new
one will get created every time you open a new IndexReader.
Aviran
-Original Message-
From: Greg Gershman [mailto:[EMAIL PROTECTED]
Sent:
The key in the WeakHashMap should be the IndexReader, not the Entry. I
think this should become a two-level cache, a WeakHashMap of HashMaps,
the WeakHashMap keyed by IndexReader, the HashMap keyed by Entry. I
think the Entry class can also be changed to not include an IndexReader
field.
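A minimal sketch of the two-level layout described above, assuming plain java.util collections. TwoLevelCache is a hypothetical name, and Object and String stand in for IndexReader and Entry; this is not the actual Lucene patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

public class TwoLevelCache {
    // Outer map: weakly keyed by the reader (stand-in for IndexReader), so the
    // whole inner map becomes collectable once the reader is unreachable.
    // Inner map: ordinary HashMap keyed by the entry (stand-in for Entry).
    private final Map<Object, Map<String, Object>> cache = new WeakHashMap<>();

    /** Returns the cached value for this reader and entry, or null if absent. */
    public synchronized Object get(Object reader, String entryKey) {
        Map<String, Object> inner = cache.get(reader);
        return inner == null ? null : inner.get(entryKey);
    }

    /** Stores a value; cached values must not hold a strong reference to the reader. */
    public synchronized void put(Object reader, String entryKey, Object value) {
        cache.computeIfAbsent(reader, r -> new HashMap<>()).put(entryKey, value);
    }
}
```

With this shape, cached values stay alive for as long as their reader is strongly referenced elsewhere, which addresses both complaints in the thread: entries are not dropped while the reader is still in use, and they do not accumulate forever as new readers are opened.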
I will post a patch soon
Aviran
-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 21, 2004 13:56 PM
To: Lucene Users List
Subject: Re: Sort: 1.4-rc3 vs. 1.4-final
The key in the WeakHashMap should be the IndexReader, not the Entry. I
think this
Sorry for the slightly off-topic post, but I need to use Luke with my
Analyzer. Has anyone done this? I have added a jar file to my classpath,
but that didn't help.
Thanks in advance
Rob
Worked for me.
I added my jar to the classpath and my analyzer appeared in the analyzers list in the
search tab as well as in the analyzers list in the plugins tab.
I am using Luke v 0.5 (2004-05-25)
Kannan
-Original Message-
From: Rob Jose [mailto:[EMAIL PROTECTED]
Sent: Wednesday,
Sorry, typo in the version date in my previous mail -- I meant Luke v 0.5 (2004-06-25)
-Original Message-
From: Chellappa, Kannan
Sent: Wednesday, July 21, 2004 12:16 PM
To: Lucene Users List
Subject: RE: Slightly off topic, I need to have luke use my Analyzer
Worked for me.
I added
Hi Erik
On Jul 21, 2004, at 11:40 AM, Anson Lau wrote:
Is there any benefit to setting the boost during indexing rather than setting it
during the query?
It allows setting each document differently. For example,
TheServerSide is using field-level boosts at index time to control
ordering by date,
Ernesto De Santis wrote:
If a field has a boost value set at index time, and at search time
the query has another boost value for this field, what happens?
Which value is used for the boost?
The two boosts are both multiplied into the score.
Doug
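As a rough illustration of that answer, assuming everything else in the scoring formula is held constant: the helper below is hypothetical and only shows the arithmetic, not Lucene's actual scoring code.

```java
public class BoostSketch {
    /**
     * Combines an index-time field boost with a query-time boost by
     * multiplication. A boost of 1.0f leaves the other factor unchanged.
     */
    public static float combinedBoost(float indexTimeBoost, float queryTimeBoost) {
        return indexTimeBoost * queryTimeBoost;
    }
}
```

So a field boosted by 2.0 at index time and queried with a boost of 1.5 would contribute with an overall factor of 3.0, under the assumptions above.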
Guys/Gals,
Does anyone have any pointers for this kind of query?
Thanks.
Need some help with creating a query. Here is the scenario:
Field 1:
Field 2:
Field 3:
MultiSelect 1 :
Ok Thanks.
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 21, 2004 9:33 PM
To: Lucene Users List
Subject: Re: Use of Convertes or Parser
Lucene cannot parse those document formats that you mentioned. You
need 3rd party parsers to do that. For
Hi,
Just copy the lucene.war file into the Tomcat webapps directory, and then
start Tomcat.
In the browser, type http://localhost:8080/luceneweb, which will serve you the
pages.
But first you have to index your directory for the web module to serve you the
searchable hits.
I think there