Creating indexes

2002-06-12 Thread Chris Sibert
I have a big (40 MB or so) file to index. The file contains a whole bunch of documents, each of which is pretty small, about a few typewritten pages long. There's a title, date, and author for each document, in addition to the document's actual text. I'm not quite sure how you index this in

RE: Creating indexes

2002-06-12 Thread Nader S. Henein
Depending on the build of the document, but I guess not. I had to write my own XML parser; you get better results when you customize something like that to your needs. -Original Message- From: Chris Sibert [mailto:[EMAIL PROTECTED]] Sent: Wednesday, June 12, 2002 10:27 AM To: Lucene

Modifications to the QueryParser

2002-06-12 Thread Peter Carlson
This was originally posted to the developer list, but should have been posted here. On 6/12/02 11:32 AM, none none [EMAIL PROTECTED] wrote: Hi, I already asked for help with QueryParser.jj about: 1. Case-insensitive operators. Someone said to do that in your code and pass the right syntax to

Memory-based indexing

2002-06-12 Thread James Ricci
I've been doing a few tests, and I'm finding creating an index in Lucene to be somewhat slower than other engines I've worked with. Is there a way to cache, batch, or otherwise speed up indexing of a large number of documents? This is mainly a problem when creating the index for the first time.

Re: Creating indexes

2002-06-12 Thread none none
Lucene doesn't know where a file starts or ends. Actually it does know, but in your case one Document contains many small documents. If you want to split your big file into small files, you must do that yourself. Take a look at the Document class and you will see that Lucene uses a Reader to index the

Re: Memory-based indexing

2002-06-12 Thread Otis Gospodnetic
Yes, there are a few things one can do. See http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]msgId=117057 Otis --- James Ricci [EMAIL PROTECTED] wrote: I've been doing a few tests, and I'm finding creating an index in Lucene to be somewhat slower than other engines I've worked

Re: Thread safety

2002-06-12 Thread Otis Gospodnetic
Yeah, I think you are right, that matrix isn't 100% correct. I'll have to change it...thanks for checking. Otis --- David Smiley [EMAIL PROTECTED] wrote: Maybe I'm just not with it right now... but that matrix doesn't seem to make sense to me. From my understanding, two write requests

ANNOUNCEMENT: Release of Lucene 1.2

2002-06-12 Thread Peter Carlson
The Lucene Team is proud to announce the release of Lucene 1.2. This is the first production release of Lucene since it moved to the Apache project. This release contains many features and bug fixes over the previous 1.0.2 release - see CHANGES.txt for details. Jakarta Lucene is a