Size limit for indexing ?

2002-10-09 Thread Christophe GOGUYER DESSAGNES
Hi, I use Lucene 1.2 and I index a text document whose size is near 500 KB. (I use the Field.UnStored method.) It seems that only the beginning of this document is indexed! If I search for a term that is at the end of this document, I don't find it (but I do find terms at the beginning). So, I split my

RE: Size limit for indexing ?

2002-10-09 Thread Nader S. Henein
The size of the document is limited only by OS constraints, and 500 KB is really small. I have documents in the hundreds of megabytes and it's fine. Check your indexing and searching; you might find the problem there. Also, are you using wildcard searches? They don't work from both sides. Nader

RE: Size limit for indexing ?

2002-10-09 Thread Materna, Wolf-Dietrich (empolis B)
Hello, I use Lucene 1.2 and I index a text document whose size is near 500 KB. (I use the Field.UnStored method.) It seems that only the beginning of this document is indexed! If I search for a term that is at the end of this document, I don't find it (but I do find terms at the beginning). So, I

Re: Size limit for indexing ?

2002-10-09 Thread Christophe GOGUYER DESSAGNES
Thank you for your help, it solved my problem. - Christophe - Original Message - From: Materna, Wolf-Dietrich (empolis B) [EMAIL PROTECTED] To: 'Lucene Users List' [EMAIL PROTECTED] Sent: Wednesday, 9 October 2002 10:33 Subject: RE: Size limit for indexing ? Hello, I use lucene

Deleting a document found in a search

2002-10-09 Thread lucene . user
I am just getting started with Lucene and I think I have a problem understanding some basic concepts. I am using two-part identifiers to uniquely identify a document in the index. So whenever I want to index a document, I first want to find and delete the old form. To find it, I intend to

Enumerating all Terms

2002-10-09 Thread lucene . user
Is there a way of getting a list of all Terms that have been indexed? I guess it would approximate a wildcard query of the form *:* if that were valid, and instead of returning matching documents, just returning the fields and values. -- Thanks, Adrian. -- To unsubscribe, e-mail:

Re: Deleting a document found in a search

2002-10-09 Thread Otis Gospodnetic
You mean d.get(Id); ? Otis --- [EMAIL PROTECTED] wrote: I am just getting started with Lucene and I think I have a problem understanding some basic concepts. I am using two-part identifiers to uniquely identify a document in the index. So whenever I want to index a document, I first

Re: Deleting a document found in a search

2002-10-09 Thread lucene . user
No, I mean HitDoc.id, the document number field stored in the HitDoc class. This number is needed when calling IndexReader.delete(int docnum) but it is not publicly accessible. -- Adrian At 06:32 09/10/2002 -0700, Otis Gospodnetic wrote: You mean d.get(Id); ? --- [EMAIL PROTECTED] wrote: I

RE : Enumerating all Terms

2002-10-09 Thread Laurent Trillaud
Yes, you can. IQ-Computing, one of the contributors, has already done the job for you as part of implementing highlighting for Lucene. http://www.iq-computing.de/lucene/highlight.htm Follow their instructions and you will be able to use a getTerms() method. Laurent Trillaud -Message

Re: Deleting a document found in a search

2002-10-09 Thread Doug Cutting
[EMAIL PROTECTED] wrote: My first thought is to define a Field.Keyword(composite-key, domain + \u + id). This would allow me to use the delete(Term) interface to delete the key. That sounds like a good way to solve this. You could also use a HitCollector with a Query, but I think the
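
The composite-key idea discussed above can be sketched in plain Java. This is only an illustration under stated assumptions: the class `CompositeKey`, its methods, and the choice of the NUL character as separator are hypothetical and do not come from the thread. The Lucene calls mentioned in the comments (`Field.Keyword`, `IndexReader.delete(Term)`) are left out of the code so that it runs without the library.

```java
// Hypothetical helper (not part of Lucene): joins a two-part identifier
// into one string that can be indexed as a single untokenized keyword.
final class CompositeKey {
    // Assumption: a character that never occurs in either identifier part.
    static final char SEPARATOR = '\u0000';

    private CompositeKey() {}

    // Build the key; reject parts containing the separator so that
    // every key maps back to exactly one (domain, id) pair.
    static String make(String domain, String id) {
        if (domain.indexOf(SEPARATOR) >= 0 || id.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException("identifier part contains the separator");
        }
        return domain + SEPARATOR + id;
    }

    // Recover the two parts from a key produced by make().
    static String[] split(String key) {
        int pos = key.indexOf(SEPARATOR);
        if (pos < 0) {
            throw new IllegalArgumentException("not a composite key");
        }
        return new String[] { key.substring(0, pos), key.substring(pos + 1) };
    }

    // With Lucene 1.2 the key would then be used roughly like this
    // (shown as comments only, since Lucene is not on the classpath here):
    //   doc.add(Field.Keyword("composite-key", CompositeKey.make(domain, id)));
    //   reader.delete(new Term("composite-key", CompositeKey.make(domain, id)));
}
```

Deleting by term this way avoids needing the internal document number at all: the old version of a document is removed by its key before the new version is added.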

RE: IndexSearcher on JAR resources?

2002-10-09 Thread Tim Dawson
I wrote: I need to do almost exactly the same thing as Erik - create a read-only index on our help webapp that will be packaged inside an ear file. I figured out a way around the lack of a Jar index searcher. Basically I created the jar file from the index dir and added a bean for my search

Lucene and Geographic Searching

2002-10-09 Thread David Kendig
Hi, I'm very interested in migrating our current search engine to use Lucene. After evaluating Lucene, I have become very impressed and have been telling lots of people about it. One requirement that we have is to be able to search our documents by specifying a geographical boundary. I

Whats the type of inverted file Lucene is using??

2002-10-09 Thread Jacob Gutierrez
Hi everybody, I was just wondering about the type of implementation used for the inverted file that is used by Lucene in the index. Is it using a sorted array? Jacob Gutiérrez R. Cochabamba - Bolivia

Web search engine size optimisation problems..

2002-10-09 Thread Kyriakos Ktorides
Hello, I've been trying for a while to create a web search engine to spider a small number of websites (around 1000 of them). Before even considering Lucene, I used a DBMS and tried crawling a site while taking in all keywords from the HTML files (filtering out stopwords etc.). Unfortunately this