Re: indexReader close method

2004-12-06 Thread Morus Walter
Helen Warren writes: //close the IndexReader object myReader.close(); //return results return hits; The myReader.close() line causes the IOException to be thrown. To try Are you sure it's the myReader.close() that fails? I'd suspect that to fail as soon as you want to do anything

Re: Unexpected TermEnum behavior

2004-12-08 Thread Morus Walter
Chris Hostetter writes: I thought it was documented in the TermEnum interface, but looking at it now I realize that not only does the TermEnum javadoc not explain it very well, but the class FilteredTermEnum (which implements TermEnum) actually documents the opposite behavior... public

Re: NUMERIC RANGE BOOLEAN

2004-12-16 Thread Morus Walter
Erik Hatcher writes: TooManyClauses exception occurs when a query such as a RangeQuery expands to more than 1024 terms. I don't see how this could be the case in the query you provided - are you certain that is the query that generated the error? Why not: the terms might be 0003
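
The expansion Erik describes can be illustrated without Lucene. The following is a rough, self-contained simulation, not Lucene code: the class name, the helper names, and the zero-padded term values are invented for illustration, and only the 1024 limit comes from the message above. It counts how many distinct indexed terms fall inside a range, which is the number of clauses such a query would expand to.

```java
import java.util.TreeSet;

public class RangeExpansion {
    // Limit quoted in the thread; everything else here is hypothetical.
    public static final int MAX_CLAUSES = 1024;

    // Number of distinct terms between lower and upper, both inclusive.
    public static int countTermsInRange(TreeSet<String> terms, String lower, String upper) {
        return terms.subSet(lower, true, upper, true).size();
    }

    // True when the range would expand to more clauses than the limit allows.
    public static boolean exceedsLimit(TreeSet<String> terms, String lower, String upper) {
        return countTermsInRange(terms, lower, upper) > MAX_CLAUSES;
    }

    // Builds a term dictionary of zero-padded numbers: 0000, 0001, ...
    public static TreeSet<String> paddedTerms(int count) {
        TreeSet<String> terms = new TreeSet<>();
        for (int i = 0; i < count; i++) {
            terms.add(String.format("%04d", i));
        }
        return terms;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = paddedTerms(5000);
        System.out.println(countTermsInRange(terms, "0003", "2000"));
        System.out.println(exceedsLimit(terms, "0003", "2000"));
    }
}
```

The point of the sketch is that the clause count depends on how many distinct terms the index holds inside the range, not on how short the query string looks.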

Re: Queries difference

2004-12-20 Thread Morus Walter
Alex Kiselevski writes: Hello, I want to know is there a difference between queries: +city(+London Amsterdam) +address(1_street 2_street) And +city(+London) +city(Amsterdam) +address(1_street) +address(2_street) I guess you mean city:(... and so on. The first query searches

Re: Synonyms for AND/OR/NOT operators

2004-12-21 Thread Morus Walter
Erik Hatcher writes: On Dec 21, 2004, at 3:04 AM, Sanyi wrote: What is the simplest way to add synonyms for AND/OR/NOT operators? I'd like to support two sets of operator words, so people can use either the original english operators and my custom ones for our local language. There

Re: Synonyms for AND/OR/NOT operators

2004-12-21 Thread Morus Walter
Sanyi writes: Well, I guess I'd better recognize and replace the operator synonyms to their original format before passing them to QueryParser. I don't feel comfortable tampering with Lucene's source code. Apart from knowing how to compile lucene (including the javacc code generation) you

Re: (Offtopic) The unicode name for a character

2004-12-22 Thread Morus Walter
Hi Peter, The Question: In Java generally, Is there an easy way to get the unicode name of a character? (e.g. LATIN SMALL LETTER A from 'a') ... I'm considering taking the unicode name for each character I encounter and regexping it against something like: ^LATIN .* LETTER (.) WITH
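
For readers on a current JDK, a minimal sketch of the idea in the question: `Character.getName(int)` was added in Java 7, so it was not available when this thread was written. The class name, the method names, and the regular expression below are illustrative assumptions; the pattern is not the truncated one quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeNames {
    // Hypothetical pattern: captures the base letter of an accented Latin letter.
    private static final Pattern LATIN_WITH_MARK =
            Pattern.compile("^LATIN (?:SMALL|CAPITAL) LETTER (.) WITH .*$");

    // Unicode name of a code point, or null if the code point is unassigned.
    public static String nameOf(int codePoint) {
        return Character.getName(codePoint);
    }

    // Base letter taken from the Unicode name (always upper case there),
    // or null when the name does not match the pattern.
    public static String baseLetter(int codePoint) {
        String name = Character.getName(codePoint);
        if (name == null) {
            return null;
        }
        Matcher m = LATIN_WITH_MARK.matcher(name);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(nameOf('a'));        // LATIN SMALL LETTER A
        System.out.println(baseLetter(0x00E9)); // E
    }
}
```

Note that the captured letter is upper case because Unicode names are written in capitals; restoring the original case would need a separate check on the SMALL/CAPITAL part of the name.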

Re: QueryParser, default operator

2004-12-29 Thread Morus Walter
Paul writes: the following code QueryParser qp = new QueryParser(itemContent, analyzer); qp.setOperator(org.apache.lucene.queryParser.QueryParser.DEFAULT_OPERATOR_AND); Query query = qp.parse(line, itemContent, analyzer); doesn't produce the expected result because a query foo bar

Re: Deleting index for DB indexing

2004-12-30 Thread Morus Walter
mahaveer jain writes: I am using lucene for my DB indexing. I have 2 columns which are Keyword. Now I want to delete my index based on these 2 keywords. Is it possible? If not, what is the alternative? You can delete documents based on document number from an index reader. You can get

Re: Check to see if index is optimized

2005-01-07 Thread Morus Walter
Crump, Michael writes: Is there a simple way to check and see if an index is already optimized? What happens if optimize is called on an already optimized index - does the call basically do a noop? Or is it still an expensive call? Why don't you just try that? E.g. using luke. Or three

Re: HELP! Directory is NOT getting closed!

2005-01-12 Thread Morus Walter
Joseph Ottinger writes: According to IndexWriter.java, line 246 (in 1.4.3's codebase), if closeDir is set, it's supposed to close the directory. That's fine - but that leads me to believe that for some reason, closeDir is *not* set. Why? Under what circumstances would this not be true, and

Re: IndexSearcher and number of occurence

2005-01-13 Thread Morus Walter
Bertrand VENZAL writes: I'm quite new to this mailing list. I am having difficulty finding the number of occurrences of a word in a document. I need to use IndexSearcher because of the query, but the score returned is not what I'm looking for. I found in the mailing list the class TermDoc but it

Re: Best way to find if a document exists, using Reader ...

2005-01-14 Thread Morus Walter
Praveen Peddi writes: Does it make sense to call docFreq or termDocs (whichever is faster) before calling delete? IMO no. Calling termDocs is what Reader.delete(Term) does: public final int delete(Term term) throws IOException { TermDocs docs = termDocs(term); if (docs ==

RE: closing an IndexSearcher

2005-01-20 Thread Morus Walter
Hi Cocula, And now here is code that works: the only difference from the previous one is the QueryParser call before new IndexWriter. The QueryParser.parse statement seems to close the IndexReader but I really can't figure out how. I rather suspect your OS/filesystem to delay the effect

Re: English and French documents together / analysis, indexing, searching

2005-01-20 Thread Morus Walter
[EMAIL PROTECTED] writes: you could try to create a more complex query and expand it into both languages using different analyzers. Would this solve your problem ? Would that mean I would have to actually conduct two searches (one in English and one in French) then merge the results

Re: Newbie: Human Readable Stemming, Lucene Architecture, etc!

2005-01-20 Thread Morus Walter
Owen Densmore writes: 1 - I'm a bit concerned that reasonable stemming (Porter/Snowball) apparently produces non-word stems .. i.e. not really human readable. (Example: generate, generates, generated, generating - generat) Although in typical queries this is not important because the

Re: document numbers

2005-01-31 Thread Morus Walter
Hi Jonathan, Yet another burning question :-). Can someone explain how the document numbers in Lucene documents work? For example, the TermDocs.doc() method returns the current doc number. How can I get this doc number if I just have a Document? I don't think you can. A document does

Re: Disk space used by optimize

2005-02-06 Thread Morus Walter
Bernhard Messer writes: However, three times the space sounds a bit too much, or I make a mistake in the book. :) there already was a discussion about disk usage during index optimize. Please have a look to the developers list at: http://mail-archives.apache.org/eyebrowse/[EMAIL

Re: sounds like spellcheck

2005-02-09 Thread Morus Walter
Aad Nales writes: Steps 2 and 3 have been discussed at length in this forum and have even made it to the sandbox. What I am left with is 1. My thinking is processing a series of replacement statements that go like: -- g sounds like ch if the immediate predecessor is an s. o sounds like

RE: Concurrent searching re-indexing

2005-02-17 Thread Morus Walter
Paul Mellor writes: 1. If IndexReader takes a snapshot of the index state when opened and then reads the files when searching, what would happen if the files it takes a snapshot of are deleted before the search is performed (as would happen with a reindexing in the period between opening an

Re: select where from query type in lucene

2005-02-18 Thread Morus Walter
Miles Barr writes: On Fri, 2005-02-18 at 03:58 +0100, Miro Max wrote: how can i search for content where type=document or (type=document OR type=view). actually i can do it with: (type:document OR type:entry) AND queryText as QueryString. but does exist any other better way to realize

Re: Search performance with one index vs. many indexes

2005-02-27 Thread Morus Walter
Jochen Franke writes: Topic: Search performance with large numbers of indexes vs. one large index My questions are: - Is the size of the wordlist the problem? - Would we be a lot faster, when we have a smaller number of files per index? sure. Look: Index lookup of a word is O(ln(n))
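
The reply is cut off here, so the following is only a back-of-the-envelope sketch of one way to read the O(ln(n)) remark, not a reconstruction of the rest of the message. It assumes the same total number of terms is either kept in one index or split evenly and without overlap across several indexes, and that every index must be consulted for each word. The class and method names are hypothetical.

```java
public class LookupCost {
    // Relative cost of one term lookup in a single index holding all terms.
    public static double oneIndex(long totalTerms) {
        return Math.log(totalTerms);
    }

    // Relative cost when the terms are split evenly over several indexes
    // and each index has to be searched (an assumption of this sketch).
    public static double manyIndexes(long totalTerms, int indexes) {
        return indexes * Math.log((double) totalTerms / indexes);
    }

    public static void main(String[] args) {
        long terms = 1_000_000L;
        System.out.println(oneIndex(terms));
        System.out.println(manyIndexes(terms, 100));
    }
}
```

Under these assumptions the logarithm barely shrinks when the index is split, while the number of lookups grows linearly with the number of indexes.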

Re: help with boolean expression

2005-02-27 Thread Morus Walter
Omar Didi writes: I have a problem understanding how Lucene would interpret this boolean expression: A AND B OR C. It returns neither the same count as (A AND B) OR C nor the same count as A AND (B OR C). If anyone knows how it is interpreted I would be thankful. Thanks A AND B OR C creates a

Re: Sorting date stored in milliseconds time

2005-02-27 Thread Morus Walter
Ben writes: I store my date in milliseconds, how can I do a sort on it? SortField has INT, FLOAT and STRING. Do I need to create a new sort class, to sort the long value? Why do you need that precision? Remember: there's a price to pay. The memory required for sorting and the time to set up
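
One possible reading of the precision question is that a coarser timestamp may be enough for sorting. The sketch below assumes minute resolution is acceptable, which the thread does not state, and converts a millisecond timestamp into a value that fits in an int, so it could be indexed in a field sorted as INT. The class and method names are hypothetical.

```java
public class CoarseDate {
    private static final long MILLIS_PER_MINUTE = 60_000L;

    // Minutes since the epoch, rounded down. Throws ArithmeticException
    // if the result does not fit in an int.
    public static int toMinutes(long epochMillis) {
        return Math.toIntExact(Math.floorDiv(epochMillis, MILLIS_PER_MINUTE));
    }

    public static void main(String[] args) {
        // 1109462400000 ms is 18491040 whole minutes.
        System.out.println(toMinutes(1_109_462_400_000L));
    }
}
```

An int holds about 2.1 billion minutes, roughly 4000 years, so minute resolution does not overflow for realistic dates.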

Re: Boost doesn't works

2005-02-28 Thread Morus Walter
Claude Libois writes: Hello. I'm using Lucene for an application and I want to boost the title of my documents. For that I use the setBoost method that is applied on the title field. However when I look with luke(1.6) I don't see any boost on this field and when I do a search the score isn't

Re: Boost doesn't works

2005-02-28 Thread Morus Walter
Claude Libois writes: The explanation given by the IndexSearcher tells me that the boost of my title is 1.0 where it should be 10.0. I really don't understand what is wrong. AFAIK you cannot get the boost of a field from the index because it's not stored as such. It's calculated in the
