Where do I get org.apache.commons.collections package sources?

2006-09-05 Thread Venkateshprasanna
I saw these classes and want to use them for my implementation as well. But I cannot find the source code for the specified package: org.apache.commons.collections. Is there any other way of implementing the same? Why do only classes from that package have to be used? Regards, Venkateshprasanna

Re: QueryParser returns all documents

2006-09-05 Thread lude
Why would you want to do this? This is a 'feature request' for our search engine. The user should have the possibility to query for all(!) documents. This would allow him to see all available documents listed. Is there a simple way to define a query that returns all documents of an index? Thanks

Re: QueryParser returns all documents

2006-09-05 Thread Ronnie Kolehmainen
You could define your own query syntax (for example, an empty string) for a query matching all docs, examine the query string before passing it to QueryParser, and create a MatchAllDocsQuery instead when you have a match.
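The pre-check described above could be sketched roughly as follows. This is only an illustration of the routing decision, not Lucene code: the class name MatchAllPrecheck and the choice of "blank string means all documents" are assumptions, and the comments mark where a real application would build a MatchAllDocsQuery or call QueryParser.

```java
// Hypothetical helper: decides whether a user query should bypass QueryParser.
class MatchAllPrecheck {

    /** Assumed convention: a null or blank query string means "return all documents". */
    static boolean isMatchAllRequest(String userQuery) {
        return userQuery == null || userQuery.trim().isEmpty();
    }

    /** Names the path to take; a real application would build the query here instead. */
    static String route(String userQuery) {
        if (isMatchAllRequest(userQuery)) {
            // Real code would create a MatchAllDocsQuery at this point.
            return "MATCH_ALL";
        }
        // Real code would hand the string to QueryParser at this point.
        return "PARSE";
    }

    public static void main(String[] args) {
        System.out.println(route(""));              // blank query
        System.out.println(route("sales manager")); // ordinary query
    }
}
```

Keeping the check outside the parser means the normal query syntax stays untouched; only the one reserved input is handled specially.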

Re: QueryParser returns all documents

2006-09-05 Thread Laurent Hoss
Why not add a single Field to each Document, like d.add(new Field("doctype", "document", Field.Store.YES, Field.Index.TOKENIZED)); Then searching for doctype:document returns all documents. -Laurent lude wrote: Why would you want to do this? This is a 'feature-request' of our searchengine.

IndexSearcher executed concurrently

2006-09-05 Thread jacky
Hi, The source code at the end is the class used for searching. 1. I wonder whether concurrent users can get the right results with different queries, since the class has only one IndexSearcher instance. 2. As we know, a new IndexSearcher can be created when a user submits his query. If first

Scoring based on fields and categorization

2006-09-05 Thread Gonçalo Gaiolas
Hi there, I need to make two changes to Lucene: - Scoring should take into consideration not only the relevance of the contents, but also two numerical values in other document fields. For example, let’s assume that the normal score for Document A is 0.33 (as calculated by Lucene).

Re: Where do I get org.apache.commons.collections package sources?

2006-09-05 Thread karl wettin
On Tue, 2006-09-05 at 02:38 -0700, Venkateshprasanna wrote: I saw these classes and want to use them for my implementation as well. But I am not getting the source code for the specified package: org.apache.commons.collections http://jakarta.apache.org/commons/collections/

Re: Scoring based on fields and categorization

2006-09-05 Thread karl wettin
On Tue, 2006-09-05 at 11:54 +0100, Gonçalo Gaiolas wrote: - Scoring should take into consideration not only the relevance of the contents, but also two numerical values in other document fields. For example, let’s assume that the normal score for Document A is 0.33 (as calculated by

Re: IndexSearcher executed concurrently

2006-09-05 Thread karl wettin
On Tue, 2006-09-05 at 17:57 +0800, jacky wrote: 1. I wonder whether concurrent users can get the right results with different queries, since the class has only one IndexSearcher instance. 2. As we know, a new IndexSearcher can be created when a user submits his query. If first method gets the right

RE: Scoring based on fields and categorization

2006-09-05 Thread Gonçalo Gaiolas
Hi Karl, Thanks for the super quick response! One question - should this boosting occur at index time or at query time? I'm a bit confused as to where I should apply this boost in order to affect the results of a search query. Once again, thanks a lot! Gonçalo

RE: Scoring based on fields and categorization

2006-09-05 Thread karl wettin
On Tue, 2006-09-05 at 13:32 +0100, Gonçalo Gaiolas wrote: should this boosting occur at index time or at query time? I'm a bit confused as to where I should apply this boost in order to affect the results of a search query. You boost at index time.

Re: IndexSearcher executed concurrently

2006-09-05 Thread jacky
Oh, that is great! I didn't notice this javadoc. Maybe I need to update my Lucene lib. I had thought that when one user submits his query, other queries might affect the result, since a single IndexSearcher is used. Forget these mails. Thanks a lot.

obtaining the number of documents stored in a .cfs file

2006-09-05 Thread Stanislav Jordanov
Suppose I have a bunch of valid .cfs files while the segments/segments.new file is missing or invalid. The task is to 'recover' the present .cfs files into a valid index. I think it will be necessary and sufficient to create a segments file that references the .cfs files. The only problem I've

Filter inside SpanQuery

2006-09-05 Thread Mark Miller
Has anybody experimented with a filter in a SpanQuery? Pipedream? Thanks, Mark

RE: Filter inside SpanQuery

2006-09-05 Thread Mark Miller
Okay, more realistically, does anyone have any experience with Randy Puttnick's modification of WildcardQuery and FuzzyQuery? Any ideas on getting something like those in a SpanQuery? - Mark

Re: Highlighting really found terms

2006-09-05 Thread mark harwood
See here for a thread reviewing the challenges and possible solutions associated with this problem: http://www.mail-archive.com/java-user@lucene.apache.org/msg02543.html An alternative highlighter implementation was recently contributed here:

Re: Highlighting really found terms

2006-09-05 Thread Karel Tejnora
Not for now, but I'd like to contribute span support soon. Karel An alternative highlighter implementation was recently contributed here: http://issues.apache.org/jira/browse/LUCENE-644?page=all I've not had the time to study this alternative in detail (I hope to soon) so I can't say if it

Re: Filter inside SpanQuery

2006-09-05 Thread Paul Elschot
On Tuesday 05 September 2006 15:59, Mark Miller wrote: Okay, more realistically, anyone have any experience with Randy Puttnick's modification of wildcardquery and fuzzyquery? Any ideas on getting something like those in a SpanQuery? You can use the IndexSearcher method that searches a query

Re: obtaining the number of documents stored in a .cfs file

2006-09-05 Thread Andrzej Bialecki
Stanislav Jordanov wrote: Suppose I have a bunch of valid .cfs files while the segments/segments.new file is missing or invalid. The task is to 'recover' the present .cfs files into a valid index. I think it will be necessary and sufficient to create a segments file that references the .cfs

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
: So, if I do as you suggest below (using PerFieldAnalyzerWrapper with : StandardAnalyzer) then I still need to enclose in quotes the phrases : (keywords with spaces) when I issue the search, and they are only returned Yes, quotes will be necessary to tell the QueryParser this is one chunk of

Re: Scoring based on fields and categorization

2006-09-05 Thread Chris Hostetter
: the contents, but also two numerical values in other document fields. For : example, let’s assume that the normal score for Document A is 0.33 (as : calculated by Lucene). What I need is that its true score is 0.33 * (value : of field A) * (value of field B). What is the best way to accomplish
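The scoring rule quoted above is plain multiplication, so the target value is easy to check by hand. A minimal sketch of the arithmetic follows; it does not touch Lucene's scoring machinery, the class name FieldWeightedScore is made up, and the field values 2.0 and 1.5 are invented for illustration (only 0.33 comes from the post).

```java
// Hypothetical helper showing the desired combined score, independent of Lucene.
class FieldWeightedScore {

    /** Multiplies the relevance score by the two numeric field values. */
    static double combine(double relevanceScore, double fieldA, double fieldB) {
        return relevanceScore * fieldA * fieldB;
    }

    public static void main(String[] args) {
        // 0.33 is the example score from the post; 2.0 and 1.5 are made-up field values.
        System.out.println(combine(0.33, 2.0, 1.5));
    }
}
```

With these made-up values the result is roughly 0.99, so a document with larger field values outranks one with the same relevance but smaller values.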

Re: WildcardFilter

2006-09-05 Thread Chris Hostetter
: Could someone with some experience spot-check this WildcardFilter...it seems : to work fine in simple testing, but I'd like to know if there are any : glaring deficiencies. Have not had much to do with filters before. It looks fine to me. -Hoss

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Mark Miller
Some info to help you on your journey :) 1. If you add a field as untokenized then it will not be analyzed when added to the index. However, QueryParser will not know that this happened and will tokenize queries on that field. 2. The solution that Hoss has explained to you is to leave the

jvm crashes on FieldCache.DEFAULT.getStrings(reader, field);

2006-09-05 Thread Doron Cohen
[discussion moved here from dev-list] Could it be an out-of-mem error? Can you run it with a debugger, to see what really happens? JVMs usually create a javacore file, and in case of an out-of-mem also a heapdump file - these give more info on the problem. In case this file was not created in

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Philip Brown
Here's a little sample program (borrowed some code from Erick Erickson :)). Whether I add as TOKENIZED or UN_TOKENIZED seems to make no difference in the output. Is this what you'd expect? - Philip package com.test; import java.io.IOException; import java.util.HashSet; import

parser question

2006-09-05 Thread Chris Salem
With all the parsers I have tried, a space in a query (for example, a search for sales manager) is interpreted as an OR. Is there a way to change it so that a space is interpreted as an AND? Chris Salem

Re: parser question

2006-09-05 Thread Mark Miller
QueryParser.setDefaultOperator(Operator op) Chris Salem wrote: With all the parsers I have tried a space in a query, such as doing a search for sales manager, interprets the space as an OR, is there a way to change it so that it interprets a space as an AND?
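In Lucene itself this would typically be a single call on the parser instance, such as parser.setDefaultOperator(QueryParser.AND_OPERATOR). To show what the setting changes, here is a toy sketch that makes the implicit operator between terms explicit. It is not Lucene's parser: the class DefaultOperatorDemo and its Operator enum are stand-ins invented for this example, and it ignores quotes, field names, and operators already present in the query.

```java
// Toy illustration of a "default operator": not Lucene's QueryParser.
class DefaultOperatorDemo {

    /** Stand-in for the parser's operator setting. */
    enum Operator { OR, AND }

    /** Joins whitespace-separated terms with the chosen operator. */
    static String explicitForm(String query, Operator defaultOp) {
        String[] terms = query.trim().split("\\s+");
        return String.join(" " + defaultOp + " ", terms);
    }

    public static void main(String[] args) {
        System.out.println(explicitForm("sales manager", Operator.OR));
        System.out.println(explicitForm("sales manager", Operator.AND));
    }
}
```

With OR as the default, a document containing either term matches; with AND, both terms must be present, which is the behaviour asked for in the question.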

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
1) Consider using JUnit tests. It makes it a lot easier for other people to understand your expectations, and if it winds up demonstrating a genuine bug in Lucene, it's easy to add to the test tree. 2) As I said before, your fields must be TOKENIZED, or your analyzer is irrelevant at index time.

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Philip Brown
Sorry for the confusion and thanks for taking the time to educate me. So, if I am just indexing literal values, what is the best way to do that (what analyzer)? Sounds like this approach, even though it works, is not the preferred method. analyzer = new

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
: Sorry for the confusion and thanks for taking the time to educate me. So, if : I am just indexing literal values, what is the best way to do that (what : analyzer)? Sounds like this approach, even though it works, is not the : preferred method. If you truly want just the literal values then

which way to index pdf,word,excel

2006-09-05 Thread James liu
I find that LIUS raises many questions, so I want to give it up and find something new. What would you recommend?

Re: which way to index pdf,word,excel

2006-09-05 Thread James liu
I want to find a framework that can index XML, Word, Excel, PDF, and so on, not just one format. 2006/9/6, Doron Cohen wrote: Lucene FAQ - http://wiki.apache.org/jakarta-lucene/LuceneFAQ - has a few entries just for this: How can I index HTML documents? How can I index XML documents? How can I index