Re: Possible bug in FieldSortedHitQueue?

2006-03-17 Thread Brian Riddle
Hi Paul, Then, if no comparator is found in the cache, a new one is created (line 193) and then stored in the cache (line 202). HOWEVER, both the cache lookup() and store() do NOT take into account locale; if we, on the same index reader, try to do one search sorted by Locale.FRENCH and one
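
The caching problem described in this message can be illustrated with a minimal sketch in plain Java, without Lucene on the classpath. The names ComparatorCache, Key, and lookupOrCreate are hypothetical and are not taken from FieldSortedHitQueue; the sketch only assumes that a cache keyed on the field alone would hand back whichever locale's comparator was stored first, whereas a key that also includes the locale keeps the two apart.

```java
import java.text.Collator;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; not Lucene's actual cache implementation.
final class ComparatorCache {

    // Cache key that includes the locale, so that two sorts on the same
    // field with different locales do not share one comparator.
    static final class Key {
        final String field;
        final Locale locale;

        Key(String field, Locale locale) {
            this.field = Objects.requireNonNull(field);
            this.locale = Objects.requireNonNull(locale);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return field.equals(other.field) && locale.equals(other.locale);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, locale);
        }
    }

    private final Map<Key, Collator> cache = new HashMap<>();

    // Returns the cached collator for (field, locale), creating and
    // storing a new one on a cache miss.
    Collator lookupOrCreate(String field, Locale locale) {
        Key key = new Key(field, locale);
        Collator collator = cache.get(key);
        if (collator == null) {
            collator = Collator.getInstance(locale);
            cache.put(key, collator);
        }
        return collator;
    }
}
```

With a key like this, a lookup for ("title", Locale.FRENCH) followed by one for ("title", Locale.GERMAN) yields two distinct collators, while repeating the French lookup returns the cached instance.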

Re: TooManyClauses exception in Lucene (1.4)

2006-03-17 Thread Erik Hatcher
On Mar 17, 2006, at 6:15 AM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Thanks to everyone for the explanation. Given that RangeQuery is clearly unsuitable for our requirements, ConstantScoreRangeQuery looks ideal. However, we're building our queries (at the moment) using QueryParser. Is

Re: Lucene and Tomcat, too many open files

2006-03-17 Thread Erik Hatcher
What version of Lucene are you using? Are you removing the index completely and rebuilding it from scratch with the compound flag enabled (by default since 1.4)? You really shouldn't have massive numbers of files created when using the compound format, so I suspect something is fishy

Re: FunctionQuery example request

2006-03-17 Thread Paul Lynch
Thanks for all the replies to my previous posting, I was not subscribed to the list properly and did not see all of the replies. Please disregard this post. Thanks again, Paul --- Paul Lynch [EMAIL PROTECTED] wrote: Hi, have implemented the DistanceComparatorSource example from Lucene In

Appending * to each search term

2006-03-17 Thread Florian Hanke
Hello all, I'd like to append an * (create a WildcardQuery) to each search term in a query, such that a query that is entered as e.g. term1 AND term2 is modified (effectively) to term1* AND term2*. Parsing the search string is not very elegant (of course). I'm thinking that overriding

Re: Appending * to each search term

2006-03-17 Thread Eric Jain
Florian Hanke wrote: I'd like to append an * (create a WildcardQuery) to each search term in a query, such that a query that is entered as e.g. term1 AND term2 is modified (effectively) to term1* AND term2*. Parsing the search string is not very elegant (of course). I'm thinking that

Best practice setup in multi server environment?

2006-03-17 Thread jens bertheau
Hi, I am currently working on switching from MySQL fulltext search to Lucene. The indexing and searching already works pretty well. I have the following environment: 1 web server running PHP 1 MySQL server (which will still be used, but not for fulltext queries) 1 server running Lucene The

Re: Appending * to each search term

2006-03-17 Thread Florian Hanke
Thank you very much - that did the trick! :) On 17.03.2006, at 13:51, Eric Jain wrote: Perhaps you could subclass the QueryParser and override the getFieldQuery method: protected Query getFieldQuery(String field, String term) { return new PrefixQuery(new Term(field, term)); }

Re: Appending * to each search term

2006-03-17 Thread Erik Hatcher
Interestingly, the last two consulting jobs I've had dealt with this very issue - having user-entered terms be interpreted as partial strings to match in any indexed term. Care must be taken to avoid the classic TooManyClauses exception or a more insidious OutOfMemory exception. By using

Re: Throughput doesn't increase when using more concurrent threads

2006-03-17 Thread Peter Keegan
I did some additional testing with Chris's patch and mine (based on Doug's note) vs. no patch and found that all 3 produced the same throughput - about 330 qps - over a longer period. So, there seems to be a point of diminishing returns to adding more cpus. The dual core Opterons (8 cpu) still win

Re: Non scoring search

2006-03-17 Thread Peter Keegan
I experimented with this by using a Similarity class that returns a constant (1) for all values and found that it had no noticeable effect on query performance. Peter On 12/6/05, Chris Hostetter [EMAIL PROTECTED] wrote: : I was wondering if there is a standard way to retrieve documents WITHOUT :

Re: Lucene job

2006-03-17 Thread adasal
Otis, I'm interested in the remote option. With thanks, Adam On 17/03/06, Otis Gospodnetic [EMAIL PROTECTED] wrote: Hello, Somebody asked me if I knew any good Lucene people who'd be interested in some work that involves a good amount of Lucene... Here is some info. The company is in New York

Grouping results by choosen field

2006-03-17 Thread Java Programmer
Hello, I tried to search for a solution myself, but without any good result, so I want to ask the group. My problem concerns result grouping; the best example would be Google search, where you have results sorted by relevance and also grouped by domain (they have a little indent/margin). In my project I

Re: Throughput doesn't increase when using more concurrent threads

2006-03-17 Thread Doug Cutting
Peter Keegan wrote: I did some additional testing with Chris's patch and mine (based on Doug's note) vs. no patch and found that all 3 produced the same throughput - about 330 qps - over a longer period. Was CPU utilization 100%? If not, where do you think the bottleneck now is? Network? Or

Unnormalized score

2006-03-17 Thread Nick Atkins
Hi, Apparently there is a way of retrieving the unnormalized score from a Hit but I have been unable to track this down. I need to return this value because an external client wants to compile results from multiple queries itself. Any help much appreciated. Cheers, Nick.

Re: Grouping results by choosen field

2006-03-17 Thread Chris Hostetter
I believe the topic you are referring to is typically referred to as clustering ... you may want to search for that. I've never really looked at it, but carrot2 seems to be a favorite among those who do result clustering. : Date: Fri, 17 Mar 2006 16:36:44 +0100 : From: Java Programmer [EMAIL

Re: Unnormalized score

2006-03-17 Thread Chris Hostetter
: Apparently there is a way of retrieving the unnormalized score from a : Hit but I have been unable to track this down. I need to return this : value because an external client wants to compile results from multiple : queries itself. Not from the Hits class itself, but the raw score is

Re: Best practice setup in multi server environment?

2006-03-17 Thread Chris Hostetter
: 1 web server running PHP : 1 MySQL server (which will still be used, but not for fulltext : queries) : 1 server running Lucene : The Lucene index will be created out of the MySQL data. : : My question: How can I send a query from the webserver using PHP to : the : Lucene server and get

Re: Grouping results by choosen field

2006-03-17 Thread karl wettin
On 17 Mar 2006, at 16:36, Java Programmer wrote: My problem concerns result grouping, the best example will be Google search where you have results sorted by relevance, and also grouped by domain (they have little indent/margin). In my project I want to get similar functionality, without very

Possibility of a relational query?

2006-03-17 Thread Nina Khosravi
I need to issue a query of the kind that is typically performed on a relational database. I may have to give up on this idea, but I thought I might ask if there is a way to handle this type of query. Let's say my documents all have 2 fields, fieldA and fieldB. Is there a query that can return hits for all

Re: Lucene job

2006-03-17 Thread Michael Wechner
Erik Hatcher wrote: I'm increasingly getting more and more requests for Lucene consulting myself, and simply don't have the bandwidth to tackle most of them. I have said yes a few times recently though, so don't count me out though ;) If you are skilled with Lucene, and interested in

Re: Lucene job

2006-03-17 Thread Doug Cutting
Michael Wechner wrote: Maybe it would make sense to sort it alphabetically [ ... ] +1 This should be sorted alphabetically by business name or last name. That's what it says on the page, although a few entries are out of place. Please feel free to fix this. Doug

Re: Lucene and Tomcat, too many open files

2006-03-17 Thread Nick Atkins
Guys, thanks for your help yesterday, I solved my problem! I was actually using an IndexSearcher in another thread that I had forgotten all about. Whoever suggested that IndexReader was to blame was right on the money. I now make sure I close my Readers and, bingo, the open files are managed

another lucene-based application

2006-03-17 Thread Artem Vasiliev
Hi guys! I'd like to thank the developers and contributors of the Lucene project for the fantastic library. And thanks Otis and Erik for a great book! I'm writing an open source file searcher application 'sharehound' (http://sharehound.sourceforge.net/) based on Lucene. It can now search SMB file

Bug in Directory + FSDirectory (?)

2006-03-17 Thread Alexandru Popescu
Hi! This is my first post to the Lucene ML, so please excuse the following message if it is completely wrong :-). We are trying to upgrade Jackrabbit to support Lucene 1.9.1. As a first problem, we needed to change the access modifier of a method from protected to public, but this was not a problem. The

Re: Grouping results by choosen field

2006-03-17 Thread karl wettin
On 17 Mar 2006, at 21:01, karl wettin wrote: On 17 Mar 2006, at 16:36, Java Programmer wrote: My problem concerns result grouping, the best example will be Google search where you have results sorted by relevance, and also grouped by domain (they have little indent/margin). In my project I

Re: another lucene-based application

2006-03-17 Thread Xia Dennis
what's the difference from dotLucene? 2006/3/18, Artem Vasiliev [EMAIL PROTECTED]: Hi guys! I'd like to thank the developers and contributors of Lucene project for the fantastic library. And thanks Otis and Erik for a great book! I'm writing an open source file searcher application

Re: About index deletion

2006-03-17 Thread Xia Dennis
Add a field to store the time at which you add each document to the index. 2006/3/17, hu andy [EMAIL PROTECTED]: Because I will delete the indexed documents periodically, the index files must be deleted after that. If I just want to delete some documents added before some past day from the index, how should I do
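
One way to read this suggestion, sketched in plain Java without Lucene on the classpath: store the date each document was added as a fixed-width yyyyMMdd string, so that lexicographic order matches chronological order and "older than a cutoff" becomes a simple string comparison or range. The class and method names below (IndexedDateField, toFieldValue, isOlderThan) are hypothetical, and the actual deletion call is left out because it depends on the Lucene version in use.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; shows only how a sortable date value could be built.
final class IndexedDateField {

    // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd, e.g. 20060317.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.BASIC_ISO_DATE;

    // Value to store in the (assumed) "added" field of each document.
    static String toFieldValue(LocalDate added) {
        return added.format(FORMAT);
    }

    // True if a stored field value is strictly before the cutoff date.
    // Works because fixed-width yyyyMMdd strings sort chronologically.
    static boolean isOlderThan(String fieldValue, LocalDate cutoff) {
        return fieldValue.compareTo(toFieldValue(cutoff)) < 0;
    }
}
```

Documents whose stored value sorts before the cutoff string are the candidates for deletion; a value equal to the cutoff is not treated as older.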

Re[2]: another lucene-based application

2006-03-17 Thread Artem Vasiliev
Hello Xia, XD what's the difference from dotLucene? Why dotLucene? dotLucene is the .Net port of Lucene, so your question is pretty much the same as 'what's the difference from Lucene?' dotLucene, like Lucene itself, is not a search application; it's a library, so that's the difference :). Some of