slow brown fox jumped over the lazy dog
If I searched for "quick brown", is there a way I could see that it was hit
4 times within the document?
Thanks,
Jeff
If I am not mistaken, that is for a term. Is it possible for a query? In
the example below, I don't want to know how many times "brown" is in the
document; I want to know how many times "quick brown" is in the document.
Thanks,
Jeff
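One way to get per-document counts for a phrase is the span query API. A rough
sketch, assuming a Lucene 2.x-era index with a "contents" field (the path and
field name are only illustrative):
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.*;

public class PhraseCounts {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open("/path/to/index");
        // "quick" immediately followed by "brown": slop 0, in order
        SpanNearQuery phrase = new SpanNearQuery(new SpanQuery[] {
                new SpanTermQuery(new Term("contents", "quick")),
                new SpanTermQuery(new Term("contents", "brown")) }, 0, true);
        Spans spans = phrase.getSpans(reader);
        int doc = -1, count = 0;
        while (spans.next()) {                 // one step per matching position
            if (spans.doc() != doc && doc != -1) {
                System.out.println("doc " + doc + ": " + count + " occurrences");
                count = 0;
            }
            doc = spans.doc();
            count++;
        }
        if (doc != -1) System.out.println("doc " + doc + ": " + count + " occurrences");
        reader.close();
    }
}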
On Dec 20, 2007 3:03 PM, Mark Miller <[EMAIL
I found that reducing my index from 8G to 4G (through not stemming) gave me
about a 10% performance improvement.
How did you do this? I don't see this as an option.
Jeff
eperator. Is there an easy way to add ',' as a token separator?
Thanks,
-Jeff
Hi all,
I only want to index the latest week's data; the previous data can
be deleted. So I'd like to know about Lucene's delete performance and
whether it will have an impact on search performance when I do lots of
delete operations in the meantime. Thanks
--
Best Regards
been fixed and/or reduced in later versions (say 5.x or 6.x)?
Thank you for any info.
Jeff Wallace
Software Development, FileNet
IBM Corp.
1540 Scenic Ave.
Costa Mesa, CA 92626
(714) 327-7163 direct
Hello,
I have been looking into tuning the garbage collector for Solr. I found
this entry on the Lucene wiki that seems to be out of date.
The bug it references is reported as resolved now. Could someone confirm
whether it is safe to use G1 garbage collection with Lucene?
"Do not, under any circum
document and I thought I would treat each field as
a keyword to minimize processing.
Assuming you have clusters operating on independent datasets (so I guess it
would scale linearly) and you want to process Terabytes of logs per day,
is such a solution even feasible?
Thank you,
Jeff Capone
ligion" in documents published within a range of dates.
Thanks
Jeff
On May 10, 2009, at 11:35 AM, Uwe Schindler wrote:
You can get this list using IndexReader.terms(new Term(fieldname, "")). This
returns an enumeration of all terms starting with the given one (the field
name). Just
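A minimal sketch of that loop, assuming the TermEnum API of that era (the
field name and index path are illustrative):
import org.apache.lucene.index.*;

public class ListFieldTerms {
    public static void main(String[] args) throws Exception {
        String field = "category";
        IndexReader reader = IndexReader.open("/path/to/index");
        TermEnum terms = reader.terms(new Term(field, ""));
        try {
            do {
                Term t = terms.term();
                if (t == null || !t.field().equals(field)) break;  // ran past the field
                System.out.println(t.text());
            } while (terms.next());
        } finally {
            terms.close();
            reader.close();
        }
    }
}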
omplish this? Right now I am having to
hit a lookup table to translate the city before searching against the
main index - not a fan of this option.
Thanks.
-Jeff Plater
Thanks - I tried it out and it seems to work for "Philadelphid~0.75 PA" but I
can't get it working for "Phil* PA" yet. Perhaps it is an issue with my
Analyzer (I am using WhitespaceAnalyzer)? Have you used it with wildcards
before?
-Jeff
-Original Messag
Thanks for the suggestion - I double checked the case and it was OK.
Turned out I needed to use the StandardAnalyzer instead of the
WhitespaceAnalyzer.
-Jeff
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, November 11, 2009 6:52 PM
To: java-user
words and such) which can produce an invalid sort order?
Thanks.
-Jeff
Thanks - so if my sort field is a single term then I should be ok with
using an analyzer (to lowercase it for example).
-Jeff
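A small sketch of that setup, assuming a separate single-token field reserved
for sorting (the field names are illustrative; this is a fragment of indexing
and search code, not a complete program):
// At index time: the searchable field stays analyzed; the sort field is one
// lowercased, untokenized token.
doc.add(new Field("title", title, Field.Store.YES, Field.Index.ANALYZED));
doc.add(new Field("titleSort", title.toLowerCase(), Field.Store.NO,
                  Field.Index.NOT_ANALYZED));

// At search time: sort on the single-token field.
Sort sort = new Sort(new SortField("titleSort", SortField.STRING));
TopDocs results = searcher.search(query, null, 10, sort);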
-Original Message-
From: J.J. Larrea [mailto:j...@panix.com]
Sent: Monday, November 16, 2009 11:19 AM
To: java-user@lucene.apache.org
Subject: Re: Sort fields
h time you won't be able to use wildcard searching (unless you don't care
about wildcard searching).
-Jeff
-Original Message-
From: Michel Nadeau [mailto:aka...@gmail.com]
Sent: Mon 12/14/2009 4:36 PM
To: java-user@lucene.apache.org
Subject: Lower/Uppercase problem when searchi
>
>
--
Best Regards
Jeff Zhang
ow which one is better, any help is appreciated.
--
Best Regards
Jeff Zhang
>
>
>
>
--
Best Regards
Jeff Zhang
a.org/wiki/DRBD
Thanks,
Jeff
ere a way to default to no
slop, preferably without changing all of our queries?
Thanks for any pointers.
Jeff
We're trying to perform a query where if our intended search term/phrase is
part of a specific larger phrase, we want to ignore that particular match,
but not the entire document (unless of course there are no other hits with
our intended term/phrase). For example, a query like:
"white house"
I want to be able to put sets of data in a very structured way and
query Lucene for only 100% matches. Is there a way to do this? The best I seem
to be getting back is 0.30685282. I appreciate any help and insight.
Jeff Richley, Vice President
Southeast Virginia Java Users Group
[EMAIL
Ah, good question. The data that I need to query is not a fixed set of
tables or columns like a database has. Let me give two
examples:
1.) I have data like name="Jeff" lastname="Richley" age="33" and I need to
be able to query by any combination such
help would be greatly appreciated.
>
> : 1.) I have data like name="Jeff" lastname="Richley" age="33" and I need
> to
> : be able to query by any combination such as name="Jeff" age="33". But
> if
> : I query with name=&qu
;, "/a/b/c",
Field.Store.YES,
Field.Index.UN_TOKENIZED);
document.add(location);
Field name = new Field("name", "Jeff Richley",
Field.Store.YES,
ueryParser to build your queries for you, use the KeywordAnalyzer
> to
> : > make sure no lowercasing or stemming takes place.
> : > 2) OMIT_NORMs when indexing .. they only matter if you want the
> lengths
> : > of fields to affect the score, and you don't -- you only want t
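A short sketch of what those two suggestions look like together, assuming
2.x-era classes (the field and value names are illustrative):
// 1) Exact-match field, indexed untokenized and with norms omitted.
Document doc = new Document();
doc.add(new Field("name", "Jeff", Field.Store.YES, Field.Index.NO_NORMS));

// 2) Parse queries with KeywordAnalyzer so no lowercasing or stemming happens.
QueryParser parser = new QueryParser("name", new KeywordAnalyzer());
Query q = parser.parse("name:Jeff");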
Hi. I'm using Lucene to do some searching (using the Searcher object and
passing it a parsed Query). I search for a word such as "long" and it is
returning partial matches, such as "belong" and "along." Is there a way
to turn off this behavior and only match whole words?
Thank you,
Jeff
IndexSearcher to search the parsed
query created in Step 3.
That's it. Is this the proper way to be doing searching?
Thanks.
Jeff
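For reference, a minimal sketch of that flow, assuming an existing on-disk
index (the path, field names, and query text are illustrative):
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.*;

public class SimpleSearch {
    public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher("/path/to/index");
        Query query = new QueryParser("contents", new StandardAnalyzer()).parse("long");
        Hits hits = searcher.search(query);
        for (int i = 0; i < hits.length(); i++) {
            System.out.println(hits.score(i) + "\t" + hits.doc(i).get("title"));
        }
        searcher.close();
    }
}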
-Original Message-
From: Paul Borgermans [mailto:[EMAIL PROTECTED]
Sent: Saturday, November 11, 2006 3:06 PM
To: java-user@lucene.apache.org
Subject: Re: Partial
arch for the term "yellow~" I might get something like "bellow." Is
there a way to list what Lucene found in the document that made it
relevant?
Thanks for all the help.
Jeff
-Original Message-
From: Paul Borgermans [mailto:[EMAIL PROTECTED]
Sent: Saturday, November 11,
Erick,
Very useful answers -- I'll be reading up more with the links you've
provided.
Thanks.
Jeff
-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: Saturday, November 11, 2006 5:51 PM
To: java-user@lucene.apache.org
Subject: Re: Partial Word Matches
Thanks for the quick reply. I'll be implementing this in the next couple
of days. Appreciate it!
Jeff
-Original Message-
From: Stephan Spat [mailto:[EMAIL PROTECTED]
Sent: Monday, November 20, 2006 8:43 AM
To: java-user@lucene.apache.org
Subject: Re: Q: Highlighter + Search sy
rmB"~99)
I did this playing around with table cells, and it seems to work so far.
Jeff
rossini wrote:
>
> Actually no,
>
>Because I'd like to retrieve terms that were computed on the same
> instance of Field. Taking your example to illustrate better, I have 2
>
od to the buffer for each parent
element. Then I removed the current element and added its content as a
Field.
I should add that I am also fairly new to Lucene, so just because I did it
that way doesn't mean it's the best or even a good way.
Jeff
Spencer Tickner wrote:
>
&
do something like this (in search
pseudocode):
sent:(expired num[1 TO 5] "days ago")
I don't see how to do this using either Lucene's QueryParser or the
QsolParser. Is it possible to do it using the Query API (and the appropriate
indexing changes)?
Thanks for any pointers.
thanks Erik
On 10/26/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
>
> On 26 Oct 2005, at 02:50, Jeff Rodenburg wrote:
> > I'm considering building out an index that will flatten a data
> > structure,
> > such that some Document "A" will have
Kevin -
Maybe I'm misunderstanding, but how is this not a BooleanQuery with two
clauses?
- j
On 10/26/05, Kevin L. Cobb <[EMAIL PROTECTED]> wrote:
>
> I've been using Lucene happily for a couple of years now. But, this new
> search functionality I'm trying to add is somewhat different that what
Hi John -
It sounds like you're thinking of your index in terms of sql constructs --
multiple rows for the same record. We do this very same thing with
categories; if you have a record that lives in multiple categories, just add
additional category field/value pairs for your original record. It's
've seen performance in terms of requests/second
drop by a factor of 10, compared to similar tests executing only search
requests (no sorts). CPU appears to be our bottleneck, and I'm trying to
determine if this is expected behavior or if we're outside the bounds of
typical performance.
Thanks,
jeff
(especially for numeric fields).
>
> If you haven't already, you should compare the query times of a
> "warmed" searcher. Sorted queries will still take longer, but I
> haven't measured how much longer.
>
> -Yonik
> Now hiring -- http://forms.cnet.com/slink?
On 11/30/05, Daniel Pfeifer <[EMAIL PROTECTED]> wrote:
>
>
> 1.) Does Lucene's MultiSearcher implement some kind of automatic failover
> and/or load-balancing mechanism if both Searchables which I supply in
> MultiSearchers constructor go to two different servers but to the very same
> index-files?
George -
There are a number of SQL Server specific ways you can do this. Email me
off-list as the solution is not relevant to Lucene.
-- j
On 12/2/05, George Abraham <[EMAIL PROTECTED]> wrote:
>
> All,
> I have created a Lucene index from data in a SQL Server db. When I conduct
> a
> Lucene sea
In one of the Google Labs whitepapers (
http://labs.google.com/papers/mapreduce-osdi04.pdf), a programming construct
known as MapReduce is used in a variety of jobs/tasks within Google's
operation. As an example of the application of MapReduce, the whitepaper
refers to Distributed Sorting.
Essent
thanks Erik
On 12/3/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
>
> On Dec 3, 2005, at 1:26 PM, Jeff Rodenburg wrote:
>
> > In one of the Google Labs whitepapers (
> > http://labs.google.com/papers/mapreduce-osdi04.pdf), a programming
> > construct
> >
Check out Chris Hostetter's methodology for doing this at cnet.
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200508.mbox/[EMAIL
PROTECTED]
This sounds like it matches your requirements.
cheers,
j
On 12/7/05, Ching-Pei Hsing <[EMAIL PROTECTED]> wrote:
>
> Has anyway solved the foll
Well done, Grant. Very informative.
Question on Term Vectors: with their inclusion in an index, have you noticed
any degradation in performance, either from a search efficiency or
maintenance point-of-view? Given the power of term vectors, if the perf
impact is negligible, I'm curious to the re
index file?
I start the JVM with 800MB.
thanks,
Jeff
field that should retrieve a lot of
records, it normally throws the exception.
I will look at MultiSearcher. Do you think splitting the index based
on a date field is a good choice? I somehow feel it requires a lot of
coding to create many indexes based on the date field.
Thanks,
I'm very interested in incorporating smart geographic querying capabilities
(distance calcs are just scratching the surface) into Lucene and came across
this whitepaper:
http://www.clef-campaign.org/2005/working_notes/workingnotes2005/leidner05.pdf
Just curious, has anyone ventured down this path
One way to do this (depending on your system and index size) is to remove
and add every url you find. This would ensure that every document in the
index is unique. No need to worry about sorting and iteration and doc_ids
and the like.
It rebuilds your entire index, but if you have a duplication
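A rough sketch of the remove-then-add idea, assuming the URL is indexed as a
single untokenized token in a "url" field (variable names are illustrative;
later Lucene versions wrap the two steps in IndexWriter.updateDocument):
// Make the URL the unique key: delete any existing doc with it, then add the new one.
IndexReader reader = IndexReader.open(dir);
reader.deleteDocuments(new Term("url", url));   // no-op if the URL isn't indexed yet
reader.close();

IndexWriter writer = new IndexWriter(dir, analyzer, false);  // false = append to existing index
writer.addDocument(doc);
writer.close();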
Have you considered evaluating doc-score thresholds for limiting your
results? Since the perfect answers to these situations lie in the constant
tweaking and twiddling of analysis and tokenization, one way I've found to
help is to evaluate result scores. In your "Ontario CA" example, limiting
res
Vikas -
Start with the RemoteSearchable class. Technology will be RMI.
Hope this helps.
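A rough sketch of the moving parts, assuming an RMI registry is already
running on each host (host names and index paths are illustrative):
// Server side: export a local searcher over RMI.
Searchable local = new IndexSearcher("/path/to/index");
Naming.rebind("//serverA/LuceneSearchable", new RemoteSearchable(local));

// Client side: look up each remote index and search them together.
Searchable a = (Searchable) Naming.lookup("//serverA/LuceneSearchable");
Searchable b = (Searchable) Naming.lookup("//serverB/LuceneSearchable");
Searcher searcher = new MultiSearcher(new Searchable[] { a, b });
Hits hits = searcher.search(query);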
On 2/2/06, Vikas Khengare <[EMAIL PROTECTED]> wrote:
>
> Hi Friends
>
> How do I send one search query to multiple search Indexes which are
> on remote machines ?
>
> Which Technology will help me (A
to tackle this problem with Lucene or another api if doing so makes more
sense?
Thanks,
Jeff
You can generate a token stream for a block of text without having to index
it. Take a look at the highlighter code, it does this very thing.
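A minimal sketch of doing that directly, assuming the TokenStream API of that
era (the field name and sample text are illustrative):
import java.io.StringReader;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class TokenizeOnly {
    public static void main(String[] args) throws Exception {
        Analyzer analyzer = new StandardAnalyzer();
        // Tokenize a block of text without touching an index.
        TokenStream stream = analyzer.tokenStream("contents",
                new StringReader("The quick brown fox jumped over the lazy dog"));
        for (Token token = stream.next(); token != null; token = stream.next()) {
            System.out.println(token.termText());
        }
        stream.close();
    }
}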
On 2/5/06, Jeff Thorne <[EMAIL PROTECTED]> wrote:
>
> I am trying to figure out whether or not Lucene is an appropriate solution
> for a p
The site will have a million+ posts. I am not familiar with Bayesian
algorithms. Is there an off-the-shelf API that can provide this type of
capability? As for performance, would Bayesian be the way to go over Lucene?
Thanks for the help,
Jeff
-Original Message-
From: gekkokid [mailto
ted approximately 145
clauses within the final constructed query. In validation testing, this
approach has proven to be:
1) Accurate.
2) Performant (thus far).
At last, my question to everyone who cares to respond (and read this far):
feedback?
Thanks,
-- jeff
el [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, February 28, 2006 2:49 PM
> To: java-user@lucene.apache.org
> Subject: RE: Hacking proximity search: looking for feedback
>
> Jeff -
>
> This is an interesting approach. On our end, we have experimented with
> two variants:
>
&g
component
of relevance? We have a need for distance sorting, but I'm trying to slay
that beast at a later stage.
-- jeff
On 2/28/06, Bryzek.Michael <[EMAIL PROTECTED]> wrote:
>
> Jeff -
>
> This is an interesting approach. On our end, we have experimented with
> two va
the notes.
-- jeff
On 2/28/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
>
> : Geo definition:
> : Boxing around a center point. It's not critical to do a radius search
> with
> : a given circle. A boxed approach allows for taller or wider frames of
> : reference
FunctionQueries to influence your scores based on distance from the
> center of the box.
>
> :
> : Great feedback, thanks for the notes.
> :
> : -- jeff
> :
> : On 2/28/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> : >
> : >
> : > : Geo d
Very good note, I missed that. I need the development environment in front
of me to remember all the different class names correctly. ;-)
-- j
On 3/1/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
>
> Jeff Rodenburg wrote:
> > Following on the Range Query approach, how is per
Raul -
You'll want to look at the MultiSearcher and ParallelMultiSearcher classes
for this.
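A minimal sketch, assuming two local indexes (the paths are illustrative); the
hits come back as one relevance-ordered list:
Searchable[] indexes = {
    new IndexSearcher("/path/to/index1"),
    new IndexSearcher("/path/to/index2")
};
Searcher searcher = new MultiSearcher(indexes);   // or new ParallelMultiSearcher(indexes)
Hits hits = searcher.search(query);               // one ranked result set across both indexes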
On 3/3/06, Raul Raja Martinez <[EMAIL PROTECTED]> wrote:
>
> Is it possible to search many indexes in one query and get back the Hits
> ordered by relevance?
>
> Can someone point me out to some document o
We've done this, and it's not that complex. (Sorry, client won't allow me
to release the code.)
It's AJAX on the front end, so that background call is simply executing a
search against an index that consists of the aggregated search terms. We do
wildcard queries to get the results we want. For u
data types, etc.
I'm working on this mostly for myself, but if anyone is interested just send
me an email off-list.
cheers,
-- jeff r.
Does anyone have a lead on "business" stop words? Things like "inc", "llc",
"md", etc.
I'd rather not reinvent this wheel. :-)
cheers,
jeff
I run Lucene.Net as well, and your indexing performance depends on many
factors besides whether you're using the Java or C# version. As a basic
suggestion, learn what you can about minMergeDocs and mergeFactor as well as
the compound file format. Try different combinations to understand w
that use a high number of clauses, but another set that needs a low number
of clauses (different indexes searched, and efficiencies dictate the
high/low clause range.)
cheers,
jeff
y can
sometimes cause problems when both types of queries need to execute
simultaneously.
-- j
On 4/15/06, Paul Elschot <[EMAIL PROTECTED]> wrote:
>
> On Saturday 15 April 2006 18:20, Jeff Rodenburg wrote:
> > What was the thinking behind making the BooleanQuery maxClauseCount a
> &
Marc -
We built our index maintenance operation to assume a breakdown would occur
in process (because it happened several times.) We exist in an environment
where "always on, always available" is a business requirement. We also do a
lot of updates on a cyclical basis (every 10 minutes), so malf
The Keyword analyzer does no stemming or input modification of any sort:
think of it as WYSIWYG for index population. The Whitespace analyzer simply
splits your input on whitespace (still no stemming), so the tokens are the
individual words. I don't have the code in front of me, so I'm not sure
3.6Ghz I think.) I frankly haven't tested out
scalability yet.
Jeff
Emptoris, Inc.
-Original Message-
From: Vladimir Olenin [mailto:[EMAIL PROTECTED]
Sent: Monday, June 26, 2006 7:56 AM
To: java-user@lucene.apache.org
Subject: search performance benchmarks
Hi,
I'm evaluat
I have a clustered environment, with a load-balancer in the front
assigning connections. Is it better to have one of the cluster running
a searcher as a webservice (to be accessed by the other machines in the
cluster) or to have a IndexReader/Searcher for each machine in the
cluster?
Jeff
Heh, you said it better than I. I was just about to reply with the
witty "Nutch is Lucene, isn't it?"
Jeff
-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Friday, July 07, 2006 10:28 AM
To: java-user@lucene.apache.org
Subject: Re: Nutch- Bet
but I
would like to understand the bounds of the problem a bit better.
Any advice?
Thanks,
Jeff Schnitzer
SubEtha Mailing List Manager - http://subetha.tigris.org/
Hi Mark -
Having gone down this path for the past year, I echo comments from others
that scalability/availability/failover is a lot of work. We migrated away
from a custom system based on Lucene running on Windows to Solr running on
Linux. It took us 6 months to get our system to a solid five-n
Why is a single server so important? I can scale horizontally much cheaper
than I scale vertically.
On 8/11/06, Mark Miller <[EMAIL PROTECTED]> wrote:
I've made a nice little archive application with lucene. I made it to
handle our largest need: 2.5 million docs or so on a single server. Now
On 8/12/06, Mark Miller <[EMAIL PROTECTED]> wrote:
The single server is important because I think it will take a lot of
work to scale it to multiple servers. The index must allow for close to
real-time updates and additions. It must also remain searchable at all
times (other than than during the
Have you considered left-padding your numbers with zeros to make each
number a string of the same length?
e.g., the number 5 would be indexed/queried as "00005", which can be
correctly compared to 10 ("00010"), 2345 ("02345"), etc. in a lexical
comparison...
Jeff
O
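For example, a minimal sketch of the padding, assuming a fixed width of five
digits (the width is just for illustration):
import java.text.DecimalFormat;

public class PadNumbers {
    private static final DecimalFormat PAD = new DecimalFormat("00000");

    // Pad numbers to a fixed width so that lexical order equals numeric order.
    public static String pad(long n) {
        return PAD.format(n);            // 5 -> "00005", 10 -> "00010", 2345 -> "02345"
    }

    public static void main(String[] args) {
        System.out.println(pad(5));                            // index and query this padded form
        System.out.println("00005".compareTo("00010") < 0);    // true, as desired
    }
}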
this a bug, or is it intentional,
or am I missing something?
Thanks,
Jeff
That fix works perfectly, as far as I can tell.
As for the unit test, it should actually be:
assertEquals("192.168.0.15\\public", discardEscapeChar
("192.168.0.15\\\\public"));
Jeff
On 7/20/05, Eyal <[EMAIL PROTECTED]> wrote:
> I think this sh
s. I'm running it through the
query parser at present; if I end up sticking with this method I'll
just build a BooleanQuery with a clause for each log type to avoid the
parsing overhead.
Other than the BooleanQuery, is there a more efficient way of
explored the concept of executing multiple sub-searches
to get filters and their subsequent counts, but my requirements allows me a
sustainable, smaller set of potential dynamic filters. This is the concept,
haven't put it into practice so have no idea if it scales any better than
the brute force method.
-- jeff
tc.? What would
you do differently if you were starting from scratch?
Cheers from sunny Seattle,
jeff r.
Ids.
Should I be looking at BooleanQuery, QueryFilter or a custom filter?
QueryFilter (or customfilter, as I may pull the id values from a db) seem
like the *best* approach for my scenario. I'm just looking for feedback from
others who have gone this route and what their experiences yielded.
Thanks,
jeff
Is there a consensus or estimate on when v1.9 will be considered a stable
release? I'm prepping a deployment on v1.4.3 but would like an idea of when
1.9 might be considered stable in the eyes of the community.
-- Jeff Rodenburg
Mayday, mayday
Has anyone had recent contact with George Aroush? He's presently managing
the C# port of Lucene.
Thanks,
Jeff Rodenburg
nitial
thought is the problem lies in the custom filter I've created.
myCustomFilter extends Filter, and I'm following the BitSet comparative
example as found in the LIA book. I've done nothing in myCustomFilter
regarding caching.
I'm doubting this is a bug, but rather something I've overlooked.
thanks,
jeff r.
Might be the same issue, haven't been able to determine during a
step-through on the code exec.
You're right, no need to add a new FilteredQuery to the statement, just a
search on combinedQuery with a new myCustomFilter.
Unfortunately, no joy; same response.
-- j
On 9/13/05, Chris Hostetter <[E
uals() then there's your problem.
Will do the step-through following this manner and post the results.
-- j
: Date: Tue, 13 Sep 2005 17:22:49 -0700
> : From: Jeff Rodenburg <[EMAIL PROTECTED]>
> : Reply-To: java-user@lucene.apache.org, [EMAIL PROTECTED]
> : To: Chris Hoste
Good call, Chris. I followed the BitSet comparison route and found that
the custom filter was working exactly as it should, but *I* wasn't passing
it correct data. Rookie mistake.
Doh! I hate it when that happens.
-- j
On 9/13/05, Jeff Rodenburg <[EMAIL PROTECTED]> wrote:
>
indexed.
This is an operational question, so the *best* way depends on your overall
operation, as both of these approaches have consequences on index
maintenance operations.
Hope this helps.
-- jeff
On 9/17/05, Ben Gill <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I am storing
trimming the post further:
On 9/18/05, James Huang <[EMAIL PROTECTED]> wrote:
>
> >The problem is quite generic, I believe. What I like to do is similar to
> LIA-ch6, i.e. to find a "good Chinese Hunan-style restaurant near me." I
> prefer Hunan-style; however, if a good Hunan-style one is 12 m
plenty
of support on this mailing list, but you can educate yourself much more
effectively with that book. The authors lurk on this list. It's the cheapest
consulting ($40) you can get.
Cheers,
jeff
On 9/18/05, Kevin Stembridge <[EMAIL PROTECTED]> wrote:
>
>
> Would Lucene
I like Erik's suggestion here as a starting point. I would guess you might
find some direction in the Scorer class, but I haven't gone through this in
detail.
Conceptually a sliding weight based on proximity sounds correct...
-- jeff
On Sep 18, 2005, at 3:39 PM, James Huang wrote:
This is interesting, one I had not considered.
Mark - are there any code samples that implement this approach? Or maybe
something similar in approach?
thanks,
jeff
On 9/19/05, mark harwood <[EMAIL PROTECTED]> wrote:
>
> I think the HitCollector approach was fine but needed
&
ple, a search for "Wedgewood WA" would ideally not match "Wedgewood GA".
I'm starting with the StandardAnalyzer and thinking of possibly extending it
to carry in some of the business rules meant to come into play for
tie-breakers.
Comments appreciated.
Thanks,
jeff r.
Are there known limitations or issues with sorting and RemoteSearchable? I'm
encountering problems attempting to sort through a MultiSearcher
(ParallelMultiSearcher, actually). I'm using an array of RemoteSearchable
objects as the Searchable[] source. If I change the source indexes to be
local Inde
Thanks Rasik.
If this is the case, why is this exposed in the API? Should the overloaded
search method on ParallelMultiSearcher that takes a Sort object be removed?
I'm using the 1.4.3 codebase.
-j
On 10/5/05, Rasik Pandey <[EMAIL PROTECTED]> wrote:
>
> Hi Jeff,
>
> Sor
p the exceptions appropriately.
-- j
On 10/5/05, Rasik Pandey <[EMAIL PROTECTED]> wrote:
>
> Hi Jeff,
>
> Sorting needs access to an IndexReader so it can do Term lookups, and
> I don't think there is a remote impl of IndexReader probably because,
> among other reasons
lit them out in a string[] similar
to the LIA example?
cheers,
jeff r.
so need to pass in a *HitCollector* implementation
that subclasses UnicastRemoteObject, so that the callbacks can return to the
original VM.
So, if you can, it's considerably simpler and more efficient to use
TopDocs-based search when you're working remotely."
Is this still consider