Re: Searching documents on big index by using ParallelMultiSearcher is slow...

2006-10-03 Thread Scott
ut knowing a LOT more about your searches, and your index, it's kind of hard to come up with solutions Best Erick On 10/3/06, Scott <[EMAIL PROTECTED]> wrote: Hi, I have a question about ParallelMultiSearcher performance. I want to search documents on about 10 gigabytes of index

Re: Searching documents on big index by using ParallelMultiSearcher is slow...

2006-10-04 Thread Scott
officially beyond my competence, so I'll have to wait for people who actually know Although if I read your stats right, you're getting approximately 1 sec response time over 10M documents on a 10G index. That's not bad at all. What kind of response time do you need? On 10/3/06

Re: Searching documents on big index by using ParallelMultiSearcher is slow...

2006-10-04 Thread Scott
Although if I read your stats right, you're getting approximately 1 sec response time over 10M documents on a 10G index. That's not bad at all. What kind of response time do you need? On 10/3/06, Scott <[EMAIL PROTECTED]> wrote: Hi, Well, the first question is always "are yo

Re: Searching documents on big index by using ParallelMultiSearcher is slow...

2006-10-04 Thread Scott
uests. Wow, it is very interesting... -- Scott - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: How to search with empty content

2006-10-09 Thread Scott
Santhosh -- Scott

Searching documents on big index by using ParallelMultiSearcher is slow...

2006-10-02 Thread Scott
remote searchables from each search slaves and build ParallelMultiSearcher. Then search. Any solution? -- Scott

Bizarre Search order request

2012-05-25 Thread Scott Smith
blog" and "website". If there aren't 10 of one of these, then I'm allowed to exceed the maximum of 10 so that I get 20 results. What I don't want is 20 "mail" documents if there are "blog" and/or "website" documents to display. Is something like this even possible? Any thoughts would be appreciated. Scott

Lucene reorganizing indexes

2012-07-16 Thread Scott Smith
before going to 3.5) severely increased the disk activity which is interfering with other things running on the boxes. Does any of this make sense to anyone? Is there an explanation? Thoughts about what we might do about it? Thanks in advance. Scott

RE: Lucene reorganizing indexes

2012-07-17 Thread Scott Smith
armed up after a commit and the never ending full GCs. Greets Ralf -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Monday, July 16, 2012 22:29 To: java-user@lucene.apache.org Subject: Lucene reorganizing indexes We have an application that has

Highlighting html pages

2012-10-23 Thread Scott Smith
I need to take an html page that I retrieve from my lucene search and highlight all of the terms that are part of the search. I need to skip over any html tags since I don't want any words in tags which happen to match the search to be highlighted. Note that I don't want sections of the docum

Lucene 4.0 delete by ID

2012-10-26 Thread Scott Smith
I'm currently converting some lucene code to 4.0. It appears that you are no longer allowed to delete a document by its ID. Is that correct? Is my only option to figure some kind of query (which obviously isn't based on ID) and do the delete from there?

lucene 4.0 indexReader is changed

2012-10-26 Thread Scott Smith
How do I determine if the index has been modified in 4.0? The ifchanged() and isChanged() appear to have been removed.

RE: Lucene 4.0 delete by ID

2012-10-29 Thread Scott Smith
The lucene integer doc id. -Original Message- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Sunday, October 28, 2012 5:09 PM To: java-user@lucene.apache.org Subject: Re: Lucene 4.0 delete by ID Scott, did you mean the Lucene integer id, or the unique id field? - Original

RE: Lucene 4.0 delete by ID

2012-10-29 Thread Scott Smith
I understand the issue of the lucene doc id changing. I'll probably look to see if I can delete stuff just based on some field that I have that I know won't change. I've used the doc id for a long time, but maybe it's time for a change. Thanks for all of the input.

RE: lucene 4.0 indexReader is changed

2012-10-29 Thread Scott Smith
OK. I'll take a look at that. Thanks for the help. Scott -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Friday, October 26, 2012 6:07 PM To: java-user@lucene.apache.org Subject: Re: lucene 4.0 indexReader is changed How about DirectoryReader

Norms and Term Vectors in Lucene 4.0

2012-10-29 Thread Scott Smith
orms on a field, then do I need to use the new streamlined Field() and set the appropriate FieldType object parameters? Is that my only option? I assume I also have to go through the new Field() if I need to control TermVectors? Where's LIA3 when you need it :) Scott

RE: Norms and Term Vectors in Lucene 4.0

2012-10-30 Thread Scott Smith
Thanks Simon. Appears I had it mostly figured out correctly--except for the last question :-) Thanks for the suggestion on caching the fieldtype. Cheers Scott -Original Message- From: Simon Willnauer [mailto:simon.willna...@gmail.com] Sent: Tuesday, October 30, 2012 2:10 AM To: java

Highlighting and InvalidTokenOffsetsException in Lucene 4.0

2012-10-31 Thread Scott Smith
here (even though it took a couple of "minor" changes to get it to compile in 4.0). This code used to work in 3.5. Anyone have any ideas? Scott Code fragment: try { ctf = new CachingTokenFilter(myCustomAnalyzer .tokenStream(M

4.0 tokenStream or SimpleAnalyzer bug?

2012-11-01 Thread Scott Smith
I was doing some tokenizer/filter analysis attempting to fix a bug I have in highlighting under 4.0. I was running the displayTokensWithFullDetails code from LIA2. I would get an exception with a bad index value of -1. I fixed the problem by doing a reset() immediately after creating my Token

RE: Highlighting html pages

2012-11-01 Thread Scott Smith
id of punctuation (commas, periods, semicolons, etc.) after the HTML stripping, is there a filter? Essentially, I want to get it back to what StandardTokenizer would give me after I've stripped the HTML. Suggestions? Scott -Original Message- From: Michael Sokolov [mailto:soko...@if

Near Real Time for multiple applications

2012-11-05 Thread Scott Smith
I've been reading about NRT thinking it might be good to integrate it into my code. However, I have a question. Suppose that the index writer and the index reader run in totally different JVMs (i.e., they are different applications and only communicate via the disk). Am I correct in thinking

RE: Highlighting html pages

2012-11-05 Thread Scott Smith
tags being properly nested. Cheers Scott -----Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Thursday, November 01, 2012 7:16 PM To: Michael Sokolov; java-user@lucene.apache.org Subject: RE: Highlighting html pages I was trying to play with this. Am I correct in

RE: Near Real Time for multiple applications

2012-11-07 Thread Scott Smith
ccandless.com] Sent: Tuesday, November 06, 2012 5:32 AM To: java-user@lucene.apache.org Subject: Re: Near Real Time for multiple applications On Mon, Nov 5, 2012 at 6:33 PM, Scott Smith wrote: > I've been reading about NRT thinking it might be good to integrate it into my > code. However,

CJKWidthFilter vs ICUFoldingFilter

2012-11-14 Thread Scott Smith
, etc. Can I just use the ICUFoldingFilter? Cheers Scott

Which stemmer?

2012-11-14 Thread Scott Smith
Does anyone have any experience with the stemmers? I know that Porter is what "everyone" uses. Am I better off with KStemFilter (better performance) or ?? Does anyone understand the differences between the various stemmers and how to choose one over another?

RE: CJKWidthFilter vs ICUFoldingFilter

2012-11-14 Thread Scott Smith
Thanks -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Wednesday, November 14, 2012 12:17 PM To: java-user@lucene.apache.org Subject: Re: CJKWidthFilter vs ICUFoldingFilter On Wed, Nov 14, 2012 at 9:47 AM, Scott Smith wrote: > Reading the documentation for these

RE: Which stemmer?

2012-11-14 Thread Scott Smith
Perhaps the kstemmer is "just right" :-) Cheers Scott -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Wednesday, November 14, 2012 4:14 PM To: java-user@lucene.apache.org Subject: Re: Which stemmer? What is your use case? If you don't have a spec

RE: Which stemmer?

2012-11-15 Thread Scott Smith
dog dog dog dog's dog'dog's dog' dogs dog dogs dog dogs' dog dogs dog Now, if someone would answer my question on the Solr list ("Custom Solr Indexer/Search&q

Handling a closed IndexWriter in Solr

2013-03-13 Thread Danzig, Scott
Hey all, We're using a Solr 4 core to handle our article data. When someone in our CMS publishes an article, we have a listener that indexes it straight to solr. We use the previously instantiated HttpSolrServer, build the solr document, add it with server.add(doc) .. then do a server.commit(

Lucene slow performance

2013-03-15 Thread Scott Smith
several thousand (3000+) .cfs files. We do optimize the index once per day. This is a system that probably gets several thousand document deletes and additions per day (spread out across the day). Any thoughts? We didn't really notice this until we went to 4.x. Scott

RE: Lucene slow performance

2013-03-15 Thread Scott Smith
n has changed since 1.4, but does it not merge all of the various files into a few files? -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Friday, March 15, 2013 4:15 PM To: java-user@lucene.apache.org Subject: Lucene slow performance We have a system th

RE: Lucene slow performance

2013-03-15 Thread Scott Smith
a custom merge policy or something like this, any special IndexWriter settings? On Fri, Mar 15, 2013 at 11:15 PM, Scott Smith wrote: > We have a system that is using lucene and the searches are very slow. The > number of documents is fairly small (less than 30,000) and each document is

RE: Lucene slow performance

2013-03-15 Thread Scott Smith
ling all merges)?" Frankly I don't quite understand what this means. When I "close" the indexwriter, I simply call close(). Is that the wrong thing? Thanks Scott -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Friday, March 15, 2013 4:49 PM To:

RE: Lucene slow performance

2013-03-15 Thread Scott Smith
March 16, 2013 12:08 AM > To: java-user@lucene.apache.org > Subject: Re: Lucene slow performance > > On Sat, Mar 16, 2013 at 12:02 AM, Scott Smith > wrote: > > " Do you always close IndexWriter after adding few documents and > > when > closing, disable "

RE: Lucene slow performance

2013-03-15 Thread Scott Smith
nauer [mailto:simon.willna...@gmail.com] Sent: Friday, March 15, 2013 5:08 PM To: java-user@lucene.apache.org Subject: Re: Lucene slow performance On Sat, Mar 16, 2013 at 12:02 AM, Scott Smith wrote: > " Do you always close IndexWriter after adding few documents and when > closing, d

RE: Lucene slow performance

2013-03-16 Thread Scott Smith
Thanks for the help. The reindex was done this morning and searches now take less than a second. I will make the change to the code. Cheers Scott -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Friday, March 15, 2013 11:17 PM To: java-user@lucene.apache.org

RE: Lucene slow performance -- still broke

2013-03-20 Thread Scott Smith
tRAMBufferSizeMB(50.0); Any help in figuring out what is causing this problem would be appreciated. I do now have an offline system that I can play with so I can do some intrusive things if need be. Scott -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent:

RE: Lucene slow performance -- still broke

2013-03-20 Thread Scott Smith
Duh...it's supposed to be setMergeFactor(). Thanks Scott -Original Message- From: Simon Willnauer [mailto:simon.willna...@gmail.com] Sent: Wednesday, March 20, 2013 3:53 PM To: java-user@lucene.apache.org Subject: Re: Lucene slow performance -- still broke quick question, w

classic.QueryParser - bug or new behavior?

2013-05-19 Thread Scott Smith
a forward slash, I'm confused why it would need escaping of any of the characters in the string with the "/EXPIRED". Has anyone seen this? Scott

RE: classic.QueryParser - bug or new behavior?

2013-05-19 Thread Scott Smith
help. Scott -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Sunday, May 19, 2013 1:26 PM To: java-user@lucene.apache.org Subject: Re: classic.QueryParser - bug or new behavior? Yeah, just go ahead and escape the slash, either with a backslash or by enclosin
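The reply quoted above suggests escaping the forward slash with a backslash before the term reaches the query parser. A minimal sketch of that one step, using only plain string handling so it runs without Lucene: the class and method names (SlashEscape, escapeSlashes) are hypothetical, and the input is assumed to be raw, not-yet-escaped text. Lucene's classic QueryParser also appears to provide its own static escape helper, which would be the more complete choice in real code.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class SlashEscape {

    // Prefixes every '/' with a backslash.
    // Assumes the input is raw text that has not been escaped already.
    static String escapeSlashes(String term) {
        StringBuilder out = new StringBuilder(term.length() + 4);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '/') {
                out.append('\\'); // backslash-escape the slash
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "/EXPIRED" is the term mentioned in the original question.
        System.out.println(escapeSlashes("/EXPIRED"));
    }
}
```

The escaped string would then be passed to the parser in place of the raw term; the other option the reply starts to describe (enclosing the term) is cut off in the snippet and is not sketched here.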

QueryParser in 3.x

2010-09-16 Thread Scott Smith
I recently upgraded to Lucene 3.0 and am seeing some new behavior that I don't understand. Perhaps someone can explain why. I have a custom analyzer. Part of the analyzer uses the AsciiFoldingFilter. If I run a word with an umlaut through that analyzer using the AnalyzerDemo code in LIA2,

RE: QueryParser in 3.x

2010-09-17 Thread Scott Smith
the reusableTokenStream and I now get the result I wanted. The above code snippet generates the word without the umlaut in both cases. So, problem solved. Thanks to Simon for putting me on the right track. Scott -Original Message- From: Simon Willnauer [mailto:simon.willna...@google

Fuzzy Phrase Search

2010-10-27 Thread Andrew Scott
Hi Guys, I am wondering how I can go about doing a Fuzzy Phrase search using Lucene.NET 2.9.2 - I've tried looking around everywhere but there don't really seem to be any resources related to this anywhere. I found this stackoverflow link

RE: [ANNOUNCEMENT] NLP-based Analyzer library for Lucene

2011-02-14 Thread Scott Smith
One thing to note is that the Stanford POS Tagger is licensed using GPL v2. A commercial license is available, but it doesn't appear to be free ($3k min if I read correctly). I wonder what it would take to make this available using OpenNLP which has a friendlier license. -Original Message

MoreLikeThis Interface changes

2011-09-21 Thread Scott Smith
something where you boost the MLT words from the subject and as opposed to the body of the document you are looking for similar items on? Thanks Scott

RE: MoreLikeThis Interface changes

2011-09-22 Thread Scott Smith
Understand. Thanks for the information. -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Wednesday, September 21, 2011 6:59 PM To: java-user@lucene.apache.org Subject: Re: MoreLikeThis Interface changes On Wed, Sep 21, 2011 at 5:17 PM, Scott Smith wrote: >

RE: MoreLikeThis Interface changes

2011-09-26 Thread Scott Smith
test fails. If I include it, it passes. I'm using MLT as follows: _query = new BooleanClause(mlt.like(new InputStreamReader(is), "EVERYTHING"), BooleanClause.Occur.MUST); "is" is the input stream. Did I miss something in your response? Scott -O

RE: MoreLikeThis Interface changes

2011-09-26 Thread Scott Smith
OK. Thanks -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, September 26, 2011 12:15 PM To: java-user@lucene.apache.org Subject: Re: MoreLikeThis Interface changes On Mon, Sep 26, 2011 at 2:06 PM, Scott Smith wrote: > "is" is the input stream

Lucene Query Syntax with analyzed and unanalyzed text

2013-09-16 Thread Scott Smith
I want to be sure I understand this correctly. Suppose I have a search that I'm going to run through the query parser that looks like: body:"some phrase" AND keyword:"my-keyword" clearly "body" and "keyword" are field names. However, the additional information is that the "body" field is anal

Can you escape characters you don't want the analyzer to modify

2013-09-17 Thread Scott Smith
Suppose I have a string like "ab@cd%d". My analyzer will turn this into "ab cd d". Can I pass it "ab\@cd\%d" and force it to treat it as a single word? I want to use the Query parser, but I don't want it messing with fields that have not been analyzed.

RE: Can you escape characters you don't want the analyzer to modify

2013-09-18 Thread Scott Smith
ounds like you either need to have a custom analyzer or a field-aware analyzer. -- Jack Krupansky -Original Message----- From: Scott Smith Sent: Tuesday, September 17, 2013 4:26 PM To: java-user@lucene.apache.org Subject: Can you escape characters you don't want the analyzer to modify S

Phrase highlight

2013-11-26 Thread Scott Smith
I'm doing some highlighting with the following code fragment: formatter = new SimpleHTMLFormatter(, ); Scorer score = new QueryScorer(myQuery); ht = new Highlighter(formatter, score); ht.setTextFragmenter(new NullFragmenter());

Highlighting phrases

2013-11-27 Thread Scott Smith
I'm doing some highlighting with the following code fragment: formatter = new SimpleHTMLFormatter(, ); Scorer score = new QueryScorer(myQuery); ht = new Highlighter(formatter, score); ht.setTextFragmenter(new NullFragmenter());

RE: Highlighting phrases

2013-11-27 Thread Scott Smith
Never mind. I figured it out. Thanks anyway. -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Wednesday, November 27, 2013 9:27 AM To: java-user@lucene.apache.org Subject: Highlighting phrases I'm doing some highlighting with the following code fra

Analyzers aren't reusable?? (lucene 4.2.1)

2013-12-05 Thread Scott Smith
" and "/>". Is this expected behavior? I thought analyzers were thread-safe and reusable. Am I wrong on that point? I would expect the output of all three to be the same. Can someone explain to me what's going on? What am I missing? Scott

RE: Analyzers aren't reusable?? (lucene 4.2.1)

2013-12-05 Thread Scott Smith
Thanks for the quick response. I'll read through the references. Thanks again Scott -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Thursday, December 05, 2013 1:46 PM To: java-user@lucene.apache.org Subject: RE: Analyzers aren't reusable?? (lucene 4

RE: Analyzers aren't reusable?? (lucene 4.2.1)

2013-12-05 Thread Scott Smith
phi.de eMail: u...@thetaphi.de > -Original Message- > From: Scott Smith [mailto:ssm...@mainstreamdata.com] > Sent: Thursday, December 05, 2013 9:36 PM > To: java-user@lucene.apache.org > Subject: Analyzers aren't reusable?? (lucene 4.2.1) > > I wrote the following

Running Lucene tests on a custom Directory subclass

2013-12-17 Thread Scott Schneider
ectory) and gave it a 0-argument constructor. Apologies if this was addressed elsewhere. In googling for an answer, the term "Directory" is basically invisible. I found a page on running Lucene's tests on a custom codec and approximated those steps. Scott

RE: Running Lucene tests on a custom Directory subclass

2013-12-18 Thread Scott Schneider
Never mind... the problem was that I compiled my jar against Lucene 3.3, but tried running against Lucene 4.4. It works when I also run against 3.3. (Or, at least, I get test failures that make sense!) Scott > -Original Message- > From: Scott Schneider [mailto:scott_

Debugging unit tests with Eclipse

2013-12-18 Thread Scott Schneider
"ant test -Dblahblah" works, but this doesn't use that argument and gets the same 54 test failures, like normal. Please help! Thanks, Scott

Unit test help

2013-12-20 Thread Scott Schneider
at shouldn't be blocking the deletion. And I don't see how any other code could open a handle to this file, since it's created in a temp directory created by Lucene Transform. I can't think of any reason for the difference between ant and eclipse! Please help! Thanks, Scott

RE: Unit test help

2013-12-22 Thread Scott Schneider
set -Dlucene.version=3.3-SNAPSHOT. When running the test through ant, I think common-build.xml sets that property. My other problem running the tests on my own Directory subclass was a noob mistake. I had to specify -Dtests.directory=foo in VM arguments, rather than program arguments. Scott

Performance testing Lucene

2014-01-20 Thread Scott Schneider
gle, general query test. It's not hard to come up with a decent set of queries, but I'd really like something representative of real world queries. If there is some standard set of commonly used queries, that would be ideal. Thanks! Scott

RE: Performance testing Lucene

2014-01-23 Thread Scott Schneider
Thanks! I ran this Directory subclass through the Lucene unit tests (and found 3 race conditions). Unit tests are wonderful. Scott > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Wednesday, January 22, 2014 7:05 AM > To:

RE: Performance testing Lucene

2014-01-27 Thread Scott Schneider
many unit tests! Scott > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Friday, January 24, 2014 3:03 AM > To: java-user@lucene.apache.org > Subject: RE: Performance testing Lucene > > Hi Scott, > > the unit tests are also a good

Exact Phrase Search returning in correct results

2014-06-11 Thread Scott Selvia
I’m having an issue searching for an exact phrase with Lucene 4.7. My use case loaded the Declaration of Independence into a Lucene search database. I search for “it becomes” and I get two hits; one for “it, becomes” and another for a line that just has “becomes” at the end of the line. Expec

Re: Exact Phrase Search returning in correct results

2014-06-11 Thread Scott Selvia
o be able to search stop words consider adding > CharArraySet.EMPTY_SET to the StandardAnalyzer's initializer. > > > > -Original Message- > From: Scott Selvia [mailto:ssel...@gmail.com] > Sent: Wednesday, June 11, 2014 12:48 PM > To: java-user@lucene.apache

Searching with String that Represents a Signature

2014-08-14 Thread Scott Selvia
We have OCR'd a document with a signature, you can select the signature and copy the text representation for searching in a lucene 4.7 index. We have surrounded the search text with double quotes since it has invalid search characters without the use of the double quotes. Search Text: ":J!/z&”

problem using faceting in 5.3

2015-11-02 Thread scott cote
lucene-queryparser 3.6.0 Here is the code to retrieve facet data from the version 3.6 index (which does work against version 3.6 lucene): public class FacetRunner { public static void main(final String[] args) throws Exception { File indexDirFile = new File("/Users/

Re: Highlighting deprecation?

2015-12-01 Thread scott cote
checkout the highlight package … https://lucene.apache.org/core/5_3_0/highlighter/org/apache/lucene/search/highlight/package-summary.html <https://lucene.apache.org/core/5_3_0/highlighter/org/apache/lucene/search/highlight/package-summary.html> SCott > On Dec 1, 2015, at 4:16 PM

multivalued index search

2016-01-29 Thread scott cote
What is the best approach to implementing a multivalued index field search in the current version of Lucene? SCott

update of a numerical field in a lucene document

2016-04-08 Thread scott cote
he if statement that ensures that the document has the field), but then is not gated into “processEvents(true,false);” step as the line 1515 docWriter.updateDocValues(….) returns false. can’t seem to pin down why that is happening. What am I missing? SCott

Re: upgrading lucene 4 to 6

2016-04-26 Thread scott cote
Jamie, I just went through an upgrade from 3 to 5. We used faceting, highlighting, search, explanation, etc ….It took us 3 months and that was a hard push (2 to 3 people dedicated to the effort). Don’t put off the upgrade. The performance is worth the pain. SCott > On Apr 26, 2016,

Re: Call for MODERATORs on the dev and java-user mailing lists

2017-02-03 Thread scott cote
Let me ask if I can get some cycles to do this. I’m interested but I have to check first. SCott scott.c...@lucidworks.com > On Feb 3, 2017, at 3:14 PM, Steve Rowe wrote: > > FYI I’m holding off on creating the INFRA JIRA until Aurelian has > acknowledged subscribing to dev@luc

Re: Call for MODERATORs on the dev and java-user mailing lists

2017-02-03 Thread scott cote
Thanks Steve. SCott > On Feb 3, 2017, at 3:57 PM, Steve Rowe wrote: > > Hmm, can’t do math today: the average per list is more like 1 message every 3 > days on a per list basis, assuming matches for subject:MODERATE on the > gmail.com web UI is accurate. It's burs

Max Field Length

2022-09-22 Thread Scott Guthery
Lucene 9.3 seems to have a (post-Analyzer) maximum field length of 32767. Is there a way of increasing this without resorting to the source code? Thanks for any guidance. Cheers, Scott

Re: Max Field Length

2022-09-23 Thread Scott Guthery
g maximums. Cheers, Scott > >

Boosting results

2008-11-06 Thread Scott Smith
y. I was setting the boost on the category A term to 1.0 and the boost on the category B term to 0.0. Any thoughts how to skin this? Scott

RE: Boosting results

2008-11-07 Thread Scott Smith
n multiple fields and "score" (aka relevancy) is one of the pseudo fields. That'll work. Thanks. Scott -Original Message- From: Erick Erickson [mailto:[EMAIL PROTECTED] Sent: Friday, November 07, 2008 5:59 AM To: java-user@lucene.apache.org Subject: Re: Boosting results duuu

Optimization error

2009-02-02 Thread Scott Smith
I'm optimizing a database and getting the error: maxClauseCount is set to 1024 I understand what that means coming out of the query parser, but what does it mean coming from the optimizer? Scott

Determining lucene version programmatically

2009-06-16 Thread Scott Smith
Is there any way to programmatically determine the version of lucene being loaded?

Getting results for a specific date

2009-06-16 Thread Scott Smith
Mostly, our users want to see search results in reverse date order (newer hits first). I know how to do that with a Sort object and it works fine. However, sometimes our users want to do a search and get results in date order starting at a certain date. Say for example, they want to start the

RE: Determining lucene version programmatically

2009-06-16 Thread Scott Smith
().getImplementationVersion(); cheers, João On Tue, Jun 16, 2009 at 11:36 PM, Scott Smith wrote: > Is there any way to programmatically determine the version of lucene > being loaded? > > > > -- Regards, João Carlos Galaio da Silva --
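The truncated reply above ends in getImplementationVersion(), which is a method of java.lang.Package. A minimal sketch of that approach, assuming the version is read from the Implementation-Version entry of the jar manifest: the names VersionProbe and implementationVersionOf are hypothetical, and String.class is only a stand-in so the sketch runs without Lucene on the classpath. The method may return null when no manifest entry is available, so the helper falls back to "unknown".

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class VersionProbe {

    // Returns the Implementation-Version recorded for the package of the
    // given class, or "unknown" when no package or version is available.
    static String implementationVersionOf(Class<?> cls) {
        Package pkg = cls.getPackage(); // may be null, e.g. for array or primitive types
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return (version == null) ? "unknown" : version;
    }

    public static void main(String[] args) {
        // Stand-in class; with Lucene on the classpath one would pass a
        // Lucene class here instead (assumption, not shown in the thread).
        System.out.println(implementationVersionOf(String.class));
    }
}
```

Whether a useful value comes back depends on how the jar was packaged, so the fallback branch is worth keeping.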

Queries and Filters

2009-06-16 Thread Scott Smith
The last few versions of lucene have deprecated several of the interfaces we were using and this is necessitating a fairly major upgrade of our code (which hasn't had much done to it for several years). I'm not complaining; the changes are probably necessary. In reading LIA2, I've learned abou

RE: Getting results for a specific date

2009-06-17 Thread Scott Smith
Clarification: Obviously, I should have said "June 11" when I talked of a newer date. ____ From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Tue 6/16/2009 5:41 PM To: java-user@lucene.apache.org Subject: Getting results for a specific date M

RE: Queries and Filters

2009-06-17 Thread Scott Smith
> -Original Message- > From: Scott Smith [mailto:ssm...@mainstreamdata.com] > Sent: Wednesday, June 17, 2009 2:15 AM > To: java-user@lucene.apache.org > Subject: Queries and Filters > > The last few versions of lucene have deprecated several of the > interfaces we

caching an indexreader

2009-06-19 Thread Scott Smith
In my environment, one of the concerns is that new documents are constantly being added (and some documents may be deleted). This means that when a user does a search and pages through results, it is possible that there are new items coming in which affect the search-thus changing where items are

Filters vs Queries - revisited

2009-06-19 Thread Scott Smith
As I read about Filters, it seems to me that a filter is preferred for any portion of the query string where you are setting the boost to 0 (meaning you don't want it to contribute to the relevancy score). But, relevancy is only interesting if you are displaying the documents in relevancy ord

RE: caching an indexreader

2009-06-19 Thread Scott Smith
> wrote: > On Fri, Jun 19, 2009 at 2:40 PM, Scott Smith > wrote: > > In my environment, one of the concerns is that new documents are > > constantly being added (and some documents may be deleted). This means > > that when a user does a search and pages through results, it

Highlighting phrases in 2.9

2009-09-30 Thread Scott Smith
I've been looking at the changes I have to make in my code to go from 2.4.1 to 2.9. One of the features I have is to highlight query hits in documents which meet the search criteria. If the query has a phrase, then I need to highlight the phrase, but not isolated words from the phrase which also

Question about how to speed up custom scoring

2009-10-08 Thread scott w
, Scott

Re: Question about how to speed up custom scoring

2009-10-08 Thread scott w
core + (1 - bias) * termWeightedScore; } } On Thu, Oct 8, 2009 at 7:54 AM, scott w wrote: > I am trying to come up with a performant query that will allow me to use a > custom score where the custom score is a sum-product over a set of query > time weights where each weight gets applied only if the

Re: Question about how to speed up custom scoring

2009-10-09 Thread scott w
Thanks for the suggestions Erick. I am using Lucene 2.3. Terms are stored and given Andrzej's comments in the follow up email sounds like it's not the stored field issue. I'll keep investigating... thanks, Scott On Thu, Oct 8, 2009 at 8:06 AM, Erick Erickson wrote: > I suspect

Re: Question about how to speed up custom scoring

2009-10-09 Thread scott w
/getting-started-with-payloads/ >> >> -Grant >> >> On Oct 8, 2009, at 11:56 AM, scott w wrote: >> >> Oops, forgot to include the class I mentioned. Here it is: >>> >>> public class QueryTermBoostingQuery extends CustomScoreQuery { >>>

Re: Question about how to speed up custom scoring

2009-10-09 Thread scott w
personalization where you have a default set of weights and you want to adjust them on the fly although our use case is a little different. thanks, Scott On Fri, Oct 9, 2009 at 10:40 AM, Jake Mannix wrote: > Scott, > > To reiterate what Erick and Andrzej's said: calling > IndexReader.doc

Re: Question about how to speed up custom scoring

2009-10-09 Thread scott w
score. Hopefully that makes more sense. The other use case I had in mind is one where it doesn't care about the indexed value and only looks at whether the field is present or not and then uses the query supplied weight to measure the relative importance of that field. thanks, Scott O

Re: Question about how to speed up custom scoring

2009-10-09 Thread scott w
Thanks Jake! I will test this out and report back soon in case it's helpful to others. Definitely appreciate the help. Scott On Fri, Oct 9, 2009 at 3:33 PM, Jake Mannix wrote: > On Fri, Oct 9, 2009 at 3:07 PM, scott w wrote: > > > Example Document: > > model_1_score

Re: Question about how to speed up custom scoring

2009-10-10 Thread scott w
at 5:32 PM, Jake Mannix wrote: > Great Scott (hah!) - please do report back, even if it just works fine and > you have no more questions, I'd like to know whether this really is > what you were after and actually works for you. > > Note that the FieldCache is kinda "mag

Re: Question about how to speed up custom scoring

2009-10-11 Thread scott w
vsq1, vsq2, vsq3 }; > > Query textQuery = QueryParser.parse("company:Microsoft"); > > Query q = new QueryTermBoostingQuery(textQuery, vsq, bias); > --- > > Does this work for you? > Yes I think this should work! Thanks for taking the time to clearly write up a solution. Will report back after testing it out. best, Scott

Polishing up my Lucene integration, customizing analyzer

2009-11-15 Thread Scott Ribe
hem are simple settings to StandardAnalyzer, but not all, particularly those first two items... Any hints or directions appreciated. -- Scott Ribe scott_r...@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice -
